Closed mihnita closed 11 months ago
Make (almost) all strings that contain language text Unicode strings.
To be very pedantic, we should probably also use .replace(u'"', u"'").replace(u"/", u" ") and .split(u"/") everywhere, but the literals in those calls contain only ASCII text, so leaving them as they are is OK.
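A minimal sketch of why the ASCII-only literals are harmless, assuming the code must behave the same on Python 2 and Python 3 (the u"" prefixes suggest this, but the PR does not say so). The helper names `normalize_name` and `split_path` and the sample strings are hypothetical and not taken from the repository.

```python
# -*- coding: utf-8 -*-
# Hypothetical helpers illustrating the two call patterns mentioned above.


def normalize_name(text):
    """Replace double quotes with single quotes and slashes with spaces."""
    # The u"" prefix makes these literals Unicode on Python 2; on Python 3.3+
    # the prefix is accepted and changes nothing, since every str is Unicode.
    return text.replace(u'"', u"'").replace(u"/", u" ")


def split_path(text):
    """Split a slash-separated Unicode string into its parts."""
    return text.split(u"/")


# The language text itself is non-ASCII, so it must be a Unicode string.
sample = u'Norsk "bokmål"/nynorsk'

print(normalize_name(sample))
print(split_path(u"日本語/中文"))
```

On Python 2, an ASCII-only byte-string argument such as "/" is implicitly decoded when it is combined with a Unicode string, so the prefixed and unprefixed calls return the same result. The prefix only becomes necessary once a literal contains non-ASCII characters.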
This change should no longer be needed after https://github.com/google-research/url-nlp/commit/6ce0aec9c20aa2369ffbe6aede2fd1006143f693
Please open another PR if this (or another change) is still necessary.