Open yosi-dediashvili opened 10 years ago
Another normalization method: replace any number with its spelled-out word:
1 becomes one
10 becomes ten
20 becomes twenty
We should limit ourselves to normalizing numbers up to 20. Past that, numbers are written as two words, so it gets too complicated.
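A minimal sketch of the number normalization, assuming it runs on a plain title string. The names `NUMBER_WORDS` and `normalize_numbers` are hypothetical, and matching only standalone digit runs (so `se7en` or `1080p` are left alone) is an assumption, not something decided above:

```python
import re

# Spelled-out words for 1..20 only; larger numbers need two words
# and are deliberately left untouched.
NUMBER_WORDS = {
    "1": "one", "2": "two", "3": "three", "4": "four", "5": "five",
    "6": "six", "7": "seven", "8": "eight", "9": "nine", "10": "ten",
    "11": "eleven", "12": "twelve", "13": "thirteen", "14": "fourteen",
    "15": "fifteen", "16": "sixteen", "17": "seventeen",
    "18": "eighteen", "19": "nineteen", "20": "twenty",
}


def normalize_numbers(title):
    """Replace standalone numbers 1-20 in the title with their words."""
    def replace(match):
        digits = match.group()
        # Anything outside the table (e.g. "21", "2012") is kept as-is.
        return NUMBER_WORDS.get(digits, digits)

    # \b on both sides: only digit runs that form a whole token match.
    return re.sub(r"\b\d+\b", replace, title)
```

For example, `normalize_numbers("10 Things")` gives `"ten Things"`, while `"21 Jump Street"` comes back unchanged.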
We need to add another normalization method that concatenates words from the title. The new method will come right after the 2nd normalization, and for will:
_
, concatenate them, and define it as the normalized string. This step will actually return a list of normalized strings and not a single one.
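The exact rule for which words get concatenated is not spelled out above, so the following is only one possible reading: keep the input string and add one variant per adjacent word pair joined together. The function name `concatenation_variants` and the whitespace splitting are assumptions made for illustration:

```python
def concatenation_variants(normalized):
    """Return the string itself plus one variant per joined adjacent word pair.

    Assumption: words are separated by whitespace and only two neighbouring
    words are concatenated at a time.
    """
    words = normalized.split()
    # The first entry is the input with its whitespace collapsed.
    variants = [" ".join(words)]
    for i in range(len(words) - 1):
        # Join word i with word i + 1, leave the rest untouched.
        joined = words[:i] + [words[i] + words[i + 1]] + words[i + 2:]
        variants.append(" ".join(joined))
    return variants
```

With this reading, `concatenation_variants("spider man two")` returns `["spider man two", "spiderman two", "spider mantwo"]`, which matches the note that the step yields a list rather than a single string.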