pelias / api

HTTP API for Pelias Geocoder
http://pelias.io
MIT License

unicode: fix bug preventing unicode normalization in some cases #1577

Closed missinglink closed 2 years ago

missinglink commented 2 years ago

I just caught a very subtle bug which has been in the codebase for 2 years now.

This easily confused variable name caused the effects of the unicode.normalize(raw.text) line above to be ignored 😱

We didn't notice this because it's implemented correctly in sanitizer/_text_pelias_parser.js but incorrectly in sanitizer/_text.js.
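To illustrate the class of bug, here is a minimal sketch. It is not the actual code from sanitizer/_text.js: the `normalize` helper is a hypothetical stand-in for the project's unicode module (it is assumed here to apply NFKC and strip emoji), and the two sanitizer functions are simplified for illustration.

```javascript
// Hypothetical stand-in for the project's unicode helper (an assumption):
// applies NFKC normalization and strips pictographic characters such as emoji.
function normalize(str) {
  return str
    .normalize('NFKC')
    .replace(/\p{Extended_Pictographic}/gu, '')
    .trim();
}

// Buggy variant: the normalized value is computed and then ignored,
// because the similarly named raw value is the one that gets used.
function sanitizeBuggy(raw, clean) {
  const text = normalize(raw.text); // normalized copy, never read again
  clean.text = raw.text;            // bug: should have been `text`
  return clean;
}

// Fixed variant: the normalized value is the one that is passed on.
function sanitizeFixed(raw, clean) {
  const text = normalize(raw.text);
  clean.text = text;
  return clean;
}

const buggy = sanitizeBuggy({ text: 'Berlin 😱' }, {});
const fixed = sanitizeFixed({ text: 'Berlin 😱' }, {});
```

In this sketch `buggy.text` still contains the emoji, while `fixed.text` does not, which is the difference between the two sanitizers described above.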

You can see that for the /v1/search endpoint it's still possible to send emoji to libpostal:

[Image: Screenshot 2021-11-02 at 10 32 11]

I suspect that we have been falling back to the pelias parser more often than needed, because in these cases it's more likely that libpostal will reject the string containing emoji.