isawnyu / geocollider

Discover potential matches between multiple place databases
MIT License

Benchmark string normalization #11

Closed ryanfb closed 7 years ago

ryanfb commented 7 years ago

Related to #5. The initial string normalization implementation adds a significant amount of time when parsing the full Pleiades data. If there's something that will give us an easy performance boost, it would help a lot.
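A rough way to measure the overhead is Ruby's standard `Benchmark` module. This is only a sketch: `normalize_name` and the sample names are hypothetical stand-ins, not geocollider's actual `StringNormalizer` or the real Pleiades dump. It compares a plain pass over the names against a normalizing pass.

```ruby
require 'benchmark'

# Hypothetical normalizer for illustration only; NOT geocollider's StringNormalizer.
# Lowercases, drops punctuation, and collapses runs of whitespace.
def normalize_name(name)
  name.downcase.gsub(/[[:punct:]]/, '').gsub(/\s+/, ' ').strip
end

# Made-up sample data standing in for place names parsed from a gazetteer.
names = ['Athenae', '  Roma ', 'Alexandria (Egypt)', 'Carthago  Nova'] * 10_000

# Baseline: touch every name without normalizing it.
raw_seconds = Benchmark.realtime { names.each { |n| n.strip } }

# Same pass with normalization applied to every name.
normalized_seconds = Benchmark.realtime { names.each { |n| normalize_name(n) } }

puts format('raw: %.4fs, normalized: %.4fs', raw_seconds, normalized_seconds)
```

Timings vary by machine, so the useful number is the ratio between the two passes rather than either absolute value.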

ryanfb commented 7 years ago

This seems to have been caused by the StringNormalizer.normalizer_lambda implementation being wrong. Now that it's fixed, normalized parsing goes much faster, and we shouldn't need to do this anymore: https://github.com/ryanfb/geocollider/commit/04ca43d65188df2af75e1bd247393e379f32f39b