k-int / gokb-phase1

Original GOKb repo - Moving to https://github.com/openlibraryenvironment/gokb
http://www.gokb.org

Inconsistent method usage for string normalization #564

Closed hornmo closed 7 years ago

hornmo commented 7 years ago

Two different methods appear to be used to normalize strings, and the normalized values are then used to look for possible duplicates. TSVIngestionService and KBComponentVariantName still use GOKbTextUtils.normaliseString, while KBComponents and the TitleLookupService use GOKbTextUtils.norm2.

This may also explain #561 and #562.
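
To illustrate why mixing normalizers can hide duplicates, here is a minimal Java sketch. The two functions are hypothetical stand-ins and do not reproduce the actual behaviour of GOKbTextUtils.normaliseString or GOKbTextUtils.norm2; the only assumption is that the two methods can return different keys for the same input.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.stream.Collectors;

public class NormalizationMismatch {

    // Hypothetical stand-in for one normalizer (NOT the real GOKb code):
    // lowercase, replace punctuation with spaces, collapse whitespace.
    public static String normalizeA(String s) {
        return s.toLowerCase(Locale.ROOT)
                .replaceAll("[^a-z0-9 ]", " ")
                .trim()
                .replaceAll("\\s+", " ");
    }

    // Hypothetical stand-in for a second normalizer (NOT the real GOKb code):
    // same cleanup as normalizeA, then sort the tokens alphabetically.
    public static String normalizeB(String s) {
        return Arrays.stream(normalizeA(s).split(" "))
                .sorted()
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        // The stored record is indexed under the key produced by normalizeB.
        Map<String, String> index = new HashMap<>();
        String stored = "Testing Journal";
        index.put(normalizeB(stored), stored);

        String incoming = "Testing  Journal.";

        // Lookup with the other normalizer misses the existing record.
        System.out.println(index.containsKey(normalizeA(incoming))); // false

        // Lookup with the same normalizer finds it.
        System.out.println(index.containsKey(normalizeB(incoming))); // true
    }
}
```

In this sketch the record is stored under "journal testing" but looked up under "testing journal", so the duplicate check misses it. Routing every write and lookup through a single normalizer avoids that kind of mismatch.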

ianibo commented 7 years ago

This is fixed on the ebooks_pilot branch, which is the baseline branch for v7.

hornmo commented 7 years ago

It seems the old method is still used in some places: