ppannuto / python-titlecase

Python library to capitalize strings as specified by the New York Times Manual of Style
MIT License
244 stars 36 forks source link

Wordlist filter feature issues #66

Closed ses4j closed 4 years ago

ses4j commented 4 years ago

I think there's some optimizations to be made with the new custom wordlist feature.

I upgraded packages wholesale and got a new titlecase. Out of the box, every time I run my app I get thousands of this message:

No config file found at C:\Users\user\.titlecase.txt [titlecase:79 create_wordlist_filter]

I can suppress the logging message, but I cannot stop titlecase from looking for a file I know to not be there. Also, if I DID use this file, it would very inefficiently have to find, open, and read this file upon every titlecase invocation.

So, a couple thoughts:

igorburago commented 4 years ago

I believe, this is essentially the same issue as the one I have filed in #58 and fixed in #60. The fix will be a part of the next release, which is quite imminent. (I have a couple of other minor bug fixes that I am about to file over the weekend, which @ppannuto kindly agreed to wait on in preparation for releasing the new version.)

As to the in-memory wordlist, there is no need to provide a separate interface. The user of the library can easily implement that by providing an appropriate callback, for instance, like so:

ignore_abbrevs = {'FBI', 'CIA', 'NSA'}
titlecase(text, callback=lambda w: ignore_abbrevs.get(w.upper()))
ppannuto commented 4 years ago

I believe @iburago is right. If there's something more that needs to be addressed, feel free to re-open this issue or create a new, more focused one.