Open lepsch opened 7 years ago
While that may make perfect sense to us humans, this is a very hard problem to solve without including a large acronym lookup dictionary for all the different acronyms that should be treated differently from the rules.
Without a lookup dictionary there is absolutely no difference between "AAAAAaaaaaaa" and "HTTPResponse".
@njalerikson @okunishinishi
> is a very hard problem to solve without including a large acronym lookup dictionary for all the different acronyms
It should not be difficult to provide such a lookup dictionary by adding a sensible collection of common acronyms (embedded in code) to the library. Such a collection should not be too difficult to find. Still, the better solution would be to let users pass their own list of acronyms.
The actual implementation is more intricate. It can certainly be done via regex and groups. Maybe stringcase could be restructured as a class with the current functions as (class) methods. A list of acronym strings could then be passed to the initializer on instantiation.
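To make the class idea concrete, here is a rough sketch. It is not the current stringcase API: the class name `StringCase`, the `words` helper, and the word patterns are all assumptions made for illustration. The acronyms passed at instantiation are tried before the generic word patterns.

```python
import re


class StringCase:
    """Hypothetical class-based layout; not the actual stringcase API."""

    def __init__(self, acronyms=()):
        # Longest acronyms first, so "HTTPS" is tried before "HTTP".
        parts = [re.escape(a) for a in sorted(acronyms, key=len, reverse=True)]
        # Generic words: an optionally capitalized lowercase run,
        # or a leftover run of uppercase letters.
        parts += [r"[A-Z]?[a-z0-9]+", r"[A-Z]+"]
        self._word = re.compile("|".join(parts))

    def words(self, text):
        # Separators such as "_" or "-" match nothing and are skipped.
        return [m.group(0).lower() for m in self._word.finditer(text)]

    def snakecase(self, text):
        return "_".join(self.words(text))

    def camelcase(self, text):
        first, *rest = self.words(text) or [""]
        return first + "".join(w.capitalize() for w in rest)


sc = StringCase(acronyms=["HTTP", "FTP"])
print(sc.snakecase("HTTPResponse"))  # http_response
print(sc.camelcase("HTTPResponse"))  # httpResponse
```

Without the acronym list, this deliberately simple fallback would read "HTTPResponse" as "HTTPR" + "esponse", which is exactly the case a user-supplied list is meant to cover.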
A non-regex approach for finding acronyms in the context of string case conversions can be found here: https://github.com/jdc0589/CaseConversion/blob/master/case_parse.py
Essentially, this method returns a list of words in PascalCase. These words can then be combined to give the various cases. It should be easy to implement. The method takes a list of strings as predefined acronyms (e.g. ["HTTP", "FTP"]); if no such list is given, it has a fallback method. That fallback does not use regex, unlike the one in the comment below. In case @okunishinishi wants to extend stringcase, it would be better to replace that fallback method with a pure regex approach.
Here is a pure regex approach, which does not mince runs of uppercase letters in the fashion described. It also does not rely on a lookup dict and would be trivial to implement.
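The split-then-combine idea could look roughly like the sketch below. This is not the code from the linked file: `split_words`, `to_snake`, `to_pascal`, and the fallback pattern are hypothetical names and choices, and the fallback here already uses a regex, as suggested above.

```python
import re

# Fallback word pattern (an assumption of this sketch): an uppercase run not
# followed by a lowercase letter, a capitalized word, or a lowercase run.
_FALLBACK = r"[A-Z]+(?![a-z])|[A-Z][a-z0-9]*|[a-z0-9]+"


def split_words(text, acronyms=()):
    """Return the words of an identifier; known acronyms stay uppercase."""
    known = set(acronyms)
    # Longest acronyms first, so "HTTPS" is tried before "HTTP".
    parts = [re.escape(a) for a in sorted(known, key=len, reverse=True)]
    pattern = re.compile("|".join(parts + [_FALLBACK]))
    return [
        m.group(0) if m.group(0) in known else m.group(0).capitalize()
        for m in pattern.finditer(text)
    ]


def to_snake(words):
    return "_".join(w.lower() for w in words)


def to_pascal(words):
    return "".join(words)


print(split_words("HTTPFTPBridge", ["HTTP", "FTP"]))  # ['HTTP', 'FTP', 'Bridge']
print(split_words("HTTPFTPBridge"))                   # ['Httpftp', 'Bridge']
print(to_snake(split_words("HTTPResponse")))          # http_response
```

The second call shows where the lookup list still matters: two adjacent acronyms cannot be told apart by the fallback alone.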
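A minimal sketch of what such a pure regex approach could look like follows. It is an illustration and not necessarily the snippet this comment referred to; the function name `snake_case` and the boundary pattern are assumptions.

```python
import re

# A word boundary is either
#   1. between a lowercase letter or digit and an uppercase letter, or
#   2. inside an uppercase run, just before its last letter, when that
#      letter is followed by a lowercase one (it starts the next word).
_BOUNDARY = re.compile(r"(?<=[a-z0-9])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])")


def snake_case(text):
    """Insert "_" at every detected boundary, then lowercase the result."""
    return _BOUNDARY.sub("_", text).lower()


print(snake_case("HTTPResponse"))     # http_response
print(snake_case("getHTTPResponse"))  # get_http_response
print(snake_case("AAAAAaaaaaaa"))     # aaaa_aaaaaaaa
```

The last call shows the limit raised earlier in the thread: with no lookup, "AAAAAaaaaaaa" is split by the same rule as "HTTPResponse", so the final uppercase letter of a run is always taken as the start of the next word.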
This new package offers acronym detection: https://github.com/AlejandroFrias/case-conversion
I think acronyms should be converted to one "word" only, e.g. HTTPResponse should be converted to http_response.