Open ygrange opened 5 years ago
Hi,
This is a great addition but can result in some possible acronyms being missed. This is because the base code will only return one combination for each acronym even when there are many available. If it coincidentally returns an alternative combination that doesn't use one letter per word, then that particular acronym will not be shown.
For example:
> acronym --all-words "engineering dwarfs galaxy edge"
Collecting word corpus
Identifying matching acronyms
Process Complete
ENGAGED ENGineering dwArfs Galaxy EDge
ENRAGED ENgineeRing dwArfs Galaxy EDge
EAGLE Engineering dwArfs GaLaxy edgE
EDGED Engineering Dwarfs Galaxy EDge
... which misses the acronym "EDGE".
I went into the code thinking that this wouldn't be too complex to fix, but I think it would be a bit more work thatn I naively assumed :)
Ok. New version, this one does find EDGE for example. I basically use a completely different search strategy. I haven't really looked at performance (if relevant at all anyway).
This BTW adds a dependency to itertools. If I am not mistaken that is part of the default python libraries anyway.
This pull request adds an option to the tool that makes sure each acronym returned will contain letters from every word in the description. I figured that writing a feature request wouldn't take much shorter than implementing it and pull-requesting.
The only thing I am not really sure about is how to treat words like "a", "the", "in", etc. which in general would be ignored.