Closed guillochon closed 6 years ago
$ ./acronym.py "Needs support for nested acronyms"
Finding Acronyms: 100%|████████████████████████████████████████████████████████████████| 100000/100000 [00:08<00:00, 11309.79it/s]
[...]
NAY Needs support for nested AcronYms
Can you give an example of input and expected result?
I guess: if the text exceeds a certain length, search for short acronyms first, replace those parts of the text with the new acronyms, then run the algorithm again on the new string.
Probably good to also have a dictionary of acronyms that have already been used and reuse those when possible.
Australian Square Kilometer Array Pathfinder Rotation Measure and Polarisation Investigation --> ASKAP Rotation Measure and Polarisation InvestigaTion --> ARMPIT
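The substitution step in that example could be sketched roughly as below. This is only an illustration, not the code in `acronym.py`: the function name `collapse_known_acronyms` and the in-memory dictionary are hypothetical, and it assumes known acronyms are available as an acronym-to-expansion mapping.

```python
import re

# Hypothetical lookup table (acronym -> expansion), used only for this sketch.
KNOWN_ACRONYMS = {
    "ASKAP": "Australian Square Kilometer Array Pathfinder",
}


def collapse_known_acronyms(phrase, known=KNOWN_ACRONYMS):
    """Replace each known expansion found in `phrase` with its acronym."""
    # Try longer expansions first so a short one cannot split a longer match.
    for acronym, expansion in sorted(known.items(), key=lambda kv: -len(kv[1])):
        pattern = re.compile(r"\b" + re.escape(expansion) + r"\b", re.IGNORECASE)
        # A callable replacement keeps the acronym literal (no escape handling).
        phrase = pattern.sub(lambda match, a=acronym: a, phrase)
    return phrase


phrase = ("Australian Square Kilometer Array Pathfinder "
          "Rotation Measure and Polarisation Investigation")
print(collapse_known_acronyms(phrase))
# ASKAP Rotation Measure and Polarisation Investigation
```

The shortened string would then be passed back to the normal acronym search, which is the "run the algorithm again" part of the proposal.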
Successfully completed: this now includes a file, "data/existing_acronyms.txt", which will be crowd-sourced to collect common acronyms in current use. Combined with the new "--nested" flag, the tool can now search for acronyms after substituting existing acronyms into the text.
Example of new result: `acronym.py "Australian Square Kilometer Array Pathfinder Rotation Measure and Polarisation Investigation" --nested
[...]
ARMPIT Askap Rotation Measure and Polarisation investIgaTion
[...] `
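The capitals in the output above appear to mark the letters drawn, in order, from the shortened phrase. A minimal way to check that property is an in-order subsequence test. The helper `is_subsequence` below is a hypothetical name for this sketch; the actual tool may apply further rules (word boundaries, scoring against a dictionary) that are not described in this thread.

```python
def is_subsequence(acronym, phrase):
    """Return True if the letters of `acronym` appear in order within `phrase`."""
    remaining = iter(phrase.lower())
    # `ch in remaining` consumes the iterator up to the first match,
    # so each later letter must be found after the previous one.
    return all(ch in remaining for ch in acronym.lower())


collapsed = "ASKAP Rotation Measure and Polarisation Investigation"
print(is_subsequence("ARMPIT", collapsed))  # True
print(is_subsequence("ZEBRA", collapsed))   # False: the phrase has no "z"
```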
Potential sources of acronyms:
Nested acronyms come up, for example, when an instrument name appears within another acronym.