bacook17 / acronym

ACRONYM (Acronym CReatiON for You and Me)
MIT License

Needs support for nested acronyms #1

Closed guillochon closed 6 years ago

guillochon commented 6 years ago

Such as when an instrument name appears within another acronym.

lgarrison commented 6 years ago

$ ./acronym.py "Needs support for nested acronyms"
Finding Acronyms: 100%|████████████████████████████████████████████████████████████████| 100000/100000 [00:08<00:00, 11309.79it/s]
[...]
NAY Needs support for nested AcronYms

bacook17 commented 6 years ago

Can you give an example of input and expected result?

guillochon commented 6 years ago

I guess if the text exceeds a certain length: search for short acronyms first, replace those parts of the text with the new acronyms, then run the algorithm again on the new string.

It would probably also be good to keep a dictionary of acronyms that have already been used and reuse those when possible.

Australian Square Kilometer Array Pathfinder Rotation Measure and Polarisation Investigation --> ASKAP Rotation Measure and Polarisation InvestigaTion --> ARMPIT
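The two-pass idea above could be sketched roughly as follows. This is only an illustration, not how acronym.py works: `KNOWN_ACRONYMS`, `collapse_known`, and `is_subsequence` are hypothetical names, and the lookup table is hard-coded here. The first function substitutes already-known acronyms into the phrase; the second checks that a candidate's letters appear in order in the collapsed phrase, which is the minimum requirement for the second pass to find it.

```python
import re

# Hypothetical table of acronyms already in use (expansion -> acronym).
KNOWN_ACRONYMS = {
    "Australian Square Kilometer Array Pathfinder": "ASKAP",
}


def collapse_known(phrase, known=KNOWN_ACRONYMS):
    """Replace each known expansion found in `phrase` with its acronym."""
    # Try longer expansions first so a short one cannot split a longer match.
    for expansion in sorted(known, key=len, reverse=True):
        pattern = re.compile(re.escape(expansion), flags=re.IGNORECASE)
        acronym = known[expansion]
        # A function replacement avoids backslash interpretation in `acronym`.
        phrase = pattern.sub(lambda match, a=acronym: a, phrase)
    return phrase


def is_subsequence(candidate, text):
    """Return True if the letters of `candidate` occur in order within `text`."""
    letters = iter(text.lower())
    # `ch in letters` consumes the iterator up to and including the match.
    return all(ch in letters for ch in candidate.lower())


phrase = ("Australian Square Kilometer Array Pathfinder "
          "Rotation Measure and Polarisation Investigation")
collapsed = collapse_known(phrase)
print(collapsed)
print(is_subsequence("ARMPIT", collapsed))
```

The collapsed string is what would be handed back to the acronym search for the second pass.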

bacook17 commented 6 years ago

Successfully completed: the repo now includes a file, "data/existing_acronyms.txt", which will be crowd-sourced to collect common acronyms in current use. Combined with the new "--nested" flag, the tool can now search for acronyms after replacing existing acronyms in the input.

Example of new result:

$ acronym.py "Australian Square Kilometer Array Pathfinder Rotation Measure and Polarisation Investigation" --nested
[...]
ARMPIT Askap Rotation Measure and Polarisation investIgaTion
[...]

guillochon commented 6 years ago

Potential sources of acronyms: