Open schlitzered opened 4 years ago
I'd say it is. Just from looking at the code, one can see that the acronym package can be used like this:
import nltk
from acronym.acronym import find_acronyms
find_acronyms("Hello World", nltk.corpus.gutenberg, min_length=2)
Output:
Collecting word corpus
Identifying matching acronyms
Process Complete
         long_version  score
acronym
HOWL     HellO WorLd      18
HEW      HEllo World      15
HOOD     HellO wOrlD      15
HOW      HellO World      15
HELD     HEllo worLD      13
HERD     HEllo woRlD      13
HOLD     HellO worLD      13
HOD      HellO worlD      10
HOO      HellO wOrld      10
HER      HEllo woRld       8
HOR      HellO woRld       8
HO       Hello wOrld       5
One can change the corpus, for example to one of:
nltk.corpus.words
nltk.corpus.brown
nltk.corpus.gutenberg
Do not forget to adjust the max and min length. In my example, 5 was too long and the output was an empty DataFrame.
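To avoid guessing a length limit by hand, one could step through candidate values until the result is non-empty. This is only a sketch: `first_nonempty` is a hypothetical helper, and it assumes the function passed in as `finder` accepts the same arguments as the `find_acronyms` call above (phrase, corpus, `min_length`) and returns something that supports `len()`, as a DataFrame does.

```python
from typing import Any, Callable, Iterable, Optional, Tuple


def first_nonempty(
    finder: Callable[..., Any],
    phrase: str,
    corpus: Any,
    min_lengths: Iterable[int] = (5, 4, 3, 2),
) -> Tuple[Optional[int], Any]:
    """Try each min_length in order and return the first non-empty result.

    `finder` is assumed to be called like find_acronyms in the thread:
    finder(phrase, corpus, min_length=n).
    """
    for n in min_lengths:
        result = finder(phrase, corpus, min_length=n)
        # len() gives the row count for a DataFrame and the item count for a list.
        if len(result) > 0:
            return n, result
    # Nothing matched for any of the candidate lengths.
    return None, None
```

With the real package this might be called as `first_nonempty(find_acronyms, "Hello World", nltk.corpus.gutenberg)`.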
Hi, I find the tool pretty useful, and it would be nice if you could make this a library with a stable interface that can be imported into other projects.
For this, I would suggest moving the logic that chooses the "corpus" into find_acronyms.
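One way this suggestion could look is a thin wrapper that accepts a corpus name and resolves it before delegating. This is a rough sketch, not the project's actual interface: `resolve_corpus`, `find_acronyms_by_name`, and `CORPUS_LOADERS` are hypothetical names, and it assumes `find_acronyms` takes the phrase, an NLTK corpus object, and keyword arguments, as in the call shown in this thread.

```python
from typing import Any, Callable, Dict, Optional


def _nltk_corpus(name: str) -> Any:
    # Imported lazily so the module can be imported without NLTK installed.
    import nltk
    return getattr(nltk.corpus, name)


# Hypothetical registry mapping a corpus name to a zero-argument loader.
CORPUS_LOADERS: Dict[str, Callable[[], Any]] = {
    name: (lambda n=name: _nltk_corpus(n))
    for name in ("words", "brown", "gutenberg")
}


def resolve_corpus(corpus: Any, loaders: Optional[Dict[str, Callable[[], Any]]] = None) -> Any:
    """Return a corpus object for a known name; pass other objects through."""
    loaders = CORPUS_LOADERS if loaders is None else loaders
    if not isinstance(corpus, str):
        return corpus
    if corpus not in loaders:
        raise ValueError(f"unknown corpus {corpus!r}; choose from {sorted(loaders)}")
    return loaders[corpus]()


def find_acronyms_by_name(phrase: str, corpus: Any = "gutenberg",
                          finder: Optional[Callable[..., Any]] = None, **kwargs: Any) -> Any:
    """Resolve the corpus, then delegate to find_acronyms (assumed signature)."""
    if finder is None:
        # Assumption: this import path matches the one used in the thread.
        from acronym.acronym import find_acronyms as finder
    return finder(phrase, resolve_corpus(corpus), **kwargs)
```

Callers could then write `find_acronyms_by_name("Hello World", corpus="brown", min_length=2)` without importing nltk themselves, while still being able to pass a corpus object directly.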