Closed (shaz13 closed 6 years ago)
Note: The code uses NLTK's wordnet corpus and stopwords list, which must be downloaded into the environment beforehand using nltk.download('wordnet') and nltk.download('stopwords'). Are these persistent in the Neptune environment, or does each run require the download again?
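Whichever way persistence works, one way to sidestep the question is to download only what is missing at the start of each run. A minimal sketch, assuming both resources live under NLTK's corpora/ directory; the helper name ensure_nltk_resources and its find/download parameters are hypothetical and not part of this PR:

```python
def ensure_nltk_resources(names=("wordnet", "stopwords"), find=None, download=None):
    """Download the named NLTK corpora only if they are not already present.

    find and download default to nltk.data.find and nltk.download; they are
    parameters (an assumption of this sketch) so the logic can be exercised
    without network access.
    Returns the list of names that had to be downloaded.
    """
    if find is None or download is None:
        import nltk  # imported lazily so the helper can be tested without NLTK

        find = find or nltk.data.find
        download = download or nltk.download

    fetched = []
    for name in names:
        try:
            # nltk.data.find raises LookupError when the resource is absent.
            find(f"corpora/{name}")
        except LookupError:
            download(name)
            fetched.append(name)
    return fetched
```

Calling ensure_nltk_resources() once before preprocessing makes the run independent of whether the environment keeps earlier downloads: if the corpora are already there, nothing is fetched.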
@jakubczakon @kamil-kaczmarek Added an option to choose whether or not to use stopwords. Added the corresponding nltk.download calls. Awaiting a decision on where to place the APPO dict. PTAL :)
Hey @shaz13, I have discussed APPO with @jakubczakon, and we have decided to do it in a clean way. That is:
APPO goes in external_data/apostrophes.json. Note that it is a JSON file.
In steps/preprocessing.py you can load it into a dict using the json module.
When this is done, we are ready to merge.
Sorry for iterating on this multiple times - we just want to maintain a clean implementation ;-)
Thanks!
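The JSON approach described above could look roughly like the following. This is only a sketch: the function names load_apostrophes and expand_apostrophes are hypothetical, and it assumes the file holds a flat object mapping contractions to their expansions (for example "aren't" to "are not"), which the thread does not spell out.

```python
import json


def load_apostrophes(path="external_data/apostrophes.json"):
    """Load the apostrophe mapping from a JSON file into a dict."""
    with open(path, encoding="utf-8") as f:
        mapping = json.load(f)
    # Assumption: the file is a single JSON object, not a list.
    if not isinstance(mapping, dict):
        raise ValueError(f"{path} must contain a JSON object")
    return mapping


def expand_apostrophes(text, mapping):
    """Replace whitespace-separated tokens found in the mapping."""
    return " ".join(mapping.get(token, token) for token in text.split())
```

Keeping the mapping in a data file means the dict can be edited without touching steps/preprocessing.py, which only needs to call the loader once.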
@shaz13 @jakubczakon merge done.