EDIT: The feature list differs between the two versions, probably because NLTK and spaCy use different lemmatization rules and stopword lists. Hold off on merging until more testing is done.
What's New?
Removed pandas and nltk as dependencies; replaced their functionality with the already-included csv and spacy
Major restructuring of question_classifier.py
Performance improvements to NLP pipeline (QuestionClassifier now 10x faster!)
Renamed NIMBUS_NLP.py to variable_extractor.py
Moved functionality from NimbusNLP class in NIMBUS_NLP.py to nimbus.py
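For reviewers, here is a minimal sketch of the kind of swap involved in dropping pandas for the standard-library csv module. The `load_rows` helper and the `question`/`label` columns are hypothetical and are not taken from this PR.

```python
import csv
import io
from typing import Dict, List


def load_rows(csv_file) -> List[Dict[str, str]]:
    """Read every row of an open CSV file into a list of dicts keyed by header."""
    # csv.DictReader treats the first row as the field names.
    return list(csv.DictReader(csv_file))


# Hypothetical sample data; the real file's columns may differ.
sample = io.StringIO("question,label\nWhat is X?,what\n")
rows = load_rows(sample)
print(rows[0]["question"])
```

Unlike pandas, csv.DictReader does no type inference, so every value comes back as a string and any numeric column would need an explicit conversion.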
Type of change (pick-one)
[ ] Bug fix (non-breaking change which fixes an issue)
[x] New feature (non-breaking change which adds functionality)
[ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
[ ] This change requires a documentation update
How Has This Been Tested?
QuestionClassifier was tested on its own with a variety of inputs against the original QuestionClassifier to confirm that both produce the same output. End-to-end tests were also run afterwards and appeared to work, answering several questions.
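To pin down where the two versions diverge, a comparison along these lines might help. This is only a sketch: `diff_features` is a hypothetical helper, and the feature names below are made up for illustration rather than taken from either classifier.

```python
from typing import Iterable, Set, Tuple


def diff_features(old: Iterable[str], new: Iterable[str]) -> Tuple[Set[str], Set[str]]:
    """Return (features only in the old list, features only in the new list)."""
    old_set, new_set = set(old), set(new)
    return old_set - new_set, new_set - old_set


# Hypothetical feature names, for illustration only.
only_old, only_new = diff_features(
    ["be", "professor", "office"],
    ["professor", "office", "hour"],
)
print(sorted(only_old), sorted(only_new))
```

Feeding it the feature lists produced by the old and new QuestionClassifier would show which entries come from lemmatization or stopword differences.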
Checklist (check-all-before-merge)
formatting help: - [x] means "checked" and - [ ] means "unchecked"
[ ] I documented my code according to the Google Python Style Guide
[ ] I ran ./build_docs.sh and the docs look fine
[ ] I ran ./type_check.sh and got no errors
[ ] I ran ./format.sh because it automatically cleans my code for me 😄
[ ] I ran ./lint.sh to check for what "format" missed
[ ] I added my tests to the /tests directory
[ ] I ran ./run_tests.sh and all the tests pass