Use spaCy to tokenize free text

mozilla / webcompat-ml

Webcompat machine learning models

Mozilla Public License 2.0

Use spaCy to tokenize free text #1

Open johngian opened 5 years ago

johngian commented 5 years ago

Currently we concatenate all the issue bodies and titles into a single corpus and tokenize our input based on that. I think it might improve our performance if we add NLP to our data processing pipeline.
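As a rough sketch of what this could look like, the snippet below tokenizes each issue separately with spaCy instead of splitting one concatenated corpus. It assumes a blank English pipeline (`spacy.blank("en")`, tokenizer only, no trained model), and the `tokenize_issue` helper and the sample issue are hypothetical, not taken from the current code.

```python
import spacy

# Assumption: a blank English pipeline is enough here. It provides only
# spaCy's rule-based tokenizer and needs no trained model download.
nlp = spacy.blank("en")


def tokenize_issue(title, body):
    """Hypothetical helper: tokenize one issue's title and body.

    Returns lowercased tokens, skipping whitespace and punctuation.
    """
    doc = nlp(f"{title}\n{body}")
    return [
        token.lower_
        for token in doc
        if not (token.is_space or token.is_punct)
    ]


# Made-up issue data, for illustration only.
issues = [
    {"title": "Video does not play", "body": "The player shows a blank screen."},
]

tokens = [tokenize_issue(issue["title"], issue["body"]) for issue in issues]
```

Swapping in a trained pipeline such as `en_core_web_sm` would additionally give lemmas and part-of-speech tags, at the cost of a model download and slower processing.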