JuliaText / TextAnalysis.jl

Julia package for text analysis

Use datadeps for AvgPerceptronTagger, add pos tagging over document types #166

Closed: Ayushk4 closed this 4 years ago

Ayushk4 commented 5 years ago

I have switched to using DataDeps instead of storing the weights locally, similar to the NER API. For now, the weights file is fetched directly from the version control history, but it could perhaps be released as an asset instead.

Ayushk4 commented 4 years ago

Thanks for the review. I will make the changes ASAP.

Ayushk4 commented 4 years ago

@aviks I have made the suggested changes.

I also added POS tagger support for the various Document types and for String types, and updated the tests, docstrings, and documentation accordingly.

Please review this, since there will be some merge conflicts with #167.

Ayushk4 commented 4 years ago

As of now, the AvgPerceptronTagger takes its weights from the git version control history. It may be neater to release the file as an asset on GitHub (similar to Metalhead.jl).

Also, various other NLP libraries such as spaCy use an averaged perceptron tagger for POS tagging. Perhaps the model weights from these libraries could be tested alongside the ones we currently provide, and the best set could then be made available from this package. Once this is done, we can release the weights.

aviks commented 4 years ago

> release the file as an asset on GitHub

I have added the file (zipped) to a release of this package.

https://github.com/JuliaText/TextAnalysis.jl/releases/download/v0.6.0/pretrainedMod.bson.zip

Ayushk4 commented 4 years ago

I have changed the link to the Perceptron Tagger weights.