JuliaText / CorpusLoaders.jl

A variety of loaders for various NLP corpora.

Adding gmb dataset #39

Open tejasvaidhyadev opened 4 years ago

tejasvaidhyadev commented 4 years ago

Adding the GMB dataset. The dataset is an extract from the GMB corpus that is tagged, annotated, and built specifically to train a classifier to predict named entities such as names, locations, etc.

tejasvaidhyadev commented 4 years ago

Thank you. I will implement the suggested changes (including docs and tests) soon.

tejasvaidhyadev commented 4 years ago

Hi @oxinabox, I added some testsets by taking examples from other datasets. I don't know much about tests and I am still learning. Let me know what other tests can be added.

tejasvaidhyadev commented 4 years ago

Hi @oxinabox, for now I have added only the POS tags of GMB, as my project only needs POS tags. I will also implement NER tags soon. Thanks.