stucco / auto-labeled-corpus

Corpus of auto-labeled text for the cyber security domain

There are 1274 NaNs in place of the word in full_corpus.json #7

Open Kirushikesh opened 3 years ago

Kirushikesh commented 3 years ago

In full_corpus.json, 1274 entries have NaN where the word should be. The tags attached to those entries are {'B-update', 'O', 'B-version', NaN}. The corpus may need more cleaning.
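For anyone who wants to reproduce the count, here is a minimal sketch using only the standard library. It rests on assumptions I have not verified against the file: that the JSON is a nesting of lists whose innermost lists are token records, with the word first and the tag last. The helper names and the `word_index` / `tag_index` parameters are hypothetical, and the sample data is made up for illustration.

```python
import json
import math
from collections import Counter


def is_missing(value):
    """True for None or a float NaN (json.loads maps the bare token NaN to float('nan'))."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def iter_token_records(node):
    """Yield every innermost list, i.e. a non-empty list holding no nested lists or dicts."""
    if isinstance(node, list):
        if node and not any(isinstance(item, (list, dict)) for item in node):
            yield node
        else:
            for item in node:
                yield from iter_token_records(item)


def summarize_missing_words(corpus, word_index=0, tag_index=-1):
    """Count records whose word is missing and tally the tags attached to them.

    ASSUMPTION: the word sits at word_index and the tag at tag_index.
    """
    tags = Counter()
    for record in iter_token_records(corpus):
        if is_missing(record[word_index]):
            tag = record[tag_index]
            # Store missing tags under the string "NaN" so they can be compared and printed.
            tags["NaN" if is_missing(tag) else tag] += 1
    return sum(tags.values()), tags


# Tiny hypothetical sample, NOT taken from full_corpus.json.
sample = json.loads(
    '[[[["Adobe", "B-vendor"], [NaN, "B-version"], [NaN, "O"]],'
    ' [[NaN, NaN], ["update", "B-update"]]]]'
)
missing_count, tag_counts = summarize_missing_words(sample)
print(missing_count, dict(tag_counts))
```

To try it on the real file, load it with `json.load(open("full_corpus.json"))` and pass the result to `summarize_missing_words`, adjusting the two index parameters if the record layout differs from the assumption above.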