nltk nltk_data issues - Githubissues

nltk / nltk_data

NLTK Data

1.4k stars, 1.03k forks

issues

Adds Biden's inaugural address
#169 nimbusaeta closed 2 years ago
1

Updated `[0]VP(eva` to `[0] VP(eva` in sinica_treebank, see nltk/nltk#2467
#168 tomaarsen closed 2 years ago
1

put back missing stopwords
#167 ibnubay closed 2 years ago
2

Indonesian and others stopword is missing
#166 ibnubay closed 2 years ago
1

Wordnet31
#165 ekaf closed 2 years ago
1

Add a script
#161 luxinyu1 closed 2 years ago
0

Fix WordNet 3.0 gloss inconsistencies
#160 genericallyterrible opened 2 years ago
5

Bengali Stopwords is missing
#159 Nirzak closed 2 years ago
0

Hebrew stop words
#158 wzeyal closed 2 years ago
1

names.zip, names.xml: adding more names
#157 davidam closed 2 years ago
14

Added bengali stopwords
#156 Nirzak closed 2 years ago
4

corpus
#155 Ebaba2021 closed 2 years ago
2

NLTK ERROR when download punkt
#154 mengxun1437 opened 3 years ago
2

add more arabic stop words
#153 12mohaned closed 2 years ago
2

NLTK Data Delivery
#152 gaetan-dion opened 3 years ago
0

murciélago , spanish for "bat" is not found in wordnet (omw)
#151 JamesArthurHolland opened 3 years ago
2

downloader error what do I do?
#150 havesthdone closed 2 years ago
2

Stopwords licensing
#149 dikshashree closed 2 years ago
1

License of Punkt Tokenizer Models and Stopwords Corpus
#148 rahulmohang opened 4 years ago
0

Agape
#147 agpalma59 opened 4 years ago
0

Add ARCOSG (Annotated Reference Corpus of Scottish Gaelic)
#146 razorfish17 opened 4 years ago
1

Q: Is there any reason that twitter samples(in corpora) are held in full form, with all meta data - not only text?
#145 mkantautas opened 4 years ago
0

Add Malayalam language for PunktSentenceTokenizer()
#144 sabiqueqb closed 2 years ago
2

pan18
#143 futurespoir opened 4 years ago
0

Upgrade CMUDict to 0.7b
#142 begeekmyfriend closed 3 years ago
2

Update stopwords for Portuguese language
#141 davialvb closed 2 years ago
3

nltk_data compatibility with Windows
#140 benhuff closed 4 years ago
0

Slovene stowords have additional spaces at the end of each word
#139 PrimozGodec closed 4 years ago
2

How to add new langauge
#138 mhf-ir closed 4 years ago
1

Chinese simplified stopwords
#137 MangoPomelo closed 2 years ago
4

[nltk_data] Error with downloaded zip file
#136 ghost closed 4 years ago
3

Add 2 last inaugural speeches
#135 nimbusaeta closed 5 years ago
1

Correction of missing new spelling rules in german stopwords
#134 ndaheim closed 5 years ago
0

New spelling rules missing in german stopwords
#133 ndaheim closed 5 years ago
1

tajik stop words
#132 RVositov closed 5 years ago
0

Update build_pkg_index.py
#131 AishwaryaVarma closed 5 years ago
1

Two words misspelled in Spanish stopwords
#130 Palmero97 closed 5 years ago
0

Broken link in brown README
#129 pyfisch opened 5 years ago
0

Use unzipped files to facilitate contributions
#128 ArthurClemens closed 4 years ago
4

Update Slovene wordnet to sloWNet 3.1
#127 ArthurClemens closed 5 years ago
1

Verbnet all lemma list includes NNPs
#126 tttthomasssss opened 5 years ago
0

Update index to the latest dataset
#125 alvations closed 5 years ago
0

Verbnet identifier in index.xml mismatch
#124 alvations opened 5 years ago
1

Stale pickled PunktSentenceTokenizer in nltk_data/
#123 advgiarc opened 5 years ago
0

Tatoeba corpora
#122 MohammedBelkacem opened 5 years ago
0

Cora
#121 seifaissa closed 5 years ago
0

Hinglish and Hindi stop-words
#120 TrigonaMinima closed 2 years ago
4

How can I get the raw data of English Proper Nouns?
#119 jxlwqq closed 5 years ago
1

Add Russian language for PunktSentenceTokenizer()
#118 Mottl closed 5 years ago
7

error when trying to import panlex_swadesh
#117 lingdoc opened 6 years ago
2
