-
Post questions here for one or more of our fundamentals readings:
Manning and Schütze. 1999. Foundations of Statistical Natural Language Processing. MIT Press:
Chapter 3 (“Linguistic foundati…
-
The new OMW version 1.4 includes the "-s" type in the identifier of many synsets:
```
4011 slk/wn-data-slk.tab
2649 slk/wn-data-lit.tab
1 nld/wn-data-nld.tab
```
Also, across the whole OMW 1…
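
A per-file tally like the one above could be reproduced with a short script. This is only a sketch: it assumes each data line in an OMW `.tab` file starts with a synset identifier of the form `<offset>-<pos>` followed by a tab, and that comment lines start with `#`. The function name `count_pos_tags` and the sample lines are hypothetical, and the script counts matching lines, not distinct synsets.

```python
import re

# Assumption: identifiers look like "00001740-s" (digits, hyphen, one POS letter).
SYNSET_ID = re.compile(r"^\d+-([a-z])$")


def count_pos_tags(lines, pos="s"):
    """Count data lines whose synset identifier ends in the given POS letter."""
    count = 0
    for line in lines:
        # Skip comment/header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        # The identifier is assumed to be the first tab-separated field.
        synset_id = line.split("\t", 1)[0]
        match = SYNSET_ID.match(synset_id)
        if match and match.group(1) == pos:
            count += 1
    return count


# Made-up sample lines for illustration only (not taken from OMW).
sample = [
    "# hypothetical header line",
    "00000001-a\txxx:lemma\texample1",
    "00000002-s\txxx:lemma\texample2",
    "00000003-s\txxx:lemma\texample3",
    "00000004-n\txxx:lemma\texample4",
]

print(count_pos_tags(sample))
```

To run it on a real file, pass an open file handle instead of `sample`, for example `count_pos_tags(open(path, encoding="utf-8"))`, where `path` points to one of the `.tab` files.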
-
_From @benjaminmillhouse on December 21, 2016 17:33_
- VSCode Version: Code 1.8.0 (38746938a4ab94f2f57d9e1309c51fd6fb37553d, 2016-12-13T17:38:28.425Z)
- OS Version: Darwin x64 16.3.0
- Extensions:
…
-
### Model description
SwahBERT: Language model of Swahili. It is a pretrained monolingual language model for Swahili. The model was trained for 800K steps using a corpus of 105MB that was collected fro…
-
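Study chapters 1 and 2 of "Natural Language Processing with PyTorch" before the meeting.
Each person should summarize only the important parts and add them as a comment.
* Chapter 1 code: [GitHub repository](https://github.com/rickiepark/nlp-with-pytorch)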
-
I would like to test a tokenizer, but this corpus does not include the original sentences. Would it be possible to include the original sentences?
Some of the sentences seem to come from here: ht…
-
For verbs with `Aspect=Imp|Mood=Ind|Number=Sing|Person=3|Tense=Pres` such as
```
120 é AUX
23 pode AUX
21 está AUX
20 está VERB
18 tem VERB
16 É AUX
15 é VERB
15 diz VERB
…
-
I suggest that we expand the section about the datasets that constitute the ERG treebanks: https://github.com/delph-in/docs/wiki/RedwoodsTop
Currently, the wiki page refers the reader to Flickinger …
-
- [EWT](http://match.grew.fr/?corpus=UD_English-EWT@dev&custom=61143c75d25bb&clustering=N.lemma)
- [GUM](http://match.grew.fr/?corpus=UD_English-GUM@dev&custom=61143d3f0cb01&clustering=N.lemma)
Ma…
-
I'm trying to package your module as an RPM package, so I'm using the typical PEP 517-based build, install, and test cycle used when building packages from a non-root account.
- `python3 -sBm build -w --no…