monolingual-corpora Search Results

133 results for monolingual-corpora


AI4Bharat/indicnlp_catalog #2

Monolingual datasets for some Indian languages

Hi, there are some monolingual corpora available for Indian languages at this [site](http://wortschatz.uni-leipzig.de/en/download/) under **All Languages**; the corresponding paper is [here](http://www.lrec…

deciphyre updated 5 years ago
1
AI4Bharat/indicnlp_catalog #4

Large Twitter datasets for Telugu and Hindi

Hi, in this repo (https://github.com/bedapudi6788/LOIT) I added large Twitter datasets for Telugu (7.9 million) and Hindi (17.6 million), along with fastText skip-gram and CBOW word vectors for the same.

bedapudi6788 updated 5 years ago
1
cltk/cltkv1 #10

Add fastText to cltkv1 (was: CLTK 2.0 fastText inclusion pro…

fastText is a new library for creating vector models of words; it was developed and released by the Facebook AI team. https://github.com/facebookresearch/fastText https://fasttext.cc/docs/en/aligned-v…

todd-cook updated 4 years ago
24
ufal/lindat-kontext #160

links in corplist

This issue is about corplist. Even though the groups "LINDAT monolingual corpora", "LINDAT speech corpora" and "LINDAT parallel corpora" are not marked in any special way in corplist, they are listed…

Ansa211 updated 5 years ago
1
openai/gpt-2 #20

Translation task

What was the format for the translation task? Do you provide a sequence of pairs delimited by newlines, e.g. "sentence1 = translation_of_sentence1 \n sentence2 = translation_of_sentence2 \n ... \n testing…

djstrong updated 5 years ago
3
bheinzerling/bpemb #19

multilingual text

Hey, thanks for sharing the code. I am working on multilingual text. Can I specify more than one language when segmenting words/sentences?

rohitsaluja22 updated 5 years ago
2
google-research/bert #633

Is multilingual model cross-lingual?

Hi, I wonder if BERT multilingual representations can perform like other multilingual embeddings obtained by aligning monolingual embeddings (like [fastText multilingual](https://github.com/Babylonpar…

wanicca updated 5 years ago
2
ufal/lindat-kontext #214

corplist is slow and clumsy

- [ ] Loading corplist takes ages.
- [ ] click Universal Dependency 2.3 -> unfolding the list of corpora takes ages
- [ ] click on the name of your favourite language next to a corpus in that langua…

Ansa211 updated 5 years ago
3
bitextor/bicleaner #16

Training instructions lacking...

I downloaded the en-de model, and am now trying to replicate training. I had to make some guesses (e.g., how to specify the training data, what the switches -m and -c need), so I am running this …

phikoehn updated 5 years ago
9
yunsukim86/wbw-lm #1

Language Model Question

Hi Yunsu, thanks for your excellent work. I am trying to reproduce the results in your paper, and I have a question about the language model. For training with KenLM, what corpus is used? Same with the …

BinWang28 updated 5 years ago
2
