-
Hi, I used the released NLLB checkpoint to decode the FLORES Chinese test set, and overall the results look good. However, I found that a lot of very common Chinese characters/tokens are missing from the dicti…
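For anyone who wants to reproduce this kind of check, here is a rough sketch of how missing characters could be counted. It assumes a character is "covered" only if it appears as a standalone single-character entry in the dictionary, which may not match how the real tokenizer handles fallback. `missing_characters`, `toy_dictionary` and `toy_text` are made-up names and data for illustration, not anything from the NLLB release.

```python
from collections import Counter


def missing_characters(text, dictionary_tokens):
    """Count non-space characters in `text` with no single-character dictionary entry."""
    # Assumption: only single-character entries count as covering a character.
    known = {tok for tok in dictionary_tokens if len(tok) == 1}
    counts = Counter(
        ch for ch in text if not ch.isspace() and ch not in known
    )
    # Most frequent missing characters first.
    return counts.most_common()


# Toy data, purely illustrative.
toy_dictionary = ["我", "的", "▁", "你好"]
toy_text = "我的 你 好 好"

print(missing_characters(toy_text, toy_dictionary))
```

Running something like this over the test set and the actual dictionary file would give a frequency-sorted list of characters worth looking at.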
-
**The problem**
Many users have been using this feature to learn a language, to get their sentences translated, etc. However, some of them are not using this properly by choosing a wrong flag/langu…
-
Initially reported by: @drzax as #TRAC1402
Especially the $vocab->get_root_terms() function.
-
According to Martin Šeleng, the process is outdated and doesn't match the current procedure.
https://vocabularies.cessda.eu/contentguide/translations.html
-
Hello! How can I apply my own data?
-
See https://gramener.github.io/visual-vocabulary-vega/
-
Today I was going to train a gpt3_124m model, when I noticed that the max_seq_len is hardcoded [here](https://github.com/karpathy/llm.c/blob/d396cd18b71367f79cbaab8f8203e64e578f9ee8/train_gpt2.cu#L653…
-
Right now CountVectorizer sometimes sets ``self.vocabulary_`` outside of ``fit``. We usually prohibit this, but the common tests haven't reached the vectorizers yet.
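To make the concern concrete, here is a minimal sketch of the kind of check a common test could perform: compare the trailing-underscore attributes before and after calling `transform` on an unfitted estimator. `ToyVectorizer`, `fitted_attributes` and `sets_state_outside_fit` are hypothetical stand-ins written for this illustration; they are not scikit-learn's actual code or test helpers.

```python
def fitted_attributes(estimator):
    """Return public attribute names ending in '_' (the learned-state convention)."""
    return {
        name
        for name in vars(estimator)
        if name.endswith("_") and not name.startswith("_")
    }


class ToyVectorizer:
    """Hypothetical stand-in that mimics the behaviour described above."""

    def __init__(self, vocabulary=None):
        self.vocabulary = vocabulary

    def fit(self, docs):
        words = sorted({w for d in docs for w in d.split()})
        self.vocabulary_ = {w: i for i, w in enumerate(words)}
        return self

    def transform(self, docs):
        if not hasattr(self, "vocabulary_"):
            # Learned state is set outside of fit: the pattern under discussion.
            self.vocabulary_ = dict(self.vocabulary)
        return [[d.split().count(w) for w in self.vocabulary_] for d in docs]


def sets_state_outside_fit(estimator, docs):
    """Return the fitted attributes that appear after calling transform only."""
    before = fitted_attributes(estimator)
    estimator.transform(docs)
    return sorted(fitted_attributes(estimator) - before)


leaked = sets_state_outside_fit(
    ToyVectorizer(vocabulary={"a": 0, "b": 1}), ["a b a"]
)
print(leaked)
```

A non-empty result flags attributes that were created outside `fit`.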
-
## Issue
Originally tags were pitched as an internal form of organization for use in finding content and creating filtered views. Though the appearance of content will be addressed in https://githu…
-
The search widget doesn't index the vocabulary?
![LOCODES-for-AUSTRALIA-AU-Vocabulary-edi3-Standards-by-edi3 (1)](https://user-images.githubusercontent.com/13426392/94877966-6a9e4300-049f-11eb-865b-…