-
Some language pairs are oddly missing from ...WikiMatrix/list_of_bitexts.txt, contrary to my intuition about which ones would have more data and thus more matching sentences.
For example, Armenian (`hy`)…
-
Is it possible to use a pretrained XLM model created with the reference implementation (https://github.com/facebookresearch/XLM) with this code, or do I have to retrain my models using the huggingface …
-
Is the prompt used for educational content scoring part of this repo? Did you use Mixtral to score/classify content, or was a dedicated classifier trained?
zidsi updated
7 months ago
-
### Summary of the new feature / enhancement
Need a way to allow execution of only signed/trusted configurations, ideally with a cross-platform solution.
### Proposed technical implementation de…
-
**Is your feature request related to a problem? Please describe.**
Yes. Typically, in group conversations in chat apps, we find that one person says, for example, 'I got into university X' and there w…
-
# Welcome to the Common Voice Community !
> Common Voice aims to make speech technology accessible to everyone by building an open sourced dataset of labelled voice data that is representative of l…
-
# Sequence Tagging with Contextual and Non-Contextual Subword Representations: A Multilingual Evaluation
- authors: Benjamin Heinzerling, Michael Strube
- link: https://www.aclweb.org/anthology/P19-1…
himkt updated
5 years ago
-
- uid: unsupervised_cross_lingual_representation_learning_at_scale
- type: processed
- description:
- name: Unsupervised Cross-lingual Representation Learning at Scale
- description: This pap…
-
Hi, I noticed that for br, cy, mt, and ga (ru is fine), the translated multi-lingual sentences tend to be one sentence shorter than the original English ones. For example, one datapoint from the br tr…