-
Giorgia:
(Re)construing Meaning in NLP
Sean Trott | Tiago Timponi Torrent | Nancy Chang | Nathan Schneider
-
Wav2vec2 makes it possible for low-resource languages to build high-quality acoustic models using only unlabelled audio. Fine-tuning this with a couple of hours of labelled data gives you a pretty good…
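As a rough sketch of what that fine-tuning setup looks like (assuming the HuggingFace `transformers` API; the tiny randomly initialized config below stands in for a real pretrained checkpoint such as `facebook/wav2vec2-base`, which you would load with `from_pretrained` instead):

```python
import torch
from transformers import Wav2Vec2Config, Wav2Vec2ForCTC

# Tiny random config standing in for a pretrained checkpoint.
# In practice: Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base", vocab_size=...)
config = Wav2Vec2Config(
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
    vocab_size=32,  # size of the labelled data's character vocabulary
    conv_dim=(32, 32, 32, 32, 32, 32, 32),
)
model = Wav2Vec2ForCTC(config)
model.eval()

# One second of 16 kHz audio -> one CTC logit per ~20 ms frame.
wave = torch.randn(1, 16000)
with torch.no_grad():
    logits = model(input_values=wave).logits
print(logits.shape)  # (batch, frames, vocab_size)
```

Fine-tuning then just means training this CTC head (and optionally the transformer layers) on the few hours of transcribed audio.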
-
# Welcome to the Common Voice Community!
> Common Voice aims to make speech technology accessible to everyone by building an open-source dataset of labelled voice data that is representative of l…
-
I have a bunch of private unlabelled speech corpora for Indian language families, so it's an obvious choice that I would want to continually pre-train the w2v-BERT 2.0 model on my extende…
-
I think we need to explore multilingual models such as [wav2vec2-xls-r-300m-21-to-en](https://huggingface.co/facebook/wav2vec2-xls-r-300m-21-to-en) to see if the 300M models are better than the 53M mo…
-
Hi! Your research looks great. How much effort would it take to adapt the code to other languages? Portuguese, for example.
Thanks in advance!
-
In continuation of #13, there is a feature gap between the two offered search engines, and a gap in the resources needed to provide the desired functionality.
While the list of sug…
-
I would like to run `meta_nmt5.py`, as I wish to do some work on meta-learning for low-resource languages.
I am unable to run the code because I could not find any documentation showing what a `.tok` file shou…
-
Is the prompt used for educational content scoring part of this repo? Did you use Mixtral to score/classify content, or was a dedicated classifier trained?
-
I'm training FastSpeech2 with a multilingual TTS dataset, as below.
- Number of samples: 300,000
- English (~44,000) + Chinese (~80,000) + Spanish (~30,000) + Japanese (~7,000) + Korean (~130,000): Total ~3000…