-
This repository is an absolute gem for those diving into machine learning and working on innovative projects. 🚀 To make it even more powerful, I'd love to contribute by adding a real-time translation …
-
Hello, thank you for sharing your code and the various pre-trained models.
I downloaded the Wikipedia corpus and the first model in order to run the notebook **lm-french-generation…
-
Hi there,
I am excited about this project. I have already used MBART50 from Facebook to translate text between English and other languages, and the translations were pretty solid.
We can augment…
-
I am trying to build an abstractive PreSumm model for Korean.
At first I used the bert-multilingual model, but I found its tokenizer was poor, so I decided to use a SentencePiece model which was tr…
-
Hi,
When trying to generate intermediate results with the following command:
```
dataset = 'tiny'
gpl.train(
    path_to_generated_data=f"generated/{dataset}",
    base_ckpt='sentence-transfor…
```
-
### Request for Adding ReazonSpeech's reazonspeech-k2-v2 Model
Hi, first of all, thank you for your excellent work on sherpa-rs!
I would like to request the addition of support for the **reazonspe…
-
**Is your feature request related to a problem? Please describe.**
The current logic of misspelling identification relies on `vocab.txt` from the transformer model. BERT tokenisers break not such com…
-
I get "File must be 200.0MB or smaller." when inputting a file over 200 MB. I was hoping it could be transcribed/merged without showing the video if it is >200 MB, as I would assume that's the main issue, that the lib…
-
We currently only allow a subset of ASCII in usernames. It would be nice if speakers of languages other than English had more freedom in their choice, e.g. allow them to add äßé or a username in Cyrilli…
-
Hello,
First of all, thank you for the great work!
I was excited to try out this powerful text segmentation model, so I tested it with both an English text and a translated Korean text.
However, …