-
# 🚀 Feature request
Allow mBART and M2M100 to be easily fine-tuned with multiple target languages in the fine-tuning data set, probably by allowing forced_bos_token_id to be provided in the trainin…
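For context, mBART-style models expect the target-language code as the first token of the labels, so fine-tuning on a mixed-language dataset amounts to forcing a per-example BOS token rather than one global `forced_bos_token_id`. A minimal sketch of that label construction, where the `lang_code_to_id` mapping and the token ids are hypothetical stand-ins for a real tokenizer's values:

```python
# Sketch: prepend each example's target-language code token to its labels,
# so a single batch can mix target languages. The ids below are made-up
# stand-ins for a real mBART tokenizer's lang_code_to_id mapping.
lang_code_to_id = {"en_XX": 250004, "zh_CN": 250025, "ja_XX": 250012}

def build_labels(target_ids, tgt_lang, eos_token_id=2):
    """Return labels with the target-language code forced as the first token."""
    return [lang_code_to_id[tgt_lang]] + list(target_ids) + [eos_token_id]

batch = [
    build_labels([101, 102], "en_XX"),
    build_labels([201, 202, 203], "ja_XX"),
]
print(batch[0])  # the en_XX code leads, then the tokens, then EOS
```

With labels built this way per example, the trainer no longer needs a single dataset-wide `forced_bos_token_id`.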
-
I am working on 8 V100 16 GB GPUs and I am trying to train a 3.7B-parameter mt5-xl model with DPO.
I managed to load both the model and the reference model as separate 8-bit instances of mT5-xl. How…
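As background for the setup above, the DPO loss itself is cheap once the policy and frozen reference log-probabilities are in hand; a minimal sketch of the per-pair objective (pure math, independent of any particular trainer, with `beta=0.1` as an arbitrary example):

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Direct Preference Optimization loss for one preference pair:
    -log sigmoid(beta * ((pi_c - ref_c) - (pi_r - ref_r)))."""
    logits = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    return -math.log(1.0 / (1.0 + math.exp(-logits)))

# At indifference (policy == reference) the loss is -log(0.5) ~= 0.693;
# when the policy prefers the chosen answer more than the reference does,
# the loss drops below that.
print(dpo_loss(-10.0, -12.0, -11.0, -11.0))
```

This is why the memory cost in the setup above is dominated by holding two copies of the model, not by the loss computation itself.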
-
How do you train a multilingual model covering Simplified and Traditional Chinese, Japanese, Korean, English, and so on?
-
Cool project! Wondering whether this should include MQL files for MT5? Thanks
-
Hi
First of all, thank you for your great work on this project. You've achieved among the best results on the Spider benchmark, and your clear, complete README allowed me to run your code very easily.…
-
Hey guys, I'm having a problem getting DeepSpeed working with XLM-Roberta. I'm trying to run it on an Amazon Linux machine, which is based on Red Hat. Here are some versions of the packages/dependencies…
-
If you haven’t already, check out our [contributing guidelines](https://github.com/Expensify/ReactNativeChat/blob/main/contributingGuides/CONTRIBUTING.md) for onboarding and email contributors@expensi…
-
I am trying to fine-tune the mt5 model for a grammatical error correction task using happy-transformers, following the provided tutorial. However, it takes too long (approximately 12 hours!) …
-
**Source:** https://oscar-project.github.io/documentation/versions/oscar-2019/
**Description:** EDA + Clean Thai language part of OSCAR 2019.
**Note:** You might want to sample only a small amount of…
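One way to take such a sample without ever holding the full corpus in memory is single-pass Bernoulli sampling; a minimal sketch (the 1% rate and the synthetic corpus are arbitrary examples):

```python
import random

def sample_lines(lines, rate=0.01, seed=42):
    """Keep each line independently with probability `rate`.
    Streams the input once, so the full corpus never sits in memory."""
    rng = random.Random(seed)
    for line in lines:
        if rng.random() < rate:
            yield line

# Usage: a generator stands in for a large corpus file.
corpus = (f"doc {i}" for i in range(100_000))
sample = list(sample_lines(corpus, rate=0.01))
print(len(sample))  # roughly 1,000 documents
```

The seed makes the subsample reproducible, which matters if the cleaned sample is later shared or re-derived.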
-
**Source:** https://huggingface.co/datasets/mc4/
**Description:** Clean Thai language part of mC4.
- Gambling websites
![Image](https://user-images.githubusercontent.com/56959186/227706424-53c5e556-7…