-
Is there any plan or example for mT5 support?
-
https://github.com/huggingface/transformers/blob/7bbc62474391aff64f63fcc064c975752d1fa4de/src/transformers/trainer.py#L3728
Maybe using this fixes it:
```py
for param in model.parameters():…
```
-
I'm getting an error when building. It seems the debianbullseye image is no longer available.
```
[+] Building 0.8s (2/2) FINISHED …
```
-
How do I download the weights of google/mt5-xxl? It seems to be extremely large (over 50 GB).
-
opensora/serve/gradio_web_server.py references
`text_encoder = MT5EncoderModel.from_pretrained("/storage/ongoing/new/Open-Sora-Plan/cache_dir/mt5-xxl", cache_dir=args.cache_dir,
…
-
Hi,
I was running the `unicamp-dl/mt5-base-en-msmarco`: ['▁no', '▁yes'] model for both English and the other Mr. TyDi languages, but the output scores are `nan`. When I switched to `unicamp-dl/mt5-13b-m…
-
https://github.com/user-attachments/assets/a1edd7c9-5015-4070-8de7-2ffed9860908
The command is:
CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --nnodes=1 --nproc_per_node 1 --master_port 29514 \
-m …
-
I'm trying to use the mT5 model. For mT5, the tensorrt-llm build creates separate engines for the encoder and the decoder. How should I organize the directory structure in this case? (In all models, there seems to be on…
hpk23 updated
10 months ago
-
It is currently unclear why some models appear before others when the page is sorted by classification (which is the default).
For example, why isn't XGen before OLMO?
Why aren't the models wi…