-
**Describe the bug**
When defining a value for `repetition_penalty` and fine-tuning the model, predictions fail with the following error:
```
Prediction: 0%| …
-
```
Traceback (most recent call last):
2024-08-01T21:29:17.880522621Z File "/src/handler.py", line 6, in
2024-08-01T21:29:17.880527641Z vllm_engine = vLLMEngine()
2024-08-01T21:29:17.880533…
-
### System Info
- `transformers` version: 4.46.2
- Platform: Linux-6.1.85+-x86_64-with-glibc2.35
- Python version: 3.10.12
- Huggingface_hub version: 0.24.7
- Safetensors version: 0.4.5
- Accele…
-
### Summary
The [mlflow.transformers.generate_signature_output](https://mlflow.org/docs/latest/python_api/mlflow.transformers.html#mlflow.transformers.generate_signature_output) function is a utilit…
-
With very large open models like SD3 medium and Flux.1 gaining popularity, it's becoming common to provide the diffusion model (unet/diffusion transformer) part of the model and the text encoders separa…
-
**What would you like to be added**:
Right now we can download model weights from the model hub directly, but each time we start/restart a pod, it downloads the model weights again. Without …
-
### Prerequisite
- [X] I have searched [Issues](https://github.com/open-compass/opencompass/issues/) and [Discussions](https://github.com/open-compass/opencompass/discussions) but cannot get the expe…
-
Hello,
I am getting this error constantly when trying to run the first code block in a Jupyter notebook or the Gradio interface. I tried upgrading the packages separately, downgrading and installing a …
-
### Question
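While running training, I hit a shape mismatch in the inputs to the pos_embed operator. What is the likely cause? I am on PyTorch 2.3 (pt2.3); the other dependency versions are the ones listed in requirements.txt.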
[rank5]: File "/torch/venv3/pytorch/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, …
-
### Feature request
I want to add L1/L2 regularization to the transformer training.
### Motivation
Adding L1/L2 reg can promote sparser models that can accelerate inference and reduce storage.
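
To make the request concrete, here is a rough sketch of what "adding L1/L2 reg" means for the loss: a penalty of `l1 * sum(|w|) + l2 * sum(w^2)` is added to the base training loss. This is plain Python over a flat list of weights, not the `transformers` training API; the function names `l1_l2_penalty` and `regularized_loss` and the `l1`/`l2` coefficients are made up for illustration.

```python
from typing import Iterable


def l1_l2_penalty(weights: Iterable[float], l1: float = 0.0, l2: float = 0.0) -> float:
    """Return l1 * sum(|w|) + l2 * sum(w^2) over the given weights.

    Hypothetical helper for illustration only; a real trainer would
    iterate over the model's parameter tensors instead of a flat list.
    """
    l1_term = 0.0
    l2_term = 0.0
    for w in weights:
        l1_term += abs(w)   # L1: sum of absolute values (pushes weights toward zero)
        l2_term += w * w    # L2: sum of squares (shrinks large weights)
    return l1 * l1_term + l2 * l2_term


def regularized_loss(base_loss: float, weights: Iterable[float],
                     l1: float = 0.0, l2: float = 0.0) -> float:
    """Add the combined L1/L2 penalty to an already computed base loss."""
    return base_loss + l1_l2_penalty(weights, l1, l2)


if __name__ == "__main__":
    # Toy example: two weights, small coefficients chosen arbitrarily.
    print(regularized_loss(1.0, [1.0, -2.0], l1=0.5, l2=0.25))
```

Setting `l1=0.0` leaves only the L2 term and vice versa, so the two coefficients can be tuned independently.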
###…