-
### Willingness to contribute
No. I cannot contribute this feature at this time.
### Proposal Summary
This feature request proposes to add support for logging FullyShardedDataParallel models …
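For context, below is a rough sketch of the kind of manual workaround such a feature would replace. It is not MLflow's proposed API; the helper name and arguments are assumptions. The idea is to gather a full (unsharded) state dict from the FSDP-wrapped model on rank 0 and log it through the existing `mlflow.pytorch.log_model` call.

```python
# Hypothetical workaround sketch, not the proposed MLflow API.
import mlflow.pytorch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
from torch.distributed.fsdp import StateDictType, FullStateDictConfig


def log_fsdp_model(fsdp_model, template_model, artifact_path="model"):
    """Hypothetical helper: `template_model` is an unwrapped copy of the same architecture."""
    cfg = FullStateDictConfig(offload_to_cpu=True, rank0_only=True)
    # Temporarily switch the FSDP module to produce a full (unsharded) state dict.
    with FSDP.state_dict_type(fsdp_model, StateDictType.FULL_STATE_DICT, cfg):
        state_dict = fsdp_model.state_dict()
    if dist.get_rank() == 0:
        # Load the gathered weights into the plain module and log that with the existing API.
        template_model.load_state_dict(state_dict)
        mlflow.pytorch.log_model(template_model, artifact_path)
```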
-
Dear author,
Thanks for your great work!
I found that vocab_size=152064 in the checkpoint's config.json, and the lm_head module has the same size.
However, when I print len(tokenizer), it is 151657.
This ca…
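For reference, a minimal sketch of how the two numbers can be compared (the checkpoint id below is a placeholder, not necessarily the exact model in question):

```python
from transformers import AutoConfig, AutoTokenizer

model_id = "Qwen/Qwen2-7B"  # placeholder checkpoint id
config = AutoConfig.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

print(config.vocab_size)   # size of the embedding / lm_head matrices, e.g. 152064
print(len(tokenizer))      # number of tokens the tokenizer actually defines, e.g. 151657
# The gap is typically padding of the embedding matrix plus reserved, unused token slots;
# ids >= len(tokenizer) are normally never produced by the tokenizer.
```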
-
### Bug description
I was able to fine-tune an 8B LLM using the Hugging Face training framework with PEFT + DeepSpeed stage 2 under fp16 precision (mixed-precision training). Recently I wanted to change…
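For reference, a minimal sketch of the DeepSpeed/precision part of that setup (the PEFT adapter configuration is attached to the model separately and is omitted here; the output directory is a placeholder):

```python
from transformers import TrainingArguments

# ZeRO stage 2 with fp16 mixed precision, matching the setup described above.
ds_config = {
    "fp16": {"enabled": True},
    "zero_optimization": {"stage": 2},
    "train_micro_batch_size_per_gpu": "auto",
    "gradient_accumulation_steps": "auto",
}

training_args = TrainingArguments(
    output_dir="outputs",    # placeholder
    fp16=True,               # must agree with the DeepSpeed fp16 setting
    deepspeed=ds_config,     # a dict or a path to a JSON config is accepted here
)
```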
-
### Issues Policy acknowledgement
- [X] I have read and agree to submit bug reports in accordance with the [issues policy](https://www.github.com/mlflow/mlflow/blob/master/ISSUE_POLICY.md)
### W…
-
Hi doc-builder team!
Thanks for your great library!
I am trying to use your library to build docs for my own project, but I am facing some difficulties.
I have created a file structure for docs s…
-
I trained the PEFT model on my dataset using the file finetune.py.
There is no difference between using and not using PEFT in the interface, so training with finetune.py does not seem to work.
I am v…
-
Hello,
the fine-tuning process completed successfully; however, when I try to run the inference separately by loading the model with the following code:
```
import torch
from transformers import AutoModelForCausalLM, Bits…
```
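For comparison, here is a minimal, self-contained sketch of loading a base model plus a saved LoRA adapter for inference. The checkpoint name and adapter path are placeholders, and 4-bit loading via `BitsAndBytesConfig` is assumed from the truncated import above.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)

# Load the base checkpoint, then attach the fine-tuned LoRA weights on top of it.
base = AutoModelForCausalLM.from_pretrained(
    "base-model", quantization_config=bnb_config, device_map="auto"  # placeholder names
)
model = PeftModel.from_pretrained(base, "./adapter")  # placeholder adapter path
model.eval()

tokenizer = AutoTokenizer.from_pretrained("base-model")
inputs = tokenizer("Hello", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0]))
```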
-
**Describe the bug**
```
from cosyvoice.cli.cosyvoice import CosyVoice
from cosyvoice.utils.file_utils import load_wav
import torchaudio

cosyvoice = CosyVoice('pretrained_models/CosyVoice-300M')
```
…
-
- [ ] automatic hyperparameter search
- [ ] support for more RLHF methods
- [ ] support for more models
- [ ] multimodal support
- [ ] auto-parallelization
- [ ] better dispatcher and monitoring
-
## Description
### Regression Test for Loss, Memory, Throughput
Comparisons of loss, memory, and throughput for Full-FT and PEFT:
- QLoRA: status quo on the switch to `torch_dtype=float16` (see the loading sketch below) (Referenc…
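As a reference point, a minimal sketch of the QLoRA loading path with the `torch_dtype=float16` switch made explicit (the checkpoint name is a placeholder, not a specific model from the comparison):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,  # compute dtype used inside the 4-bit layers
)

model = AutoModelForCausalLM.from_pretrained(
    "base-model",                          # placeholder checkpoint
    quantization_config=bnb_config,
    torch_dtype=torch.float16,             # the switch under test in the regression run
)
```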