-
- Environment:
- WSL2, Ubuntu 22.04, 1x RTX 4090 GPU
- train_sft.sh
```bash
CUDA_VISIBLE_DEVICES=0 python dbgpt_hub/train/sft_train.py \
--model_name_or_path $model_name_or_path \
--quantizati…
```
-
### 🐛 Describe the bug
```python
@record
def training_function(args):
    # get some base rank info
    # metric = evaluate.load("glue", "mrpc")
    world_size = os.getenv("WORLD_SIZE")
    rank = os.gete…
```
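For context, here is a minimal sketch of how per-process rank information is typically read from the environment variables that `torchrun` exports; the function names below are illustrative and not taken from the snippet above:

```python
import os

import torch
import torch.distributed as dist


def get_rank_info():
    # torchrun (and torch.distributed.launch) export these for every worker process
    world_size = int(os.getenv("WORLD_SIZE", "1"))
    rank = int(os.getenv("RANK", "0"))
    local_rank = int(os.getenv("LOCAL_RANK", "0"))
    return world_size, rank, local_rank


def setup_distributed():
    world_size, rank, local_rank = get_rank_info()
    if world_size > 1 and not dist.is_initialized():
        # NCCL is the usual backend for multi-GPU training
        dist.init_process_group(backend="nccl", rank=rank, world_size=world_size)
        torch.cuda.set_device(local_rank)
    return world_size, rank, local_rank
```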
-
Gradient accumulation (micro steps) can be very useful when we want a large effective batch size but only have a limited number of GPUs; a rough sketch is below.
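As an illustration (not this repo's implementation), gradient accumulation simply runs several micro batches and scales their losses before each optimizer step; the tiny model and data below are placeholders:

```python
import torch
from torch import nn

# tiny stand-ins so the loop is runnable; any model/optimizer works the same way
model = nn.Linear(16, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()
dataloader = [(torch.randn(4, 16), torch.randint(0, 2, (4,))) for _ in range(32)]

accum_steps = 8  # micro steps per optimizer step; effective batch = 4 * 8 = 32

optimizer.zero_grad()
for step, (inputs, labels) in enumerate(dataloader):
    loss = loss_fn(model(inputs), labels)
    # scale so the accumulated gradient is the mean over all micro batches
    (loss / accum_steps).backward()
    if (step + 1) % accum_steps == 0:
        optimizer.step()
        optimizer.zero_grad()
```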
-
Hello,
The problem is that the parameter `w` or `"{'test': {'combine_ratio': 0.6}}"` in the readme.md doesn't seem to work when running inference.
I tried setting values 0, 1, 0.1, 0.9 and compa…
-
![NG@T{Q JDW3%OVV{5 {04OL](https://github.com/user-attachments/assets/188f0cbc-32e6-4a60-94ad-0b44fdd752a9)
When we perform multi-machine, multi-GPU training, we get an out-of-memory err…
-
I see the warning below in the logs when running LoRA training; can it be ignored?
`/text-generation-webui-main/installer_files/env/lib/python3.11/site-packages/torch/utils/checkpoint.py:429: UserWar…
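Assuming this is the `use_reentrant` deprecation warning that recent torch versions emit from that line (the truncated message above doesn't confirm it), it is harmless and disappears once the flag is passed explicitly; a minimal sketch:

```python
import torch
from torch.utils.checkpoint import checkpoint

w = torch.randn(8, 8, requires_grad=True)

def block(x):
    return torch.relu(x @ w)

x = torch.randn(4, 8, requires_grad=True)
# Passing use_reentrant explicitly silences the warning; False selects the
# newer, non-reentrant checkpoint implementation.
y = checkpoint(block, x, use_reentrant=False)
y.sum().backward()
```

If the training enables checkpointing through a recent Hugging Face `transformers` version instead, the equivalent is usually passing `gradient_checkpointing_kwargs={"use_reentrant": False}`.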
-
### Describe the issue
Issue: Not able to fine-tune the LLaVA model with llava-v1.5-7b.
I am also sharing my arguments below; when I run the code it gives me the error:
size m…
-
```
{'_default_root_dir': '/data/DJL/DiffAD-main',
 '_fit_loop': ,
 '_is_data_prepared': False,
 '_lightning_optimizers': None,
 '_predict_loop': ,
 '_progress_bar_callback': ,
 '_stochastic_weight_avg': …
```
-
I used this code and trained with Korean ko-snil data.
adapter_config.json, adapter_model.safetensors, special_tokens_map.json, tokenizer_config.json, tokenizer.json, tokenizer.model
5 files wer…
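In case it helps, a minimal sketch of loading an adapter saved in that layout with PEFT; the base model name and adapter directory below are placeholders, not taken from this report:

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_model_name = "base-model-name"      # placeholder: the model the adapter was trained on
adapter_dir = "path/to/adapter_output"   # placeholder: folder with adapter_config.json etc.

tokenizer = AutoTokenizer.from_pretrained(adapter_dir)
base_model = AutoModelForCausalLM.from_pretrained(base_model_name)
# attaches adapter_model.safetensors / adapter_config.json on top of the base weights
model = PeftModel.from_pretrained(base_model, adapter_dir)
model.eval()
```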
-
First, thank you for creating nanoGPT. It has been an amazing learning experience! I have a question about vocab size and training. I have built nanoGPT and ran the Shakespeare data with a vocab size …
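For reference, the character-level vocab size in nanoGPT comes straight from the set of characters in the training text; a rough sketch of roughly what `data/shakespeare_char/prepare.py` does (the file path below is a placeholder):

```python
# Derive a character-level vocab from the raw text, similar to nanoGPT's
# data/shakespeare_char/prepare.py; "input.txt" is a placeholder path.
with open("input.txt", "r", encoding="utf-8") as f:
    data = f.read()

chars = sorted(set(data))
vocab_size = len(chars)  # ~65 unique characters for the Shakespeare text

stoi = {ch: i for i, ch in enumerate(chars)}
itos = {i: ch for ch, i in stoi.items()}

encode = lambda s: [stoi[c] for c in s]
decode = lambda ids: "".join(itos[i] for i in ids)

print(vocab_size, decode(encode("hello")))
```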