Open samarthsarin opened 1 year ago
@merrymercy any solution for this? Please help
@samarthsarin Please provide more information (full stack trace); it is hard to help by only seeing an assertion error.
Hi @zhisbug @merrymercy
Here are all the steps I followed for fine-tuning. I am using the train.py file with the dummy.json file provided in the README. My GPU has only 16 GB of memory, so I cannot load the full model. I have therefore slightly modified the code to convert and load the model in 8-bit using peft and bitsandbytes. The full error is as follows:
Traceback (most recent call last): …
I am also attaching the Jupyter notebook if you want to have a look. My environment: Python 3.9, CUDA 11.7.
While fine-tuning I get the following error: AssertionError: No inf checks were recorded for this optimizer.
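A quick check that may help narrow this down: as far as I know, PyTorch's GradScaler (used when fp16=True) raises this assertion when it steps an optimizer for which no parameter had a gradient. That can happen if every parameter is frozen, for example an 8-bit base model with no trainable adapter attached. The helper below is only a minimal sketch and is not part of train.py or peft. `count_trainable` is a hypothetical name, and it assumes nothing more than a model exposing `named_parameters()` the way a `torch.nn.Module` does.

```python
def count_trainable(model):
    """Return (trainable, total) parameter counts for a model.

    Hypothetical helper: relies only on `named_parameters()`, where each
    parameter exposes `requires_grad` and `numel()` (as torch.nn.Module does).
    """
    trainable = 0
    total = 0
    for _name, param in model.named_parameters():
        n = param.numel()
        total += n
        # Only parameters with requires_grad=True can receive gradients.
        if param.requires_grad:
            trainable += n
    return trainable, total
```

If `count_trainable(model)` reports 0 trainable parameters just before training starts, the frozen weights are a more likely culprit than the training arguments.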
Can anyone help me with this? Here are my training arguments:
per_device_train_batch_size=2, warmup_steps=100, num_train_epochs=3, fp16=True, logging_steps=1, output_dir='llama_output/', gradient_accumulation_steps=2, evaluation_strategy="no", save_strategy="no", save_steps=1200, learning_rate=2e-5, weight_decay=0., warmup_ratio=0.03, lr_scheduler_type="cosine", tf32=False, gradient_checkpointing=False
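Two of these arguments interact in ways that are easy to miss. As a sketch of my understanding of the Hugging Face Trainer (worth verifying against the installed version): a positive `warmup_steps` takes precedence over `warmup_ratio`, and the per-device batch size is multiplied by `gradient_accumulation_steps` to give the effective batch per optimizer step. The function names below are hypothetical and only mirror that arithmetic. The total of 1000 training steps is an assumed example value, not something taken from this run.

```python
import math


def resolve_warmup_steps(warmup_steps, warmup_ratio, num_training_steps):
    """Hypothetical helper mirroring how the two warmup settings combine."""
    # A positive warmup_steps wins; otherwise fall back to the ratio.
    if warmup_steps > 0:
        return warmup_steps
    return math.ceil(num_training_steps * warmup_ratio)


def effective_batch_size(per_device_batch, grad_accum_steps, num_devices=1):
    """Samples contributing to each optimizer step."""
    return per_device_batch * grad_accum_steps * num_devices


# Values from the arguments above; 1000 total steps is an assumption.
print(resolve_warmup_steps(100, 0.03, 1000))  # 100
print(effective_batch_size(2, 2))             # 4
```

Under these assumptions, the `warmup_ratio=0.03` setting has no effect, because `warmup_steps=100` is already positive.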