[Closed] campio97 closed this issue 1 month ago
Modify "cutoff_len", e.g. cutoff_len=4096
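For reference, a minimal sketch of where `cutoff_len` sits in a LLaMA-Factory training YAML. Parameter names follow the project's example configs; the dataset name, template, and paths here are placeholder assumptions, not the asker's actual settings:

```yaml
### model
model_name_or_path: mistralai/Mixtral-8x22B-Instruct-v0.1
quantization_bit: 4            # QLoRA 4-bit quantization

### method
stage: sft
do_train: true
finetuning_type: lora

### dataset
dataset: my_long_documents     # placeholder dataset name
template: mistral              # assumed chat template for Mixtral
cutoff_len: 4096               # sequences longer than this are truncated

### train
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
output_dir: saves/mixtral-8x22b-qlora
```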
Why? Does cutoff_len cut my records to the maximum set length? I need a length far greater than 4096.
A large cutoff_len needs more VRAM; it cannot fit into 3 × A100 GPUs.
I noticed that the issue is not so much about how many GPUs I use, because the computation is parallelized and a single GPU has 80 GB of VRAM, which is not enough with a long context. So I tried to enable LongLoRA with S^2-Attn, but the logs show a message that it is not supported for this model (Mixtral-8x22B-Instruct). Is it correct that it is not supported, or is it a bug?
S^2-Attn only supports LLaMA models for now.
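A hedged sketch of the option in question, assuming the `shift_attn` flag that enables LongLoRA's S^2-Attn in LLaMA-Factory; per the reply above it is only honored for LLaMA-architecture models, so on Mixtral a warning is expected:

```yaml
### method
finetuning_type: lora
shift_attn: true   # LongLoRA S^2-Attn; only effective for LLaMA-style models
```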
Thank you
Hello, I am trying to fine-tune the Mixtral-8x22B-Instruct model, but I keep getting an OOM error. I am using 3× A100 GPUs for a total of 240 GB of VRAM, with QLoRA 4-bit. After the first fine-tuning step it goes OOM.
My dataset consists of about 2,000 records, and they are all quite long texts; in some cases I think a single record corresponds to about 30,000 tokens.
Here is my "accelerate" configuration:
Here is my LLaMA-Factory configuration:
Error:
What am I doing wrong? Thank you in advance for your help.