vkaul11 opened this issue 6 months ago
Yes, you can. It will produce the same loss, but it did not enable a larger batch size in my experiments.
I am getting this error when I do this, though. Any idea why?

```
File "/workspace/cookbook-internal/recipes/common/peft.py", line 89, in load_train_model
    model = prepare_model_for_kbit_training(model)
File "/usr/local/lib/python3.10/dist-packages/peft/utils/other.py", line 137, in prepare_model_for_kbit_training
    model.gradient_checkpointing_enable(**gc_enable_kwargs)
File "/workspace/cookbook-internal/recipes/common/sloth_activation.py", line 63, in new_gradient_checkpointing_enable
    assert gradient_checkpointing_kwargs == None
AssertionError
```

Maybe using QLoRA instead of LoRA complicates things?
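If it helps narrow it down: in the k-bit (QLoRA) path, PEFT's `prepare_model_for_kbit_training` calls `model.gradient_checkpointing_enable(**gc_enable_kwargs)`, and if a non-None `gradient_checkpointing_kwargs` dict gets forwarded, the patched method rejects it. Here is a minimal sketch of the conflict, with the patched function body assumed from the traceback (only the assert line is known for certain):

```python
# Hypothetical reconstruction of the patched method in sloth_activation.py;
# only the assert is known for sure from the traceback.
def new_gradient_checkpointing_enable(self, gradient_checkpointing_kwargs=None):
    # The patch implements its own checkpointing scheme and does not accept
    # HF-style kwargs such as {"use_reentrant": False}.
    assert gradient_checkpointing_kwargs == None
    ...  # enable the sloth/offloaded checkpointing here

# PEFT (peft/utils/other.py, k-bit path) does roughly:
#   gc_enable_kwargs = {} if gradient_checkpointing_kwargs is None \
#       else {"gradient_checkpointing_kwargs": gradient_checkpointing_kwargs}
#   model.gradient_checkpointing_enable(**gc_enable_kwargs)
# so any non-None gradient_checkpointing_kwargs reaches the assert above and fails.
```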
I need it to reduce the memory footprint, not to increase the batch size.
A question: the `assert gradient_checkpointing_kwargs == None` check is what throws the error. Do I need to set `gradient_checkpointing_kwargs` to something, or should I comment out that line?
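One possible workaround (a sketch, not a tested fix): keep the assert, but stop PEFT from calling `gradient_checkpointing_enable` itself, and call the patched method yourself with no kwargs. `prepare_model_for_kbit_training` accepts a `use_gradient_checkpointing` flag for this; the rest is assumed from the traceback above:

```python
from peft import prepare_model_for_kbit_training

# `model` is the already-loaded 4-bit base model.
# Do the usual k-bit preparation, but skip PEFT's own gradient-checkpointing call,
# which is the one forwarding gradient_checkpointing_kwargs into the sloth patch.
model = prepare_model_for_kbit_training(model, use_gradient_checkpointing=False)

# Enable checkpointing through the patched method with no kwargs, which is the
# only form new_gradient_checkpointing_enable accepts.
model.gradient_checkpointing_enable()
```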
I was not clear about how to use the code: https://github.com/jzhang38/EasyContext/blob/main/train.py#L28. By uncommenting this line, can we enable the sloth code?
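I can't say what exactly is on that line without checking the file, but the usual pattern with these monkey patches is to apply the patch before the model is constructed, so that later calls to `gradient_checkpointing_enable` (from the Trainer or PEFT) hit the patched version. The names in this sketch are placeholders, not the real identifiers in EasyContext:

```python
# Illustrative only: the import path and helper name are assumptions, not the
# actual identifiers in EasyContext's train.py.
from transformers import AutoModelForCausalLM

from sloth_activation import apply_sloth_gradient_checkpointing_patch  # hypothetical helper

# Patch gradient_checkpointing_enable BEFORE the model is created, so anything
# that enables checkpointing later dispatches to the sloth implementation.
apply_sloth_gradient_checkpointing_patch()

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")  # placeholder model id
model.gradient_checkpointing_enable()
```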