Open tytung2020 opened 1 year ago
Are these lines of code what is needed to make it work? cekal's amendment seems to work on the 7b version: https://huggingface.co/cekal/mpt-7b-peft-compatible/commit/a5eab52c1c61c1d50a4e01428949f6ff90c73c48 However, I am not sure whether it works fully as intended. Could someone at MosaicML check this? If it does, please also implement it in the 30b version. Thanks~
Fine-tuning mpt-7b and mpt-30b with QLoRA gives the error "ValueError: MPTForCausalLM does not support gradient checkpointing." Is there a way to fix this?
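
For context, here is a minimal sketch of where an error with this wording can come from. It assumes the message is raised by a class-level flag check like the one in older versions of the transformers `PreTrainedModel.gradient_checkpointing_enable()`. The classes below are toy stand-ins written for illustration (`ToyPreTrainedModel`, `PatchedMPTForCausalLM` and `try_enable` are hypothetical names), not the real MPT or transformers code, and this does not describe what the linked commit actually changes.

```python
class ToyPreTrainedModel:
    """Hypothetical stand-in for the base class; not the real transformers class."""

    # Model classes opt in by overriding this flag with True.
    supports_gradient_checkpointing = False

    def __init__(self):
        self.gradient_checkpointing = False

    def _set_gradient_checkpointing(self, value):
        # A real model would switch its blocks to checkpointed forward passes here.
        self.gradient_checkpointing = value

    def gradient_checkpointing_enable(self):
        # Assumed check: refuse when the class has not declared support.
        if not self.supports_gradient_checkpointing:
            raise ValueError(
                f"{type(self).__name__} does not support gradient checkpointing."
            )
        self._set_gradient_checkpointing(True)


class MPTForCausalLM(ToyPreTrainedModel):
    """Toy namesake: leaves the flag at False, like the unpatched model code."""


class PatchedMPTForCausalLM(ToyPreTrainedModel):
    """Toy patched variant: declares support so the check passes."""

    supports_gradient_checkpointing = True


def try_enable(model):
    """Return 'enabled' on success, otherwise the error message."""
    try:
        model.gradient_checkpointing_enable()
        return "enabled"
    except ValueError as err:
        return str(err)


print(try_enable(MPTForCausalLM()))
print(try_enable(PatchedMPTForCausalLM()))
```

Note that setting the flag only silences the check. For memory to actually be saved, the model's forward pass also has to run its transformer blocks through something like `torch.utils.checkpoint.checkpoint` when `gradient_checkpointing` is on, which is why it is worth having the patch reviewed.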