kongds / MoRA

MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning
https://arxiv.org/abs/2405.12130
Apache License 2.0

Push to Hub #1

Closed alielfilali01 closed 6 months ago

alielfilali01 commented 6 months ago

Hi all, great paper btw, and it came at an extremely good time for me personally. I just want to ask about the possibility of pushing the adapter to the Hub: is it supported in your code base? Also, for the pretraining code, what does the 250m mean in --pretrain 250m? And since I see no column mapping, I believe the code expects data with a text column only. If you can answer/confirm these points I would greatly appreciate it. Again, great work, congratulations!

kongds commented 6 months ago

Hello, thank you for your interest in our work.

It is possible to push our adapter to the Hub, but loading it afterwards requires our modified PEFT library.

Regarding the pretraining, 250M refers to the number of parameters in the model. We follow the ReLoRA preprocessing method for the C4 dataset (which can be found in its repository) and directly load the tokenized dataset in our training script.
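To make the "250M = number of parameters" point concrete, here is a back-of-envelope parameter count for a LLaMA-style decoder in the ~250M class. The config numbers below (vocab size, hidden size, layer count, FFN size) are purely hypothetical stand-ins, not the actual ReLoRA/MoRA 250m config:

```python
# Rough transformer parameter count for a hypothetical LLaMA-style config.
# Ignores norm weights and biases, which are negligible at this scale.

def approx_param_count(vocab_size, d_model, n_layers, d_ffn, tied_embeddings=True):
    embed = vocab_size * d_model                    # token embedding table
    attn = 4 * d_model * d_model                    # Q, K, V, O projections
    mlp = 3 * d_model * d_ffn                       # SwiGLU: gate, up, down
    head = 0 if tied_embeddings else vocab_size * d_model  # untied lm_head
    return embed + n_layers * (attn + mlp) + head

# Hypothetical shape: 32k vocab, d=1024, 18 layers, SwiGLU ffn=2816
total = approx_param_count(32000, 1024, 18, 2816)
print(f"{total / 1e6:.0f}M parameters")  # → 264M parameters
```

The actual config used by the repository may differ; the point is only that the 250m flag names a model size, not a dataset size.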

alielfilali01 commented 6 months ago

Thanks for the quick response. I'm just curious what exactly you mean by "it requires our PEFT library to load it". Is there an argument I can pass to the training args, like --hf_token "xxx" --push_to_hub true --push_private true? Also, I'm now thinking about merging the adapters back into the base model. At first I thought it shouldn't be a problem, but now I realize that merge_and_unload() expects A and B matrices, so will it have problems with MoRA? I still think it should be feasible, but I would love a confirmation, assuming you have already done it. Also, does the current script support a --merge_adapter argument? About the second point, I believe it is okay if I pass an untokenized C4-like dataset and the script tokenizes it, right? At least that's how I think it is done by the original PEFT...

kongds commented 6 months ago

We support merging (we have modified merge_and_unload to merge correctly) and loading MoRA via our PEFT fork. However, the training script we provide does not currently support pushing to the Hub or merging via flags (sorry about that). You may add the corresponding code yourself or use other training scripts to achieve this. To load a pushed MoRA adapter from the Hub, you need our PEFT library, installed via pip install -e ./peft-mora.
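For context on why the stock merge needed modification: standard merge_and_unload folds a low-rank delta B @ A into the frozen base weight, whereas MoRA's update is a square matrix applied through compression/decompression operators rather than two low-rank factors. A minimal NumPy sketch of the standard LoRA merge (the names and scaling are illustrative, not MoRA's actual code):

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r, alpha = 8, 8, 2, 4

W = rng.normal(size=(d_out, d_in))   # frozen base weight
A = rng.normal(size=(r, d_in))       # LoRA down-projection
B = rng.normal(size=(d_out, r))      # LoRA up-projection (trained)

scale = alpha / r
W_merged = W + scale * (B @ A)       # what merge_and_unload folds in

# Adapter-at-inference and the merged weight give the same output
x = rng.normal(size=(d_in,))
assert np.allclose(W @ x + scale * (B @ (A @ x)), W_merged @ x)
print("merged weight matches adapter output")
```

MoRA's fork keeps this interface but replaces the B @ A delta with its own reconstruction of the high-rank update, which is why the unmodified PEFT cannot merge or even load a MoRA checkpoint.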

For the second point, we only support loading pre-tokenized C4 datasets, because tokenizing them can take a lot of time (and I don't think the original PEFT handles tokenization either). To tokenize, we use this script from ReLoRA.
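The workflow being described is a "tokenize once, reuse many times" preprocessing pattern. A toy stand-in below illustrates the shape of it; the real ReLoRA script uses the Hugging Face datasets and tokenizers libraries rather than this whitespace tokenizer and JSONL file:

```python
# Toy sketch of pre-tokenizing a text-column corpus once, then loading the
# saved ids at training time. Everything here is a stand-in for the real
# ReLoRA preprocessing script.
import json, os, tempfile

def toy_tokenize(text, vocab):
    # whitespace "tokenizer": unknown words map to id 0
    return [vocab.get(w, 0) for w in text.split()]

vocab = {"the": 1, "quick": 2, "brown": 3, "fox": 4}
corpus = [{"text": "the quick brown fox"}, {"text": "the fox"}]

# One-time preprocessing pass: tokenize and write ids to disk
path = os.path.join(tempfile.mkdtemp(), "c4_tokenized.jsonl")
with open(path, "w") as f:
    for row in corpus:
        f.write(json.dumps({"input_ids": toy_tokenize(row["text"], vocab)}) + "\n")

# Training runs just load the pre-tokenized ids, skipping tokenization
with open(path) as f:
    tokenized = [json.loads(line)["input_ids"] for line in f]
print(tokenized)  # → [[1, 2, 3, 4], [1, 4]]
```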

alielfilali01 commented 6 months ago

Ok, I see. Thank you, you have been really helpful in giving me a bigger picture of the current state of the code. I will close this issue now and maybe re-open it later if I have any further questions... Thanks again 🤗