Closed: hwijeen closed this issue 4 days ago.
can you share your implementation? thanks!
@hwijeen +1
@hwijeen +1
Marking as stale. No activity in 60 days. Remove stale label or comment or this will be closed in 7 days.
Hi, sorry for the delayed response. I won't be able to make a PR since I no longer have access to the code; it was from my previous job. I'd be happy to work on this together as a side project if people are still interested.
@alex-ht @BrightXiaoHan @wsh2836741 Could you share your use cases? Are you trying to peft-tune GPT models for conditional generation tasks?
And may I ask what made you reopen this issue @jon-barker? Are you working on this?
hi @hwijeen
Are you trying to peft-tune GPT models for conditional generation tasks?
yes.
I found this tutorial, and it seems that NeMo can inject LoRA adapters into Megatron-LM models. https://github.com/NVIDIA/NeMo/blob/main/tutorials/nlp/lora.ipynb
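For anyone who just wants the core idea rather than the full NeMo setup: "injecting" a LoRA adapter essentially means wrapping an existing linear projection with a frozen base plus a trainable low-rank update. A minimal, illustrative PyTorch sketch (the module path in the usage comment is made up and this is not the NeMo or Megatron-LM API) could look like this:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wraps a frozen nn.Linear and adds a trainable low-rank (LoRA) update."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # freeze the original projection
        self.lora_a = nn.Linear(base.in_features, rank, bias=False)
        self.lora_b = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.lora_b.weight)   # adapter starts as a no-op
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.lora_b(self.lora_a(x)) * self.scaling

# Hypothetical usage: wrap one attention projection of a loaded model.
# attn = model.decoder.layers[0].attention          # made-up module path
# attn.qkv_proj = LoRALinear(attn.qkv_proj, rank=8)
```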
Marking as stale. No activity in 60 days.
It would be great if Megatron-LM could support PEFT methods, e.g. QLoRA. We're sorely lacking a PEFT trainer with Tensor Parallelism.
Marking as stale. No activity in 60 days.
Hi, thank you for the great library.
Recently, many algorithms have been proposed to replace full fine-tuning, which is too costly with huge models like GPT-3. Examples include P-tuning and LoRA. I personally implemented both on top of Megatron-LM and was able to achieve SOTA accuracy on a number of Korean benchmark datasets (with model sizes ranging from 300M to 82B parameters).
How about supporting algorithms like the above? I think it would be a natural extension of the current --fine-tune option, and a big plus for the practicality of huge models.
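For readers unfamiliar with the P-tuning side of this request, here is a minimal, illustrative sketch of the idea (not code from the implementation mentioned above; the class name and defaults are made up): trainable "virtual token" embeddings are prepended to the word embeddings while the GPT backbone stays frozen.

```python
import torch
import torch.nn as nn

class SoftPromptEmbedding(nn.Module):
    """Prepends trainable 'virtual token' embeddings to the word embeddings,
    which is the core idea behind P-tuning / prompt tuning."""

    def __init__(self, word_embeddings: nn.Embedding, num_virtual_tokens: int = 20):
        super().__init__()
        self.word_embeddings = word_embeddings   # backbone embeddings, kept frozen
        hidden_size = word_embeddings.embedding_dim
        self.soft_prompt = nn.Parameter(torch.randn(num_virtual_tokens, hidden_size) * 0.02)

    def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
        tokens = self.word_embeddings(input_ids)                  # [batch, seq, hidden]
        prompt = self.soft_prompt.unsqueeze(0).expand(tokens.size(0), -1, -1)
        return torch.cat([prompt, tokens], dim=1)                 # [batch, prompt + seq, hidden]

# Only `soft_prompt` receives gradients during training; attention masks and
# position ids must also be extended by `num_virtual_tokens` in the training loop.
```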