Closed ryankert01 closed 1 month ago
Any progress?
I'll open a PR by the weekend.
Hi @ByronHsu, I just noticed that the Hugging Face Llama model is mapped by its base model, and Yi-Coder has its base model configured, so I think we may not need a code change. I'll test shortly whether it works. (Not sure if I'm wrong.)
ref:
UPDATE: got it. Looks like it'll soon be solved by https://github.com/linkedin/Liger-Kernel/pull/199
Hi @ByronHsu, I just did the research and found something odd: when I only configure SFTConfig with use_liger=True, the GPU usage is the same as without Liger, but if I load the model with
model = AutoLigerKernelForCausalLM.from_pretrained(model_name)
it's significantly better. That doesn't match the SFTTrainer docs on Hugging Face.
Could you help me look into it? research notebook
@ryankert01 Thanks for the comment. #199 is ready and should be incorporated soon. Right now, SFTConfig doesn't actually do anything with the use_liger flag unless you pass in a model path (in which case it loads the model using AutoLigerKernelForCausalLM) rather than an already instantiated model. After this change, SFTTrainer will need to be updated to call the new API.
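To make the behavior described in the comment above concrete, here is a rough, dependency-free sketch of the dispatch. It is not the actual TRL or Liger-Kernel code: `FakeModel`, `load_plain`, `load_liger`, and `resolve_model` are hypothetical stand-ins, and the only thing taken from the thread is the rule that the flag matters when a model path is passed and is ignored for an already instantiated model.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class FakeModel:
    """Hypothetical stand-in for a loaded model; records which loader built it."""
    name: str
    liger_patched: bool


def load_plain(path: str) -> FakeModel:
    # Stand-in for a regular loader such as AutoModelForCausalLM.from_pretrained(path).
    return FakeModel(name=path, liger_patched=False)


def load_liger(path: str) -> FakeModel:
    # Stand-in for AutoLigerKernelForCausalLM.from_pretrained(path).
    return FakeModel(name=path, liger_patched=True)


def resolve_model(model: Union[str, FakeModel], use_liger: bool) -> FakeModel:
    """Hypothetical trainer-side dispatch mirroring the behavior described in the thread."""
    if isinstance(model, str):
        # A path was passed, so the trainer does the loading and can honor the flag.
        return load_liger(model) if use_liger else load_plain(model)
    # An already instantiated model is used as-is, so use_liger has no effect here.
    return model


from_path = resolve_model("some/model", use_liger=True)
prebuilt = resolve_model(load_plain("some/model"), use_liger=True)
print(from_path.liger_patched, prebuilt.liger_patched)  # True False
```

Under these assumptions, this would explain the notebook result: passing an already built model with use_liger=True leaves it unpatched, while loading through AutoLigerKernelForCausalLM (or passing a path) gives the patched model.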
🚀 The feature, motivation and pitch
to-dos:
Alternatives
No response
Additional context
From the Discord discussion.