togethercomputer / OpenChatKit

Apache License 2.0

add lowrank training + inference #110

Closed xzyaoi closed 1 year ago

orangetin commented 1 year ago

@xzyaoi This looks good! However, I was wondering whether abstracting the code would make it easier to add new models or use different datasets.

Here's a comprehensive example, modified from a HuggingFace example, that also implements 8-bit loading: https://github.com/orangetin/OpenChatKit/tree/peft/training/lora

Edit: The above also allows for optionally pushing to HF.
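(For readers unfamiliar with what 8-bit loading buys you: the rough idea is symmetric int8 quantization, storing weights as int8 values plus a scale and dequantizing on the fly. A toy sketch of that idea follows; the names are illustrative, not the actual bitsandbytes implementation.)

```python
# Toy sketch of symmetric int8 quantization: w ≈ q * scale, q in [-127, 127].
# Storing q (1 byte) instead of a float32 weight cuts memory roughly 4x.

def quantize_8bit(weights):
    """Return int8-range codes and a per-tensor scale."""
    scale = max(abs(w) for w in weights) / 127
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize_8bit(q, scale):
    """Recover approximate float weights from codes and scale."""
    return [qi * scale for qi in q]

w = [0.4, -1.27, 0.08]
q, s = quantize_8bit(w)
w_hat = dequantize_8bit(q, s)
```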

xzyaoi commented 1 year ago

@orangetin this looks cool! I'd propose we keep both.

The intention of this PR is to create a minimal example that helps users get started, so it is meant to be short and simple, under 100 LoC, whereas your fork is more powerful and more comprehensive.

Does this sound good to you? :) Many thanks for pointing it out!
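(For readers skimming the thread: the core of such a minimal LoRA example is just a frozen base weight plus a trainable low-rank update, scaled by alpha / r. A framework-free toy sketch of the math, with illustrative names:)

```python
# LoRA forward pass: y = x @ (W + (alpha / r) * A @ B), computed as the
# frozen base path plus a cheap rank-r path. A is d_in x r, B is r x d_out.

def matmul(X, Y):
    """Plain-Python matrix multiply for small toy matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def lora_forward(x, W, A, B, alpha, r):
    base = matmul(x, W)                  # frozen pretrained path
    low = matmul(matmul(x, A), B)        # trainable rank-r path
    s = alpha / r                        # LoRA scaling factor
    return [[b + s * l for b, l in zip(br, lr)]
            for br, lr in zip(base, low)]

y = lora_forward([[1.0, 2.0]],
                 [[1.0, 0.0], [0.0, 1.0]],   # W (identity, frozen)
                 [[1.0], [1.0]],             # A (2 x 1, r = 1)
                 [[1.0, 1.0]],               # B (1 x 2)
                 alpha=1, r=1)
# y == [[4.0, 5.0]]
```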

orangetin commented 1 year ago

> The intention of this PR is to create a minimal example that helps users get started, so it is meant to be short and simple, under 100 LoC, whereas your fork is more powerful and more comprehensive.

@xzyaoi Makes sense :)

> Does this sound good to you? :) Many thanks for pointing it out!

Sounds good!

madroidmaq commented 1 year ago

How do I merge the LoRA model into the base model?
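(In case it helps later readers: with the peft library this is what calling `merge_and_unload()` on a `PeftModel` does — it folds the adapter into the base weights so inference needs no extra adapter path. The arithmetic being performed is W_merged = W + (alpha / r) * A @ B; a framework-free toy sketch with illustrative names:)

```python
# Merging a LoRA adapter: fold the scaled low-rank product into the base
# weight once, so inference is a single plain matmul afterwards.

def matmul(X, Y):
    """Plain-Python matrix multiply for small toy matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def merge_lora(W, A, B, alpha, r):
    """W_merged = W + (alpha / r) * (A @ B)."""
    s = alpha / r
    delta = matmul(A, B)                 # low-rank update, d_in x d_out
    return [[w + s * d for w, d in zip(wr, dr)]
            for wr, dr in zip(W, delta)]

W_merged = merge_lora([[1.0, 0.0], [0.0, 1.0]],   # W (identity)
                      [[1.0], [1.0]],             # A (2 x 1, r = 1)
                      [[1.0, 1.0]],               # B (1 x 2)
                      alpha=2, r=1)
# W_merged == [[3.0, 2.0], [2.0, 3.0]]
```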