pizzato / LLMMe

Create your own LLM and use it to respond to your email (Gmail only for now)
Apache License 2.0
66 stars, 13 forks

recommendation about options other than h2o? #2

koutkout closed 11 months ago

koutkout commented 11 months ago

Hi, is there any "cloud" service other than h2o studio that lets one train a model? Which model do you currently recommend as best for answering emails? I guess OPT is old by now, so maybe you can recommend something better? Thanks a lot. Hesham

pizzato commented 11 months ago

Any tool or framework that supports fine-tuning should work, as long as the resulting model works with the transformers library and (right now) AutoModelForCausalLM and AutoTokenizer.
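As a rough way to check that requirement, the sketch below tries to load a fine-tuned model directory with AutoTokenizer and AutoModelForCausalLM. The function name `can_load_with_transformers` and the example directory name are hypothetical and not part of this repo; the check only confirms that the files load, not that the replies are any good.

```python
def can_load_with_transformers(model_dir: str) -> bool:
    """Return True if model_dir loads as a causal LM with its tokenizer.

    Hypothetical helper for illustration only; it is not part of LLMMe.
    """
    try:
        # Imported lazily so a missing transformers install is reported as False.
        from transformers import AutoModelForCausalLM, AutoTokenizer

        # local_files_only=True keeps the check offline: only files already on disk count.
        AutoTokenizer.from_pretrained(model_dir, local_files_only=True)
        AutoModelForCausalLM.from_pretrained(model_dir, local_files_only=True)
    except Exception:
        # Missing files, an unsupported architecture, or a missing backend all mean "no".
        return False
    return True


if __name__ == "__main__":
    # "./my-finetuned-model" is a placeholder for wherever your tool exported the model.
    print(can_load_with_transformers("./my-finetuned-model"))
```

If this returns True for the directory your fine-tuning tool exported, the model should at least be loadable the same way this project loads models right now.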

The main thing about a model is how you can run it. You could fine-tune a Llama 70B, but can you run it? Is it worth running? Also, you probably don't need all the bells and whistles of the latest models. This is one model for a single task on a single data set (emails), so fine-tuning a very large model seems overkill.
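To put rough numbers on the "can you run it?" question, here is a back-of-the-envelope sketch that estimates the memory needed just to hold a model's weights. It assumes memory is roughly parameter count times bytes per parameter and ignores activations, the KV cache, and framework overhead, so real usage will be higher. The function name and the parameter counts are illustrative approximations.

```python
# Bytes needed to store one parameter at common precisions.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def weight_memory_gb(num_params: float, dtype: str = "float16") -> float:
    """Approximate memory (in GB, 1 GB = 1e9 bytes) to hold the weights alone."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    # Approximate parameter counts, for illustration only.
    models = {
        "opt-125m": 125e6,
        "2.7B model": 2.7e9,
        "70B model": 70e9,
    }
    for name, num_params in models.items():
        print(f"{name}: ~{weight_memory_gb(num_params):.2f} GB in float16")
```

By this estimate a 70B model needs on the order of 140 GB for float16 weights alone, while a 125M model needs about a quarter of a gigabyte, which is why the smaller models are the practical starting point for a single-task email model.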

I tried facebook/opt-125m and it did not seem to perform that well, but some of my models with 2.7B parameters weren't that great either. I feel it's trial and error.