rowheat02 / osm-gpt

https://osm-gpt.rohitgautam.com.np
MIT License
143 stars 26 forks

[Discussion]: Usage of Open Source LLM? #1

Open kshitijrajsharma opened 1 year ago

kshitijrajsharma commented 1 year ago

I liked your project. There is a similar project, TRIDENT (https://trident.yuiseki.net/ , https://github.com/yuiseki/TRIDENT), which uses GPT-3 and the Overpass API, and it's pretty great actually.

As we know, the charges for API usage will be massive, so I was thinking we could leverage the power of open-source LLMs and train them. Do you have any ideas to share on this topic? I will be happy to contribute! Since the project is still in development, using an open-source LLM with training data could give us a good base. I wanted to hear your thoughts on this.
cc: @rowheat02

konishon commented 1 year ago

I really like this idea

rowheat02 commented 1 year ago

@kshitijrajsharma Thank you for showing interest and offering to contribute! I really appreciate your enthusiasm. Your idea of using open-source LLMs to save on API costs is excellent. We can explore options like Meta's Llama2, a recently released LLM, or other advanced models.

I'd love to hear your ideas and suggestions on how we can acquire a sufficient amount of OSM query data to train the model effectively. Please share any potential sources or data collection methods you think would be beneficial. Let's collaborate and make this project cost-effective!

kshitijrajsharma commented 1 year ago

We can get enough queries; that won't be a problem. In the end, each training example is just a pair of natural-language keywords and an Overpass query! For the model itself, my initial thoughts:

Llama, released by Meta: https://ai.meta.com/llama/ I haven't tested it intensively yet, but it looks like a promising model, as they advertise. They say it's free for research and commercial use, but I am concerned about the license; a lot of people say it's not open source: https://blog.opensource.org/metas-llama-2-license-is-not-open-source/ I have tried their LLaMA2-70B model. It is not as good as ChatGPT and produces random queries. I wonder whether this could work if the model is retrained.

Another option is StableLM: https://github.com/Stability-AI/StableLM

And there is this Python binding for llama.cpp, which can be used as an API: https://github.com/abetlen/llama-cpp-python
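For anyone who wants to try the llama-cpp-python route, here is a minimal sketch of how a plain-language request could be turned into a prompt and sent to a local model. The prompt wording, the helper names (`build_prompt`, `extract_text`, `generate_query`), and the `max_tokens` and `stop` values are assumptions for illustration, not something this project uses; `model_path` must point to a model file you have downloaded yourself.

```python
from typing import Any, Dict


def build_prompt(request: str) -> str:
    """Wrap a plain-language request in a fixed instruction (wording is an assumption)."""
    return (
        "Translate the request into an Overpass QL query. "
        "Reply with the query only.\n"
        f"Request: {request}\n"
        "Query:"
    )


def extract_text(completion: Dict[str, Any]) -> str:
    """Pull the generated text out of an OpenAI-style completion dict."""
    return completion["choices"][0]["text"].strip()


def generate_query(request: str, model_path: str) -> str:
    """Run one completion against a local model file via llama-cpp-python."""
    # Imported lazily so the helpers above work without llama-cpp-python installed.
    from llama_cpp import Llama

    llm = Llama(model_path=model_path)
    completion = llm(build_prompt(request), max_tokens=256, stop=["Request:"])
    return extract_text(completion)
```

Calling `generate_query("cafes in Kathmandu", model_path=...)` would return whatever the model writes after `Query:`; as noted above, an untuned model may well produce a random query.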

kshitijrajsharma commented 1 year ago

Update on Llama model training:

Here is a sample dataset that can be used to train Llama with RLHF:

https://huggingface.co/datasets/HuggingFaceH4/stack-exchange-preferences

Now we need a similar training dataset prepared for Overpass questions and queries! There will be some challenges: we may have to search for training data, and retraining or experimenting with the model needs a massive GPU and machine. I tried a demo on Colab but couldn't get through on the free version.
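To make the dataset idea concrete, here is a rough sketch of what one record for Overpass questions might look like, written out as JSON Lines. The `prompt` / `chosen` / `rejected` field names are an assumed pairwise layout for preference-style training, not the schema of the dataset linked above, and the example queries are hand-written for illustration.

```python
import json
from typing import Dict, List

REQUIRED_KEYS = ("prompt", "chosen", "rejected")

# Illustrative record only; the queries are hand-written, not collected data.
EXAMPLES: List[Dict[str, str]] = [
    {
        "prompt": "cafes in Kathmandu",
        "chosen": (
            '[out:json][timeout:25];\n'
            'area["name"="Kathmandu"]->.a;\n'
            'node["amenity"="cafe"](area.a);\n'
            'out body;'
        ),
        # A plausible but wrong answer: it searches for restaurants, not cafes.
        "rejected": (
            '[out:json][timeout:25];\n'
            'area["name"="Kathmandu"]->.a;\n'
            'node["amenity"="restaurant"](area.a);\n'
            'out body;'
        ),
    },
]


def is_valid(record: Dict[str, str]) -> bool:
    """A record is usable only if every required field is a non-empty string."""
    return all(
        isinstance(record.get(key), str) and record[key].strip()
        for key in REQUIRED_KEYS
    )


def to_jsonl(records: List[Dict[str, str]]) -> str:
    """Serialize valid records as JSON Lines, one record per line."""
    return "\n".join(
        json.dumps(record, ensure_ascii=False)
        for record in records
        if is_valid(record)
    )
```

Community-contributed pairs could be appended to `EXAMPLES` in the same shape and filtered through `is_valid` before training.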

Two references:
https://lightning.ai/pages/community/tutorial/accelerating-llama-with-fabric-a-comprehensive-guide-to-training-and-fine-tuning-llama/
https://huggingface.co/blog/stackllama

It looks like it can run on 8 GB of GPU memory, which is good, since a standard personal computer nowadays has that much, but a solid GPU and a training dataset are still needed. The training dataset is something we can generate by asking the community and bootstrapping from Overpass query examples. The challenge is the machine.
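As a rough sanity check on the 8 GB figure, here is a back-of-the-envelope estimate of the memory taken by the model weights alone, assuming the 7-billion-parameter Llama 2 variant (the thread does not say which size was meant). It ignores activations, the KV cache, and any optimizer state needed for training, so real usage will be higher.

```python
def weight_memory_gb(n_params: float, bits_per_param: int) -> float:
    """Memory for the model weights alone, in GB (10**9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


# Assumed model size: 7 billion parameters.
for bits in (16, 8, 4):
    print(f"7B parameters at {bits}-bit: {weight_memory_gb(7e9, bits):.1f} GB")
```

Under these assumptions, only the 8-bit and 4-bit cases leave the weights at or below 8 GB, which is consistent with the tutorials relying on reduced precision or parameter-efficient fine-tuning.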

rowheat02 commented 1 year ago

Thank you @kshitijrajsharma for sharing the sample dataset and references for training LLama with RLHF, as well as the insight into the challenges ahead. The stack-exchange-preferences dataset seems valuable, and it's promising that LLama can potentially run on an 8GB GPU. Although generating a similar training dataset for overpass questions will require effort, we are now equipped with the knowledge to proceed. I appreciate the progress made so far and believe we are ready to collect data for testing fine-tuning. Collaboration within the community will be vital in overcoming the challenge and achieving success. Let's continue to build on these findings and unlock LLama's full potential!

kshitijrajsharma commented 1 year ago

Yes! There is an option for collecting data from within the tool itself, though we need to check the licensing for it. For now I am looking at training datasets; let me know if you find any options.

orkutmuratyilmaz commented 9 months ago

Hello, and thanks for this beautiful repo :)

If you'd like to integrate a self-hosted solution, please consider Ollama.

Best, Orkut
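For reference, a minimal sketch of what calling a locally running Ollama server could look like, using only the Python standard library. It assumes Ollama is listening on its default port and that a model named `llama2` has already been pulled; the prompt wording and the helper names (`build_payload`, `ask_ollama`) are assumptions for illustration.

```python
import json
import urllib.request

# Ollama's default local endpoint for single-shot generation.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(request: str, model: str = "llama2") -> bytes:
    """Build the JSON request body; the prompt wording is an assumption."""
    body = {
        "model": model,
        "prompt": (
            "Translate the request into an Overpass QL query. "
            f"Reply with the query only.\nRequest: {request}\nQuery:"
        ),
        "stream": False,  # ask for one JSON object instead of a token stream
    }
    return json.dumps(body).encode("utf-8")


def ask_ollama(request: str, model: str = "llama2") -> str:
    """Send the request to the local server and return the generated text."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=build_payload(request, model),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

With the server running, `ask_ollama("cafes in Kathmandu")` would return the model's raw text, which still needs to be checked before it is sent to the Overpass API.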