Open-Source-Chandigarh / sadakAI

Personalized Roadmaps for Your Software Development Journey

Improve data caching for local-llm #16

Closed bakayu closed 3 weeks ago

bakayu commented 1 month ago

Solved by #19. Currently it takes a lot of time to start the model; implement caching to improve loading speed.

PS: please create a PR against the local-llm branch, not the main branch.

hriteshMaikap commented 1 month ago

Hey! I am interested in working on this! I downloaded the phi 1.5 model from settup.ipynb and ran sample queries as well (executed python main.py). A few questions:

  1. What kind of questions can it answer? I tried something from the train.json file and it did answer correctly. This will help me test a few queries.
  2. By caching, what exactly do you mean, and where do you want it implemented? I am assuming you want to implement caching in the main.py file itself, but then I guess the first query will always take time because we are loading the LLM using litgpt. After that, maybe caching can be implemented from the second query onwards. Your elaboration will help me plan out my idea better! Looking forward to this contribution.
bakayu commented 1 month ago
> 1. What kind of questions can it answer? I tried something from the train.json file and it did answer correctly. This will help me test a few queries.

I noticed that the data provided in train.json is irrelevant to what we need the model to do (I probably forgot to remove it from the codebase during the experimental stage). I will look into this and probably add a README section clarifying what we want the model to do. What we actually want is for it to answer technical roadmap questions: for example, a user can ask how to start with frontend or backend development and get recommended steps for learning the technologies involved. We will use the data from roadmap.sh for this.

> 2. By caching, what exactly do you mean, and where do you want it implemented? I am assuming you want to implement caching in the main.py file itself, but then I guess the first query will always take time because we are loading the LLM using litgpt. After that, maybe caching can be implemented from the second query onwards. Your elaboration will help me plan out my idea better! Looking forward to this contribution.

Please look into the local-llm branch. I first wanted to fine-tune a model to serve our purpose, but then realised that knowledge-based embedding is the optimal solution for our problem. In main_local/main_local.py, it currently uses FAISS as a vector database to store the embeddings and then compares the prompt against them to return the output; this is much more reliable for our use case and does not require the model to be fine-tuned on our data. Here I want those embeddings to be cached, so that whenever the model is launched again it does not need to recompute them. I am sure there is an easy way to implement this, either via caching or simply saving the generated FAISS db.
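A minimal sketch of the "save the generated FAISS db" idea, assuming the embeddings are plain float32 vectors and the raw faiss package is used; the file name `roadmap_index.faiss` and the `embed_fn` / `documents` arguments are hypothetical placeholders, not names from main_local.py:

```python
import os

import faiss          # pip install faiss-cpu
import numpy as np

# Hypothetical cache location; not an existing file in the repo.
INDEX_PATH = "roadmap_index.faiss"


def load_or_build_index(embed_fn, documents):
    """Return a FAISS index, reusing the on-disk copy when it exists."""
    if os.path.exists(INDEX_PATH):
        # Cached: skip re-embedding the knowledge base entirely.
        return faiss.read_index(INDEX_PATH)

    # First launch: embed every document (the slow part) and build the index.
    vectors = np.asarray([embed_fn(doc) for doc in documents], dtype="float32")
    index = faiss.IndexFlatL2(vectors.shape[1])
    index.add(vectors)

    # Persist so the next startup can just read it back.
    faiss.write_index(index, INDEX_PATH)
    return index
```

If main_local.py builds the store through a wrapper such as LangChain instead, its FAISS save/load helpers would serve the same purpose; either way, the embeddings are only computed on the first launch and read from disk afterwards.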

As of now, we want local-llm to reach a stable state and merge it into the main branch (entirely shifting the focus from fine-tuning to knowledge-based embedding).

If you want to know more about what I plan on doing with this project, please reach out to me on Discord (link in my profile).

hriteshMaikap commented 1 month ago

Okay great! A small mistake on my end was not checking the code in the local-llm branch. I am familiar with using vector embeddings and I am sure I can contribute something valuable.