Hi,
I want to add support for open-source LLMs such as Llama 2, using LangChain. The model would be self-hosted locally and would not require a GPU to run, although a GPU would make it reply faster. The goal is to reduce the cost of the ChatGPT API. What do you think of the idea?
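To make the idea a bit more concrete, here is a minimal sketch of how the backend could be made switchable, assuming the project calls the model through a single "prompt in, reply out" function. All names here (`LLMRouter`, `fake_local_llama`, `fake_hosted_api`) are hypothetical and not part of this project or of LangChain; the two stand-in functions only mark where a real local model wrapper and the existing hosted API call would go.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

# A backend is any callable that maps a prompt string to a reply string.
Backend = Callable[[str], str]


@dataclass
class LLMRouter:
    """Hypothetical helper: sends each prompt to a named backend."""

    backends: Dict[str, Backend]
    default: str = "local"  # assumption: prefer the self-hosted model

    def complete(self, prompt: str, backend: Optional[str] = None) -> str:
        # Fall back to the default backend when none is requested.
        name = backend or self.default
        if name not in self.backends:
            raise KeyError(f"unknown backend: {name}")
        return self.backends[name](prompt)


# Stand-ins for illustration only. In a real setup these would wrap
# a locally hosted Llama 2 model and the hosted ChatGPT API call.
def fake_local_llama(prompt: str) -> str:
    return f"[local] {prompt}"


def fake_hosted_api(prompt: str) -> str:
    return f"[hosted] {prompt}"


router = LLMRouter(backends={"local": fake_local_llama, "hosted": fake_hosted_api})
print(router.complete("Hello"))                    # uses the local backend
print(router.complete("Hello", backend="hosted"))  # explicit override
```

With something like this, the local model could be the default and the paid API would only be used when explicitly requested, which is where the cost saving would come from.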