lawyinking opened this issue 5 months ago
Anything on this? Can we use quantized local models?
What quantized models are you interested in using? In general, as long as you can spin up an OpenAI-compliant web server endpoint from your model, you can integrate it into lida.

Most local model runners have tools that can expose an OpenAI-compliant API, including llama.cpp.

Once that OpenAI-compliant API is running, you can use it with lida directly (see the llmx API docs).
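To make "OpenAI-compliant" concrete, here is a minimal stdlib-only sketch of the request shape such a server must accept at `/v1/chat/completions`. The base URL and model name are assumptions for a hypothetical local llama.cpp-style server; no request is actually sent.

```python
# Sketch: any server accepting this request shape can back lida.
# BASE_URL and the model name are assumptions for a local server.
import json
import urllib.request

BASE_URL = "http://localhost:8080/v1"  # hypothetical local endpoint

payload = {
    "model": "local-model",  # many local servers ignore this field
    "messages": [{"role": "user", "content": "Say hello"}],
    "temperature": 0.0,
}
req = urllib.request.Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode(),
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer not-needed",  # local servers rarely check keys
    },
)
# urllib.request.urlopen(req) would return an OpenAI-style JSON response;
# it is left commented out here because no server is running.
print(sorted(payload))
```

Any endpoint that answers this shape with an OpenAI-style JSON response should be usable from lida via llmx's `openai` provider.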
In lida, is it possible to load a model already downloaded on a MacBook M2, say Zephyr 7B or Magicoder 7B? I downloaded the models to a Desktop folder, and I would like to load them with a simple method like llama-cpp-python or LangChain's LlamaCpp, rather than the methods lida suggests (llmx loading from Hugging Face, a vLLM server, OpenAI, etc.).
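One common workaround (a sketch, not verified against lida's own docs) is to keep using llama-cpp-python, but through its built-in OpenAI-compatible server rather than loading the model in-process: serve the GGUF file locally, then point lida at that endpoint. The model filename and port below are hypothetical placeholders for your own download.

```python
# Sketch (assumptions: llama-cpp-python is installed and the GGUF path exists).
# Step 1 is run in a terminal; step 2 is how lida would then consume it.
import shlex

MODEL_PATH = "~/Desktop/zephyr-7b-beta.Q4_K_M.gguf"  # hypothetical filename

# Step 1: launch llama-cpp-python's OpenAI-compatible server in a terminal:
server_cmd = f"python -m llama_cpp.server --model {MODEL_PATH} --port 8000"
print(shlex.split(server_cmd))

# Step 2 (sketch, parameter names unverified): lida then talks to it as if
# it were OpenAI:
#   from lida import Manager, llm
#   text_gen = llm(provider="openai",
#                  api_base="http://localhost:8000/v1",
#                  api_key="not-needed")
#   lida = Manager(text_gen=text_gen)
```

This keeps the model loading entirely in llama-cpp-python on the M2 (which supports Metal acceleration for GGUF models), while lida only ever sees a standard OpenAI-style endpoint.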
Thanks!