mudler opened this issue 1 year ago
Hi! I'm a bot running with LocalAI (a crazy experiment of @mudler) - please beware that I might hallucinate sometimes!
but... I can also be funny or helpful :smile_cat: and I can generally give good tips or point you to the right places in the documentation or the code based on what you wrote in the issue.
Don't engage in conversation with me, I don't support replying (yet)!
I apologize for the confusion earlier. It seems you are looking for help with LocalAI, an open-source, self-hosted drop-in replacement for the OpenAI API. I can assist you in finding information and troubleshooting any issues you may have.

Based on your prompt, you are having difficulty with autoloading models in LocalAI: you would like to specify a particular backend for a model and enable prompt caching. To specify a backend, create a model config file in your models directory that names it:
```yaml
name: gpt-3.5-turbo
backend: gptj
```
This tells LocalAI to load the model with the specified backend at startup, instead of probing backends greedily.
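Once the config is in place, requests address the model by its `name`. A minimal usage sketch, assuming LocalAI is running on its default port 8080:

```bash
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```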
To enable prompt caching, set the `prompt_cache_path` and `prompt_cache_all` options in the same config file. For example:
```yaml
prompt_cache_path: "cache"
prompt_cache_all: true
```
This creates a `cache` directory inside the models directory where prompts are cached for faster processing. If you have any further questions or issues, feel free to ask!
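Putting both together, a minimal sketch of a complete model config might look like the following; the file name and the `parameters.model` value are illustrative assumptions, not taken from this issue:

```yaml
# models/gpt-3.5-turbo.yaml (hypothetical file name)
name: gpt-3.5-turbo
backend: gptj               # pin the backend instead of relying on autoloading
parameters:
  model: ggml-gpt4all-j.bin # assumed model file in the models directory
prompt_cache_path: "cache"  # relative to the models directory
prompt_cache_all: true      # cache prompts for all requests
```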
**Is your feature request related to a problem? Please describe.**
Currently, autoloading models happens greedily, causing several issues and a degraded UX.

**Describe the solution you'd like**
The autoloader should look at the magic headers of the model file and identify the appropriate backend to load it with (a sketch of this idea follows below).

**Describe alternatives you've considered**
Keep things as is.

**Additional context**
Looking only at the headers might be challenging because quantization formats move fast. I'm not sure whether that has an impact, but better to be careful and check before diving deep into this issue.
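To make the proposed solution concrete, here is a minimal sketch in Go of what magic-header detection could look like. The magic constants, the backend name, and the file path are assumptions for illustration, not LocalAI's actual implementation:

```go
package main

import (
	"encoding/binary"
	"fmt"
	"io"
	"os"
)

// Known 4-byte magics at the start of common model files. These values
// come from the ggml family of formats; treat them as assumptions to be
// verified against the actual loader sources.
const (
	magicGGML = 0x67676d6c // "ggml": legacy ggml files
	magicGGJT = 0x67676a74 // "ggjt": later ggml container revision
)

// detectBackend reads the first four bytes of a model file and guesses
// which backend should load it. Unknown magics return an error so the
// caller can fall back to the current greedy autoloading.
func detectBackend(path string) (string, error) {
	f, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()

	var header [4]byte
	if _, err := io.ReadFull(f, header[:]); err != nil {
		return "", err
	}

	// ggml writes its magic as a little-endian uint32.
	switch magic := binary.LittleEndian.Uint32(header[:]); magic {
	case magicGGML, magicGGJT:
		return "llama", nil // hypothetical backend name
	default:
		return "", fmt.Errorf("unknown magic %#x", magic)
	}
}

func main() {
	backend, err := detectBackend("models/ggml-gpt4all-j.bin") // assumed path
	if err != nil {
		fmt.Println("falling back to greedy autoloading:", err)
		return
	}
	fmt.Println("detected backend:", backend)
}
```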