Open yukiarimo opened 3 months ago
Something like this could work for Ollama. I don't have SD running and I'm less familiar with its API, so I commented out the SD and epub parts, but here's a start:
https://gist.github.com/sqrt10pi/dfca547ee263328cafb55947458a0838
The main annoying part is that Ollama doesn't support `max_tokens`, so I instead time out the request after 5 minutes and try again (this is my `try_until_successful` function). Depending on the model you're using in Ollama, you may be able to use the `num_predict` model option, but I only have dolphin-mixtral installed, which doesn't support it.
Here's an example output using dolphin-mixtral: https://gist.github.com/sqrt10pi/20de2db367a78b63cd7b3719ad19937a (definitely not perfect)
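For anyone who doesn't want to dig through the gist, here's a minimal sketch of the timeout-and-retry idea. The endpoint is Ollama's default `/api/generate`; the model name, token cap, and retry count are just illustrative assumptions, not the gist's exact code:

```python
import json
import socket
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default endpoint

def try_until_successful(prompt, model="dolphin-mixtral",
                         timeout=300, max_attempts=3):
    """Retry generation, abandoning any attempt that runs longer than
    `timeout` seconds -- a crude stand-in for a max_tokens cap."""
    payload = json.dumps({
        "model": model,
        "prompt": prompt,
        "stream": False,
        # num_predict caps generated tokens on models that honor it
        "options": {"num_predict": 512},
    }).encode()
    for attempt in range(1, max_attempts + 1):
        try:
            req = urllib.request.Request(
                OLLAMA_URL, data=payload,
                headers={"Content-Type": "application/json"})
            with urllib.request.urlopen(req, timeout=timeout) as resp:
                return json.loads(resp.read())["response"]
        except (TimeoutError, socket.timeout):
            print(f"attempt {attempt} timed out, retrying")
    raise RuntimeError("all attempts timed out")
```

If your model does honor `num_predict`, the timeout loop becomes mostly a safety net rather than the primary cutoff.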
> ollama doesn't support max_tokens

In that case, I think the KoboldCPP API is a very good alternative to Ollama.
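One concrete reason KoboldCPP helps here: its KoboldAI-compatible endpoint accepts a `max_length` parameter, so you get a hard token cap instead of the timeout workaround. A rough sketch, assuming the default port 5001 and illustrative sampling values:

```python
import json
import urllib.request

KOBOLD_URL = "http://localhost:5001/api/v1/generate"  # KoboldCPP default port

def generate(prompt, max_length=200):
    """Generate text via KoboldCPP's KoboldAI-compatible API.
    max_length is a hard cap on generated tokens, so no retry loop needed."""
    payload = json.dumps({
        "prompt": prompt,
        "max_length": max_length,
        "temperature": 0.7,  # illustrative; tune to taste
    }).encode()
    req = urllib.request.Request(
        KOBOLD_URL, data=payload,
        headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["results"][0]["text"]
```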
I want to use this locally with Ollama and SD (or anything else). Is it possible?