Open easydatawarehousing opened 4 months ago
I confirm the bug with chunks: once too large an input is sent, I get `parse': 451: unexpected token at '' (JSON::ParserError)`, simply because the chunk ends in a way that leaves the line as invalid JSON.
Temperature and seed parameters should be part of 'options'
According to the docs, temperature and seed should be passed inside options.
In the current implementation they are passed at the same level as parameters like 'model'.
Changing the code of Langchain::LLM::Ollama like this works, but it is probably not the best place to implement this.
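To illustrate the nesting the Ollama API expects, here is a minimal sketch of building a chat payload. The helper name `build_ollama_chat_payload` is hypothetical (not langchainrb code); the point is that `temperature` and `seed` go under the `options` key rather than next to `model`:

```ruby
require "json"

# Hypothetical helper (not from langchainrb) showing the nesting the
# Ollama docs describe: sampling parameters such as temperature and seed
# belong inside "options", not at the top level next to "model".
def build_ollama_chat_payload(model:, messages:, temperature: nil, seed: nil)
  payload = { model: model, messages: messages }
  options = {}
  options[:temperature] = temperature if temperature
  options[:seed] = seed if seed
  payload[:options] = options unless options.empty?
  payload
end

payload = build_ollama_chat_payload(
  model: "llama3",
  messages: [{ role: "user", content: "Hi" }],
  temperature: 0.7,
  seed: 42
)
puts JSON.pretty_generate(payload)
```

With this shape, `model` and `messages` stay top-level while the sampling parameters are nested where Ollama looks for them.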
Non-streaming response chunks should be joined before parsing?
I am using Ollama 0.1.45. When requesting a non-streaming response (i.e. not passing a block to the `chat` method) and the response is large (more than ~4000 characters), Ollama will send multiple chunks of data. In the current implementation each chunk is `JSON.parse`'d separately. For smaller responses which fit in a single chunk this is obviously not a problem, but for multiple chunks I need to join all chunks first and then JSON-parse the result. Changing the code of Langchain::LLM::Ollama like this works for me.
The Ollama docs say nothing about this behavior. It might be a bug in Ollama, or a feature. It happens at least with the llama3-8b-q8 and phi3-14b-q5 models. Should langchainrb code around this by checking whether response chunks are complete JSON documents?
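The workaround described above can be sketched as follows. The chunk boundary here is illustrative (Ollama may split the body anywhere); the point is that parsing an individual chunk raises `JSON::ParserError`, while joining all chunks first parses cleanly:

```ruby
require "json"

# Illustrative chunks of one logical JSON response, split mid-document
# the way a large non-streaming Ollama response can arrive.
chunks = [
  '{"model":"llama3","message":{"role":"assistant","content"',
  ':"A long answer..."},"done":true}'
]

# Parsing a single chunk fails, because it is not a complete JSON document.
begin
  JSON.parse(chunks.first)
rescue JSON::ParserError
  # expected for a partial chunk
end

# Joining all chunks first yields valid JSON.
response = JSON.parse(chunks.join)
puts response["message"]["content"]  # => "A long answer..."
```

A more defensive variant could attempt `JSON.parse` on the accumulated buffer after each chunk and only treat it as complete once parsing succeeds, which would handle both single-chunk and multi-chunk responses.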
Inherit from Langchain::LLM::OpenAI?
Since Ollama is compatible with OpenAI's API, wouldn't it be easier to let Langchain::LLM::Ollama inherit from Langchain::LLM::OpenAI, overriding default values where needed?
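A minimal sketch of the suggested inheritance, assuming simplified stand-in classes (these are not langchainrb's actual class definitions). Ollama serves an OpenAI-compatible API at its `/v1` path, so an Ollama subclass would mostly override the base URL and default model:

```ruby
# Stand-in for an OpenAI-style LLM class (hypothetical, simplified).
class OpenAIStyleLLM
  attr_reader :uri_base, :defaults

  def initialize(uri_base: "https://api.openai.com/v1", defaults: {})
    @uri_base = uri_base
    @defaults = { chat_model: "gpt-4o-mini" }.merge(defaults)
  end
end

# Ollama exposes an OpenAI-compatible endpoint, so the subclass only
# swaps the base URL and the default model, inheriting everything else.
class OllamaStyleLLM < OpenAIStyleLLM
  def initialize(defaults: {})
    super(uri_base: "http://localhost:11434/v1",
          defaults: { chat_model: "llama3" }.merge(defaults))
  end
end

llm = OllamaStyleLLM.new
puts llm.uri_base               # => "http://localhost:11434/v1"
puts llm.defaults[:chat_model]  # => "llama3"
```

One caveat with this approach: Ollama's native endpoints (`/api/chat`, `/api/generate`) expose options the OpenAI-compatible layer does not, so inheriting would trade those away for less duplicated code.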