LLukas22 / llm-rs-python

Unofficial python bindings for the rust llm library. 🐍❤️🦀
MIT License

Is streaming supported with langchain AsyncIteratorCallbackHandler? #32

Open AdrianLsk opened 12 months ago

AdrianLsk commented 12 months ago

I am getting no responses when using the langchain AsyncIteratorCallbackHandler callback.

It only gives this warning:


RuntimeWarning: coroutine 'AsyncCallbackManagerForLLMRun.on_llm_new_token' was never awaited
  run_manager.on_llm_new_token(chunk, verbose=self.verbose)
RuntimeWarning: Enable tracemalloc to get the object allocation traceback
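
For context, a minimal sketch of the kind of setup that produces this warning; the `RustformersLLM` import path, its `model_path_or_repo_id` parameter, and the model path are my assumptions, not taken from the issue:

```python
import asyncio

from langchain.callbacks import AsyncIteratorCallbackHandler
from llm_rs.langchain import RustformersLLM  # assumption: wrapper class name


async def main():
    handler = AsyncIteratorCallbackHandler()
    llm = RustformersLLM(
        model_path_or_repo_id="path/to/model.bin",  # hypothetical local path
        callbacks=[handler],
    )

    # Kick off generation in the background; tokens should arrive via the handler.
    task = asyncio.create_task(llm.agenerate(["Write a haiku about Rust."]))

    # Consume streamed tokens as they are produced.
    async for token in handler.aiter():
        print(token, end="", flush=True)

    await task


asyncio.run(main())
```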
LLukas22 commented 12 months ago

I don't think the current langchain wrapper supports async calls, but it shouldn't be too hard to add, as the model.stream() call already releases the GIL internally while generating tokens. You would, however, have to ensure that the model is never used in parallel, as that would probably cause memory-access problems, or simply crash if you offloaded your model onto a GPU.

Do you know which function needs to be implemented by langchain's LLM class to enable async processing?
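
For reference, langchain's LLM base class exposes an async `_acall()` hook that `agenerate()` dispatches to. Below is a minimal sketch of what an async-capable wrapper could look like; the class name, the `model` field, and the assumption that `model.stream()` yields string tokens are all illustrative, not the actual wrapper code:

```python
import asyncio
from typing import Any, List, Optional

from langchain.callbacks.manager import AsyncCallbackManagerForLLMRun
from langchain.llms.base import LLM


class AsyncRustformersLLM(LLM):
    """Hypothetical async-capable wrapper; `model` stands in for the loaded llm-rs model."""

    model: Any  # assumption: exposes a stream() method yielding string tokens

    @property
    def _llm_type(self) -> str:
        return "llm-rs"

    def _call(
        self, prompt: str, stop: Optional[List[str]] = None, **kwargs: Any
    ) -> str:
        # Synchronous path: consume the stream eagerly.
        return "".join(self.model.stream(prompt))

    async def _acall(
        self,
        prompt: str,
        stop: Optional[List[str]] = None,
        run_manager: Optional[AsyncCallbackManagerForLLMRun] = None,
        **kwargs: Any,
    ) -> str:
        loop = asyncio.get_running_loop()
        stream = self.model.stream(prompt)
        tokens: List[str] = []
        while True:
            # model.stream() releases the GIL while generating, so pulling
            # the next token in a worker thread keeps the event loop free.
            token = await loop.run_in_executor(None, next, stream, None)
            if token is None:
                break
            tokens.append(token)
            if run_manager is not None:
                # Awaiting this coroutine is exactly what the sync path
                # misses, hence the "was never awaited" warning above.
                await run_manager.on_llm_new_token(token)
        return "".join(tokens)
```

If multiple requests can arrive concurrently, an asyncio.Lock around the generation loop would serialize access to the model, which addresses the parallel-use concern above.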