Closed auxon closed 1 year ago
Still working on this because if the cache is enabled, the full answer is returned even if tokens were streamed (due to cache miss). Need to figure out a good way to disable final answer output when tokens were streamed (cached missed) without introducing unwanted dependencies.
@auxon maybe something simple like a flag? If the on_llm_new_token method is hit, you toggle the flag. If the flag is true, you can disable the final answer.
Yes, I think this is the simplest and perhaps the only way to really solve it without introducing unnecessary dependencies, so I will try that ASAP today.
Cool! No rush!
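For reference, the flag idea discussed above could look roughly like the sketch below. It is not the Lanarky implementation: `StreamingHandlerSketch`, `simulate`, the `sent` list and the `"answer"` output key are hypothetical stand-ins, and only the method names `on_llm_new_token` and `on_chain_end` follow LangChain's callback naming.

```python
import asyncio


class StreamingHandlerSketch:
    """Hypothetical stand-in for an async streaming callback handler."""

    def __init__(self) -> None:
        self.streamed = False  # flips to True once any token is streamed
        self.sent = []  # everything that would be sent to the client

    async def on_llm_new_token(self, token: str) -> None:
        # Cache miss: the LLM is actually called, so tokens arrive here.
        self.streamed = True
        self.sent.append(token)

    async def on_chain_end(self, outputs: dict) -> None:
        # Cache hit: no token was streamed, so send the full answer once.
        # Cache miss: tokens were already streamed, so skip the final answer.
        if not self.streamed:
            self.sent.append(outputs["answer"])


async def simulate(cache_hit: bool) -> list:
    """Drive the handler the way a chain run would (simplified assumption)."""
    handler = StreamingHandlerSketch()
    if not cache_hit:
        for token in ["Hello", " ", "world"]:
            await handler.on_llm_new_token(token)
    await handler.on_chain_end({"answer": "Hello world"})
    return handler.sent


if __name__ == "__main__":
    print(asyncio.run(simulate(cache_hit=True)))
    print(asyncio.run(simulate(cache_hit=False)))
```

The flag lives on the handler itself, so no extra dependency is needed to tell the two cases apart.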
Ran into an issue trying to add a field to the base AsyncLanarkyCallback, which seems to be due to pydantic 1.10.12 (it works fine with the latest pydantic). Here is a minimal repro:
from pydantic import BaseModel

class Base():
    pass

class Derived(Base, BaseModel):
    def __init__(self, **data):
        super().__init__(**data)
        print("Derived.__init__() called")
        self._myVar = True

    @property
    def myVar(self):
        return self._myVar

    @myVar.setter
    def myVar(self, value):
        self._myVar = value

class SubDerived(Derived):
    pass

if __name__ == "__main__":
    v = SubDerived()
    print(v.myVar)
If you use pydantic 1.10.12, which LangChain and Lanarky use now, you get this exception:
Exception has occurred: ValueError (note: full exception trace is shown but execution is paused at: <module>)
"SubDerived" object has no field "_myVar"
  File "/Users/rah/repos/pythonTests/inheritanceTest.py", line 15, in __init__
    self._myVar = True
    ^^^^^^^^^^^
  File "/Users/rah/repos/pythonTests/inheritanceTest.py", line 31, in <module> (Current frame)
    v = SubDerived()
    ^^^^^^^^^^^^
ValueError: "SubDerived" object has no field "_myVar"
Fixed the issue with adding a settable property by simply declaring it as a pydantic field instead:
class AsyncLanarkyCallback(AsyncCallbackHandler, BaseModel):
    """Async Callback handler for FastAPI StreamingResponse."""

    llm_cache_used: bool = Field(default_factory=lambda: langchain.llm_cache is not None)
    ...
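As a side note, here is a small sketch of why declaring the attribute works where assigning `self._myVar` in `__init__` did not: pydantic 1.x rejects assignment to names it does not know about, while declared fields and `PrivateAttr` attributes are accepted. The class and attribute names below (`HandlerSketch`, `cache_used`, `_streamed`) are made up for illustration and are not the Lanarky code. The sketch also avoids a property setter and assigns the private attribute directly, since that should behave the same on pydantic 1.x and 2.x.

```python
from pydantic import BaseModel, Field, PrivateAttr


class HandlerSketch(BaseModel):
    # Declared field: pydantic knows the name, so later assignment is allowed.
    cache_used: bool = Field(default_factory=lambda: False)

    # Private attribute: not part of the model's fields, but still assignable.
    _streamed: bool = PrivateAttr(default=False)

    def mark_streamed(self) -> None:
        # Direct assignment to a declared private attribute is accepted.
        self._streamed = True

    @property
    def streamed(self) -> bool:
        return self._streamed


class SubHandlerSketch(HandlerSketch):
    """Subclass, mirroring SubDerived in the repro above."""


if __name__ == "__main__":
    h = SubHandlerSketch()
    h.cache_used = True
    h.mark_streamed()
    print(h.cache_used, h.streamed)
```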
Unfortunately it took way too long to resolve. LOL
Closing this to make a new one.
Description
When langchain.llm_cache was set and the answer came from the cache, the async streaming callbacks streamed no tokens and also did not return the final answer. This PR fixes the issue and updates some tests to check that, when caching is enabled, the answer is returned by the callbacks.
Fixes #133
Changelog:
Fixed Async callback handling when streaming and cache is enabled to return final answer.
Added settings.json and .flake8 config to .gitignore.