svengau opened this issue 3 months ago
Hey @svengau,
Thanks for bringing this to our attention and for the detailed report! It seems you've encountered an issue where the ChatVertexAI class from @langchain/google-vertexai returns only the first chunk of the expected response. This is indeed not the intended behavior for most use cases, as you'd typically want the full response to your query.
Your workaround using the streaming approach is a smart move and effectively bypasses the issue by manually concatenating the response chunks. This method ensures that you receive the complete response as intended. However, I understand that having to resort to such workarounds is not ideal.
I've taken a closer look at the issue, and it appears to be related to how the invoke method handles response chunks. Ideally, the method should aggregate all chunks into a single response before returning, but it currently stops after the first chunk. This needs to be addressed in the library's code.
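The intended aggregation can be sketched as follows. Note that `Chunk`, `aggregateChunks`, and `fakeStream` below are hypothetical stand-ins to illustrate the pattern, not the actual @langchain/google-vertexai internals:

```typescript
// Hypothetical sketch of the aggregation invoke() should perform.
// `Chunk` is a stand-in for the library's message-chunk type; this is
// NOT the actual @langchain/google-vertexai implementation.
interface Chunk {
  content: string;
}

// Consume the whole underlying stream and merge every chunk,
// instead of returning after the first one.
async function aggregateChunks(stream: AsyncIterable<Chunk>): Promise<Chunk> {
  let combined: Chunk = { content: "" };
  for await (const chunk of stream) {
    combined = { content: combined.content + chunk.content };
  }
  return combined;
}

// Fake stream standing in for the model's streamed response.
async function* fakeStream(): AsyncIterable<Chunk> {
  yield { content: "Hello" };
  yield { content: ", " };
  yield { content: "world" };
}

aggregateChunks(fakeStream()).then((msg) => console.log(msg.content));
// logs "Hello, world"
```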
While I cannot directly create a pull request or issue due to my limitations, I highly recommend reporting this behavior as a bug in the @langchain/google-vertexai repository. When reporting, please include the details you've shared here along with your temporary workaround; this will help the maintainers fully understand the issue and expedite a fix.
In the meantime, continue using your streaming workaround for a complete response. Keep an eye on the repository for updates and potential fixes to this issue. The LangChain.js community is quite responsive, so I'm hopeful a solution will be available soon.
If you have any more questions or run into other issues, feel free to reach out. I'm here to help!
Best, Dosu
Hmmm... that almost seems consistent, since Vertex only has streaming, but it should have combined things. And it probably shouldn't be returning a chunk.
@jacoblee93 - Please assign this to me.
Checked other resources
Example Code
Error Message and Stack Trace (if applicable)
N/A
Description
I've tried the example provided in the docs, and the invoke method always returns only the first chunk of what could be a stream. As a workaround, I use:
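A streaming workaround along these lines can be sketched as follows (this is an illustrative sketch, not the reporter's exact code; in real use `model` would be a ChatVertexAI instance from @langchain/google-vertexai, but a stub stands in here so the snippet is self-contained):

```typescript
// Sketch of the streaming workaround: call stream() instead of invoke()
// and concatenate the chunks manually, recovering the full response
// even when invoke() stops after the first chunk.
interface ChunkLike {
  content: string;
}

interface StreamingModel {
  stream(input: string): Promise<AsyncIterable<ChunkLike>>;
}

async function invokeViaStream(model: StreamingModel, input: string): Promise<string> {
  let text = "";
  for await (const chunk of await model.stream(input)) {
    text += chunk.content;
  }
  return text;
}

// Stub model that yields the response in two chunks, mimicking
// a streamed Vertex AI reply.
const stubModel: StreamingModel = {
  async stream(_input: string) {
    async function* gen(): AsyncIterable<ChunkLike> {
      yield { content: "The sky appears blue because " };
      yield { content: "of Rayleigh scattering." };
    }
    return gen();
  },
};

invokeViaStream(stubModel, "Why is the sky blue?").then(console.log);
```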
System Info
langchain: 0.1.30
@langchain/community: 0.0.44
@langchain/core: 0.1.53
@langchain/google-vertexai: 0.0.2
@langchain/openai: 0.0.25
MacOS Sonoma 14.2.1 / Apple M1 Pro
node: 20.10.0
yarn: 1.22.1