i-am-bee / bee-agent-framework

The framework for building scalable agentic applications.
https://i-am-bee.github.io/bee-agent-framework/
Apache License 2.0

`ibm/granite-instruct` preset for the IBMvLLM adapter is vague and resolves incorrect max_sequence_length #190

Open michael-desmond opened 1 day ago

michael-desmond commented 1 day ago

Describe the bug
The `ibm/granite-instruct` preset for the IBMvLLM adapter is not specific to the model being used. I assume it is `ibm-granite/granite-3.0-8b-instruct`, and it should be named as such to avoid confusion.

Assuming the model is `ibm-granite/granite-3.0-8b-instruct`, the backend returns an incorrect max_sequence_length of 8192. The correct max_sequence_length is 4096.

To Reproduce
Steps to reproduce the behavior:

  1. Create a GraniteBeeAgent with the IBMvLLM backend using the preset `ibm/granite-instruct`
  2. Set a breakpoint in the `meta` method of the corresponding `llm.ts`
  3. The result indicates a max_sequence_length of 8192

Expected behavior
A max_sequence_length of 4096.
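To illustrate the kind of fix I have in mind, here is a minimal TypeScript sketch of a model-specific preset table. This is not the framework's actual code; the names (`PresetMeta`, `presets`, `meta`) are hypothetical, and the only facts taken from the issue are the model id and the correct limit of 4096.

```typescript
// Hypothetical sketch only — not the actual IBMvLLM adapter implementation.
interface PresetMeta {
  modelId: string;
  maxSequenceLength: number;
}

// Keying the preset by the concrete model id avoids the ambiguity of a
// generic "ibm/granite-instruct" name (assumption based on this issue).
const presets: Record<string, PresetMeta> = {
  "ibm-granite/granite-3.0-8b-instruct": {
    modelId: "ibm-granite/granite-3.0-8b-instruct",
    maxSequenceLength: 4096, // correct limit per this issue (backend wrongly reports 8192)
  },
};

// Stand-in for the `meta` lookup the real llm.ts performs.
function meta(preset: string): PresetMeta {
  const entry = presets[preset];
  if (!entry) {
    throw new Error(`Unknown preset: ${preset}`);
  }
  return entry;
}
```

With a table like this, breakpointing `meta` for the Granite 3.0 8B preset would surface 4096 rather than the 8192 currently returned.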

Screenshots / Code snippets
[Screenshot: debugger output showing max_sequence_length, 2024-11-21 at 10:24 AM]

Set-up: