Describe the bug
The `ibm/granite-instruct` preset for the IBMvLLM adapter is not specific to the model being used. I assume it is `ibm-granite/granite-3.0-8b-instruct`, and the preset should be named as such to avoid confusion.
Assuming the model is `ibm-granite/granite-3.0-8b-instruct`, the backend returns an incorrect `max_sequence_length` of 8192. The correct `max_sequence_length` is 4096.
To Reproduce
Steps to reproduce the behavior:
Create a GraniteBeeAgent with the IBMvLLM backend using the preset `ibm/granite-instruct`
Set a breakpoint in the `meta` method of the corresponding `llm.ts`
The result will indicate a `max_sequence_length` of 8192
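For illustration, here is a minimal sketch of the kind of preset-to-metadata mapping this report is asking to correct. The names `PresetMeta`, `presets`, and the renamed preset key are hypothetical and for illustration only, not the adapter's actual source; the values reflect the model id and the 4096 context length stated above.

```typescript
// Hypothetical preset table for the IBMvLLM adapter (illustrative names,
// not the real implementation in llm.ts).
interface PresetMeta {
  modelId: string;
  maxSequenceLength: number;
}

const presets: Record<string, PresetMeta> = {
  // Preset renamed to include the full model version, as this report
  // suggests, with the corrected context length (4096, not 8192).
  "ibm/granite-3.0-8b-instruct": {
    modelId: "ibm-granite/granite-3.0-8b-instruct",
    maxSequenceLength: 4096,
  },
};

console.log(presets["ibm/granite-3.0-8b-instruct"].maxSequenceLength); // 4096
```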
Expected behavior
`max_sequence_length` of 4096

Screenshots / Code snippets
Set-up: