Motivation
The existing messages were confusing to the users.
Modifications
In the router, the error message was rephrased to make it easier to understand for users who aren't familiar with the internals.
In the server, we now print the maximum possible sequence length, capped by the model's sequence length. The previous message showed how many output tokens fit into memory if you pass max_sequence_length input tokens, and vice versa. I don't know what I was thinking when I wrote that.
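As a rough illustration of the new server message, the reported value can be thought of as the memory-derived bound capped by the model limit. This is a minimal sketch, not the code in this PR: the function name `reported_max_sequence_length`, the parameter `memory_limit_tokens`, and the message wording are all hypothetical.

```python
def reported_max_sequence_length(
    memory_limit_tokens: int, model_max_sequence_length: int
) -> int:
    """Return the longest sequence to report to the user.

    Both parameter names are hypothetical: memory_limit_tokens stands for
    the number of tokens that fit into memory, and
    model_max_sequence_length for the model's own sequence limit.
    """
    # The reported value never exceeds what the model itself supports.
    return min(memory_limit_tokens, model_max_sequence_length)


# Memory could hold more tokens than the model allows: the model limit wins.
print(f"Maximum possible sequence length: {reported_max_sequence_length(10_000, 4096)}")

# Memory is the tighter bound: the memory limit wins.
print(f"Maximum possible sequence length: {reported_max_sequence_length(2048, 4096)}")
```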
Related Issues
https://github.ibm.com/ai-foundation/watson-fm-stack-tracker/issues/958