IBM / text-generation-inference

IBM development fork of https://github.com/huggingface/text-generation-inference
Apache License 2.0

Improve log messages around the max sequence length #103

Closed by maxdebayser 4 months ago

maxdebayser commented 4 months ago

Motivation

The existing log messages around the maximum sequence length were confusing to users.

Modifications

In the router, the error message was rephrased to make it more understandable for users who aren't familiar with the internals.

In the server, we now print the maximum possible sequence length, capped by the model's own sequence length limit. The previous message showed how many output tokens would fit into memory if you passed max_sequence_length input tokens, and vice versa. I don't know what I was thinking when I wrote that.
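For illustration, the corrected computation could look something like the sketch below. This is a minimal example assuming a memory budget already expressed in tokens; the function and parameter names are hypothetical and do not come from the actual server code.

```python
# Hypothetical sketch of the corrected server-side log message.
# Assumes the available memory budget has already been converted
# into a token count; none of these names are from the real codebase.

def log_max_sequence_length(memory_capacity_tokens: int,
                            model_max_sequence_length: int) -> None:
    """Print the longest sequence (input + output tokens) a request can use."""
    # The usable length is bounded both by available memory and by the
    # model's architectural limit, whichever is smaller.
    max_possible = min(memory_capacity_tokens, model_max_sequence_length)
    print(
        f"Maximum possible sequence length (input + output tokens): "
        f"{max_possible}"
    )

log_max_sequence_length(memory_capacity_tokens=6144,
                        model_max_sequence_length=4096)
# -> Maximum possible sequence length (input + output tokens): 4096
```

Reporting a single combined limit avoids the old message's implication that input and output budgets were independent quantities.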

Related Issues

https://github.ibm.com/ai-foundation/watson-fm-stack-tracker/issues/958