neuralmagic / deepsparse

Sparsity-aware deep learning inference runtime for CPUs
https://neuralmagic.com/deepsparse/

[TextGen] Make score and finished_reason fully optional #1418

Closed mgoin closed 10 months ago

mgoin commented 10 months ago

I built a new HF Space and hit this error:

  File "/home/user/app/app.py", line 150, in generate
    for token in inference:
  File "/home/user/.local/lib/python3.10/site-packages/deepsparse/transformers/pipelines/text_generation.py", line 568, in _stream_engine_outputs
    generation = self._create_generated_text_output(
  File "/home/user/.local/lib/python3.10/site-packages/deepsparse/transformers/pipelines/text_generation.py", line 556, in _create_generated_text_output
    return GeneratedText(
  File "/home/user/.local/lib/python3.10/site-packages/pydantic/main.py", line 164, in __init__
    __pydantic_self__.__pydantic_validator__.validate_python(data, self_instance=__pydantic_self__)
pydantic_core._pydantic_core.ValidationError: 1 validation error for GeneratedText
finished_reason
  Field required [type=missing, input_value={'text': '<s>', 'score': None, 'finished': False}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.5/v/missing

It seems to be because the else case of _create_generated_text_output doesn't pass a finished_reason to GeneratedText:

    def _create_generated_text_output(
        self,
        sequence: str,
        finish_reason: Optional[FinishReason] = None,
        logits: Optional[numpy.array] = None,
    ):
        if finish_reason:
            # Generation is done: finished_reason is always set here.
            return GeneratedText(
                text=sequence,
                score=logits,
                finished=True,
                finished_reason=finish_reason.value,
            )
        # Mid-generation (e.g. a streamed token): finished_reason is never
        # passed, so Pydantic v2 reports it as a missing required field.
        return GeneratedText(
            text=sequence,
            score=logits,
            finished=False,
        )
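
The failure is easy to reproduce directly against the schema (a minimal sketch, assuming deepsparse before this PR with Pydantic v2 installed; the constructor call mirrors the input_value shown in the traceback):

# Same field values as the failing call in _stream_engine_outputs above.
from deepsparse.transformers.pipelines.text_generation import GeneratedText

GeneratedText(text="<s>", score=None, finished=False)
# pydantic_core._pydantic_core.ValidationError: 1 validation error for GeneratedText
# finished_reason
#   Field required [type=missing, ...]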

finished_reason is typed as Optional, but in Pydantic v2 an Optional annotation no longer implies a default, so the field is still required whenever it isn't passed. In this PR I added a default value of None to both score and finished_reason, and this fixes the error:

from typing import Any, Optional

from pydantic import BaseModel, Field


class GeneratedText(BaseModel):
    text: str = Field(
        description="The generated sequence for a given prompt. If "
        "streaming is enabled, this will be the next generated token."
    )
    score: Optional[Any] = Field(
        default=None,
        description="The score for the generated token or sequence. "
        "The scores have the shape [sequence_length, vocab_size]",
    )
    finished: bool = Field(description="Whether generation has stopped.")
    finished_reason: Optional[str] = Field(
        default=None,
        description="The reason for generation to stop. "
        "Defined by FinishReason. One of stop, length, or time.",
    )
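
For reference, this is standard Pydantic v2 behavior: an Optional annotation only allows None as a value, it does not make the field optional on its own. A small standalone sketch (independent of deepsparse; the model names are just for illustration) showing the difference:

from typing import Optional

from pydantic import BaseModel, Field, ValidationError


class WithoutDefault(BaseModel):
    # Optional annotation only permits None; the field is still required.
    finished_reason: Optional[str] = Field(description="reason generation stopped")


class WithDefault(BaseModel):
    # An explicit default makes the field truly optional.
    finished_reason: Optional[str] = Field(
        default=None, description="reason generation stopped"
    )


try:
    WithoutDefault()
except ValidationError as err:
    print(err)  # finished_reason: Field required [type=missing, ...]

print(WithDefault())  # finished_reason=None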