GptNeoX fails to stop inference when encountering end_id

NVIDIA / FasterTransformer

Status: Closed (closed by SeibertronSS 1 year ago)

SeibertronSS commented 1 year ago

I accelerated GptNeoX with FasterTransformer (FT) and deployed it with Triton. However, generation does not stop when the model emits end_id; it keeps producing tokens past it.

hmzo commented 1 year ago

hello, have you fixed this bug? @SeibertronSS

SeibertronSS commented 1 year ago

> hello, have you fixed this bug? @SeibertronSS

Just use the latest FasterTransformer code.
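
For anyone who cannot upgrade right away, one possible client-side workaround is to cut the returned token ids at the first end_id before decoding. This is only a minimal sketch in plain Python: it does not call any FasterTransformer or Triton API, the helper names `truncate_at_end_id` and `truncate_batch` are hypothetical, and the end_id value used in the example is an assumption (take the real one from your tokenizer or model config). It also does not save compute, since the server still generates the extra tokens.

```python
from typing import List, Sequence


def truncate_at_end_id(
    token_ids: Sequence[int], end_id: int, keep_end: bool = False
) -> List[int]:
    """Return the tokens up to the first occurrence of end_id.

    If end_id never appears, the sequence is returned unchanged.
    Set keep_end=True to keep the end_id token itself in the result.
    """
    for i, tok in enumerate(token_ids):
        if tok == end_id:
            stop = i + 1 if keep_end else i
            return list(token_ids[:stop])
    return list(token_ids)


def truncate_batch(batch: Sequence[Sequence[int]], end_id: int) -> List[List[int]]:
    """Apply truncate_at_end_id to every sequence in a batch of outputs."""
    return [truncate_at_end_id(seq, end_id) for seq in batch]


if __name__ == "__main__":
    END_ID = 2  # assumed value, for illustration only
    raw_outputs = [[5, 7, 2, 9, 9], [4, 4, 4]]
    print(truncate_batch(raw_outputs, END_ID))
```

Apply this to the output ids returned by the server before passing them to the tokenizer's decode step.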