NVIDIA / FasterTransformer

Transformer related optimization, including BERT, GPT
Apache License 2.0
5.78k stars, 882 forks

GptNeoX fails to stop inference when encountering end_id #631

Closed SeibertronSS closed 1 year ago

SeibertronSS commented 1 year ago

I accelerated GptNeoX with FasterTransformer (FT) and deployed it with Triton. However, I found that GptNeoX does not stop inference when it encounters end_id.

hmzo commented 1 year ago

Hello, have you fixed this bug? @SeibertronSS

SeibertronSS commented 1 year ago

> Hello, have you fixed this bug? @SeibertronSS

Just use the latest FasterTransformer code.
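
For anyone who cannot update right away, here is a minimal client-side sketch that cuts the returned tokens at the first end_id. It assumes the output token ids arrive as a plain list of ints and that you know the model's end_id. The function name truncate_at_end_id is hypothetical and not part of FasterTransformer or Triton, and the end_id value 0 in the usage line is only illustrative.

```python
from typing import List


def truncate_at_end_id(output_ids: List[int], end_id: int) -> List[int]:
    """Return the tokens before the first end_id (hypothetical helper).

    If end_id never appears, the sequence is returned unchanged.
    """
    try:
        # list.index raises ValueError when end_id is absent
        return output_ids[: output_ids.index(end_id)]
    except ValueError:
        return list(output_ids)


# Illustrative usage; end_id=0 is an assumed value, check your tokenizer.
generated = truncate_at_end_id([5, 7, 0, 9, 9], end_id=0)
```

This only trims the result after the fact. The server still generates up to the requested output length, so it does not save any inference time.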