SeibertronSS closed 1 year ago
I accelerated GptNeoX with FasterTransformer (FT) and deployed it with Triton, but I found that GptNeoX does not stop inference when it encounters end_id.
Hello, have you fixed this bug? @SeibertronSS
Just use the latest FasterTransformer code.
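If upgrading is not immediately possible, one possible client-side workaround is to cut the returned token IDs at the first end_id before detokenizing. The snippet below is only a minimal sketch in plain Python, not part of FasterTransformer or Triton: the helper name `truncate_at_end_id`, the `prompt_len` parameter, and the sample token values are hypothetical, and it assumes the output is available as a flat list of token IDs. It does not save any compute, since the model still generates up to the requested output length.

```python
from typing import List, Sequence


def truncate_at_end_id(
    token_ids: Sequence[int], end_id: int, prompt_len: int = 0
) -> List[int]:
    """Return token_ids cut just before the first end_id after the prompt.

    Hypothetical helper for illustration; prompt_len skips the input
    tokens so an end_id inside the prompt is not treated as a stop.
    """
    for i in range(prompt_len, len(token_ids)):
        if token_ids[i] == end_id:
            return list(token_ids[:i])
    # No end_id found: keep the whole sequence.
    return list(token_ids)


# Made-up token IDs: generation ran past end_id (0 here).
output_ids = [12, 7, 99, 0, 41, 41, 41]
print(truncate_at_end_id(output_ids, end_id=0))  # prints [12, 7, 99]
```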