Closed srivassid closed 3 weeks ago
Updating Transformer Engine inside the Docker image and building a new image with the change solved the issue. The `pytorch-24.01` image ships Transformer Engine 1.2.0, whereas the latest stable TE release is 1.6.0.
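For anyone hitting the same error, a minimal sketch of the rebuild, assuming the NGC PyTorch 24.01 base image and the `transformer_engine` pip package name (verify both against NVIDIA's Transformer Engine install docs for your CUDA version):

```dockerfile
# Hypothetical Dockerfile: start from the NGC image that ships TE 1.2.0
FROM nvcr.io/nvidia/pytorch:24.01-py3

# Upgrade Transformer Engine to 1.6.0; the package name and [pytorch]
# extra are assumptions -- check NVIDIA's install instructions.
RUN pip install --no-cache-dir --upgrade "transformer_engine[pytorch]==1.6.0"
```

Then build and run the server from the new image, e.g. `docker build -t megatron-te-1.6 .`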
I have trained a model for 800 iterations, just for testing purposes, and I am trying to run inference on it, but the server crashes.
I have started the server, but when I run

```
tools/text_generation_cli.py localhost:5000
```

it asks me for a prompt, but then the server crashes with:

```
AttributeError: 'InferenceParams' object has no attribute 'max_sequence_len'
```
Can anyone help me out?
Thanks