First, thank you for the great work!
I was playing with the t5 notebook in demo/generative-model. I built a Docker image through the Makefile and ran the notebook from the container.
I made only small changes to the notebook (a few print statements). With t5-small it runs fine, but when I switch to t5-large, the translation result in the Benchmark section comes out empty. I printed the generated tokens, and the result is:
Tokens generated by ONNX:
```
tensor([0, 2, 0, 1], device='cuda:0')
```
which is obviously not correct.
I attach the notebook here for your reference. I suspect there may be numerical-stability issues in the fp16 conversion, since that method depends on randomly generated data.
My experiment was running on a g5.2xlarge instance.
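For context on the fp16 suspicion: this is only a sketch of the overflow mechanism, not a diagnosis of the notebook itself, but the larger T5 checkpoints are known to produce activations that exceed the fp16 representable range (about 65504), and an overflow to inf/NaN in the logits would match the symptom of degenerate output tokens:

```python
import numpy as np

# fp16 can only represent magnitudes up to ~65504; values beyond that
# overflow to inf, and inf propagating through softmax/logits yields
# NaNs and degenerate token ids during generation.
activation = np.float32(70000.0)  # hypothetical fp32 activation magnitude
overflowed = np.float16(activation)

print(overflowed)                  # inf
print(np.finfo(np.float16).max)    # 65504.0, the fp16 ceiling
```

If the conversion decides per-node precision from randomly generated calibration inputs, a calibration batch that never triggers these large activations could leave overflow-prone nodes in fp16, which would explain why t5-small is unaffected while t5-large breaks.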
@brevity2021 we are working on adding support for T5 conversion via the convert script. It should handle precision correctly for the different T5 models (including t5-large).