Open · supermeng opened 4 months ago
When looking at your reproducer, I noticed that you had `truncate_long_and_double` enabled earlier but have it commented out? When I try running it through the TorchScript frontend with that feature enabled on main, it seems to work fine. Also, if you are tracing to work around TorchScript limitations, you might want to use the dynamo frontend; if you still need TorchScript at the end, you can `torch.jit.trace` the output of Torch-TensorRT, but you will still have access to all the latest features we have been adding.
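For reference, a minimal sketch of the suggested workflow, assuming a recent Torch-TensorRT build where `torch_tensorrt.compile` accepts the `ir`, `inputs`, and `truncate_long_and_double` arguments; the model and shapes below are placeholders, not the original reproducer:

```python
import torch
import torch_tensorrt

# Placeholder model/input standing in for the original reproducer.
model = torch.nn.TransformerEncoderLayer(d_model=64, nhead=4).eval().cuda()
x = torch.randn(8, 16, 64, device="cuda")

# TorchScript frontend with truncate_long_and_double enabled.
trt_ts = torch_tensorrt.compile(
    model,
    ir="ts",
    inputs=[torch_tensorrt.Input(x.shape)],
    truncate_long_and_double=True,
)

# Dynamo frontend; if a TorchScript artifact is still needed downstream,
# torch.jit.trace the compiled module afterwards, as suggested above.
trt_dyn = torch_tensorrt.compile(model, ir="dynamo", inputs=[x])
traced = torch.jit.trace(trt_dyn, x)
```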
@narendasan Hi! Thanks so much for your reply. If I enable `truncate_long_and_double`, there is another dtype mismatch (float with half) error. What confuses me is that there are no double dtypes in any of the tensor calculations. Also, it takes much more time than eager or `torch.jit.trace` mode when I use the dynamo frontend.
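One generic way to track down where unexpected dtypes come from (not from this thread, just a debugging sketch using plain PyTorch hooks) is to run the model eagerly and log any submodule whose output is int64 or float64:

```python
import torch

# Placeholder model/input; substitute the actual attention module.
model = torch.nn.TransformerEncoderLayer(d_model=64, nhead=4).eval()
x = torch.randn(8, 16, 64)

def log_dtypes(name):
    def hook(module, inputs, output):
        outs = output if isinstance(output, tuple) else (output,)
        for o in outs:
            if torch.is_tensor(o) and o.dtype in (torch.int64, torch.float64):
                print(f"{name}: produces {o.dtype}")
    return hook

handles = [m.register_forward_hook(log_dtypes(n)) for n, m in model.named_modules()]
with torch.no_grad():
    model(x)
for h in handles:
    h.remove()
```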
There may be int64 types in your code (including things like indexing) which require the use of that setting.
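A few common ops that silently produce int64 ("Long") tensors even when no integer tensors are created explicitly, which is typically what triggers the need for `truncate_long_and_double`:

```python
import torch

x = torch.randn(4, 8)
idx = x.argmax(dim=-1)            # argmax returns torch.int64 indices
pos = torch.arange(x.shape[-1])   # arange with integer args defaults to torch.int64
nz = (x > 0).nonzero()            # nonzero also returns torch.int64
print(idx.dtype, pos.dtype, nz.dtype)  # all torch.int64
```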
Unable to freeze tensor of type Int64/Float64 into constant layer, try to compile model with truncate_long_and_double enabled
When I try to test the Transformer attention layer with TensorRT, I get the error above. I did check both the sample input tensors and the inputs passed to trt.compile; there are no double tensors.
To Reproduce
Steps to reproduce the behavior:
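The original reproduction code is not included here (only the attached log). Below is a hypothetical minimal sketch of the kind of setup described, an attention layer compiled with half precision and with `truncate_long_and_double` left disabled, which could run into a similar freeze error; module names and shapes are illustrative only:

```python
import torch
import torch_tensorrt

class AttnWrapper(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.attn = torch.nn.MultiheadAttention(embed_dim=64, num_heads=4)
    def forward(self, x):
        out, _ = self.attn(x, x, x)
        return out

model = AttnWrapper().half().eval().cuda()
x = torch.randn(8, 2, 64, dtype=torch.half, device="cuda")

# int64 index tensors inside the attention implementation are the kind of
# thing that can trigger the "Unable to freeze tensor of type Int64/Float64"
# error reported above when the flag below is left off.
trt_model = torch_tensorrt.compile(
    model,
    ir="ts",
    inputs=[torch_tensorrt.Input(x.shape, dtype=torch.half)],
    enabled_precisions={torch.half},
    # truncate_long_and_double=True,  # enabling this is the suggested fix
)
```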
Expected behavior
Code runs correctly.
Environment
How you installed PyTorch (conda, pip, libtorch, source): pip
Additional context
tensor_rt_attn.log