Closed HenryYuen128 closed 8 months ago
Hello - could you try compiling with Torch-TensorRT 1.4 or main to see if the values have changed, since the TensorRT version and our converters are updated in versions newer than 1.3? Additionally, try calling model.half() prior to compilation to see if the accuracy results change.
@narendasan - could the accuracy issue be related to the subnormal weights/overflow?
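For reference, one way to test the subnormal/overflow hypothesis is to scan the model's parameters for values that would fall outside the FP16 range. This is a minimal sketch; the helper name and the reporting format are my own, not part of Torch-TensorRT:

```python
import torch

# FP16 range: smallest normal magnitude is 2**-14 (~6.1e-5); largest finite is 65504.
FP16_MIN_NORMAL = 2.0 ** -14
FP16_MAX = 65504.0

def check_fp16_range(model: torch.nn.Module) -> dict:
    """Return, per parameter, how many values would be subnormal or overflow in FP16."""
    report = {}
    for name, param in model.named_parameters():
        vals = param.detach().abs()
        nonzero = vals[vals > 0]
        subnormal = int((nonzero < FP16_MIN_NORMAL).sum())
        overflow = int((vals > FP16_MAX).sum())
        if subnormal or overflow:
            report[name] = (subnormal, overflow)
    return report
```

Parameters flagged here lose precision (or become inf) under model.half(), so a non-empty report would support the subnormal-weights explanation.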
Hi @narendasan @gs-olive, I have tried compiling with Torch-TensorRT 1.4 and still get different results compared to running in FP32. When calling model.half() prior to compilation, a RuntimeError is raised: expected scalar type Half but found Float.
The code snippet:

import torch
import torch_tensorrt

model = torch.jit.load(path)
model.half()
model.to(device)
# The model needs to be in evaluation mode
model.eval()

# enabled_precisions = {torch.half}     # run with 16-bit precision
# enabled_precisions = {torch.float32}  # run with 32-bit precision
enabled_precisions = {torch.half}

trt_model = torch_tensorrt.compile(
    model,
    inputs=inputs,
    enabled_precisions=enabled_precisions,
    truncate_long_and_double=True,
    require_full_compilation=False,
    debug=True,
)
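For what it's worth, that RuntimeError typically appears when the example inputs are still FP32 while the module has been converted with model.half(). A minimal sketch of casting the sample inputs to match - the to_half helper is hypothetical and the BERT-style shapes are only illustrative; note that integer token ids must stay integral:

```python
import torch

def to_half(example_inputs):
    """Cast floating-point example inputs to FP16; leave integer tensors alone."""
    return [t.half() if t.is_floating_point() else t for t in example_inputs]

# BERT-style sample inputs: int64 token ids plus a float attention mask
inputs = to_half([
    torch.randint(0, 30522, (1, 128)),  # input_ids stay torch.int64
    torch.ones(1, 128),                 # attention mask becomes torch.float16
])
```

With the inputs in FP16, running the halved module should no longer raise the Half/Float mismatch; whether it also closes the accuracy gap is a separate question.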
This issue has not seen activity for 90 days. Remove the stale label or comment, or this will be closed in 10 days.
Got the same issue.
❓ Question
BERT text classification model run in FP16 gets hugely different results compared to FP32.
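One way to quantify "hugely different results" is to compare the two runs elementwise. A small sketch (shown on plain tensors; the model handles that would produce them are not included here):

```python
import torch

def max_abs_diff(a: torch.Tensor, b: torch.Tensor) -> float:
    """Largest elementwise deviation between two model outputs."""
    return (a.float() - b.float()).abs().max().item()

# Illustration: a simple round-trip through FP16 introduces only a small error,
# so a large max_abs_diff between the FP32 and FP16 runs points at the pipeline.
logits = torch.randn(1, 8)
diff = max_abs_diff(logits, logits.half())
```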
What you have already tried
Environment
How you installed PyTorch (conda, pip, libtorch, source): pip
Torch-TensorRT Version: 1.3
Additional context
Model converted from TorchScript to TensorRT
The logs