With our LayerNorm fp32 plugin, the model's accuracy is normal. With fp16 Myelin, accuracy drops by more than 20%.
--layerPrecisions=LayerNormalization_*:fp32
Could you please try expanding the "*"? I'm not sure whether we support this kind of wildcard. You should be able to find the fp16 LayerNorm layer names in the verbose log; TRT should give you a warning that running LayerNorm under fp16 will affect accuracy.
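For example, if the verbose log shows layers named LayerNormalization_12 and LayerNormalization_48 (illustrative names; take the real ones from your own log), the expanded flags would look like:

--layerPrecisions=LayerNormalization_12:fp32,LayerNormalization_48:fp32 --layerOutputTypes=LayerNormalization_12:fp32,LayerNormalization_48:fp32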
Closing since there has been no activity for more than 3 weeks. Please reopen if you still have questions. Thanks all!
I'm having the same problem. Have you solved it?
I want to set the precision of the Add_3244 node to fp32, but it's wrapped in Myelin. Commands or scripts:
trtexec \
  --onnx=$onnx_path \
  --saveEngine=$engine_path \
  --plugins=$plugins_path \
  --verbose --workspace=2048 \
  --exportProfile=${engine_path}.profile.json \
  --exportLayerInfo=${engine_path}.graph.json \
  --profilingVerbosity=detailed \
  --fp16 \
  --precisionConstraints=obey \
  --layerPrecisions=Add_3244:fp32 --layerOutputTypes=Add_3244:fp32
@zerollzeng
@Smarter-version Perhaps you can try setting specific layers as output layers. Since TRT's output layer must be fp32, this indirectly achieves the goal of setting these layers to fp32.
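In the TensorRT Python API, a rough sketch of that workaround could look like the following. The layer name Add_3244 comes from the comment above, the model path is a placeholder, and note that layer names in the parsed network may not match the ONNX node names exactly:

import tensorrt as trt

logger = trt.Logger(trt.Logger.VERBOSE)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

with open("model.onnx", "rb") as f:  # placeholder path
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise SystemExit(1)

# Mark the problem layer's output as a network output; outputs are kept
# out of Myelin fusion, which indirectly pins the layer's precision.
for i in range(network.num_layers):
    layer = network.get_layer(i)
    if layer.name == "Add_3244":  # name taken from the comment above
        network.mark_output(layer.get_output(0))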
Thank you for your reply! I tried adding the outputs of some layers as outputs of the ONNX model and then converting it to an engine, but it fails with the error "No value info found for tensor".
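One possible fix for that error, assuming it is caused by the intermediate tensor lacking a value_info entry: run ONNX shape inference first so value_info is populated, then append the inferred entry to graph.output. A minimal sketch (the tensor name and file paths are hypothetical):

import onnx

model = onnx.load("model.onnx")  # placeholder path
inferred = onnx.shape_inference.infer_shapes(model)

target = "Add_3244_output"  # hypothetical tensor name; check your graph
for vi in inferred.graph.value_info:
    if vi.name == target:
        inferred.graph.output.append(vi)
        break
else:
    raise RuntimeError(f"value_info still missing for {target}")

onnx.save(inferred, "model_with_extra_output.onnx")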
Description
I tried to convert an ONNX model to a TRT engine on an A100 with LayerNorm explicitly set to fp32, but the whole transformer block was wrapped into a Myelin layer in which the final precision of LayerNorm was still fp16.
detailed log: cvt.log
Environment
TensorRT Version: 8.6.1 (v8611)
NVIDIA GPU: NVIDIA A100-SXM4-40GB
NVIDIA Driver Version:
CUDA Version:
CUDNN Version:
Operating System:
Python Version (if applicable):
Tensorflow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if so, version):
Relevant Files
Model link:
Steps To Reproduce
Commands or scripts:
trtexec \
  --onnx=model/nvln.onnx \
  --fp16 --noTF32 \
  --saveEngine=model/nvln.exec.fp16.trt \
  --layerPrecisions=LayerNormalization_*:fp32,Softmax_*:fp32,Conv_0:fp32 \
  --layerOutputTypes=LayerNormalization_*:fp32,Softmax_*:fp32,Conv_0:fp32 \
  --precisionConstraints=obey \
  --timingCacheFile=x86.tc \
  --exportLayerInfo=nvln.fp16.json --exportProfile=nvln.fp16.profile.json \
  --profilingVerbosity=detailed --dumpProfile --verbose
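To check whether the precision constraints were honored, the JSON emitted by --exportLayerInfo can be scanned for the LayerNormalization layers' output datatypes. A minimal sketch, assuming the layer-info schema trtexec 8.6 emits ("Layers" list with "Name" and "Outputs" entries carrying "Format/Datatype"); verify against your own nvln.fp16.json:

import json

with open("nvln.fp16.json") as f:
    info = json.load(f)

for layer in info.get("Layers", []):
    if not isinstance(layer, dict):
        continue
    name = layer.get("Name", "")
    if "LayerNormalization" in name:
        # Print each output's datatype to see if fp32 was actually used.
        dtypes = [out.get("Format/Datatype") for out in layer.get("Outputs", [])]
        print(name, dtypes)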
Have you tried the latest release?:
Can this model run on other frameworks? For example run ONNX model with ONNXRuntime (polygraphy run <model.onnx> --onnxrt):