Open YixuanSeanZhou opened 1 month ago
For reference, try the following command:
polygraphy run spec.onnx --trt --best --trt-outputs mark all
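Once every layer output is marked, a common way to narrow down an accuracy regression is to compare the per-layer tensors from two runs and find the first one that diverges. The following is only a minimal sketch of that idea: the function name `first_divergence`, the dict-of-lists data format, and the toy values are all hypothetical and are not part of Polygraphy or TensorRT. Note also that tensors may be missing from one run (for example, if they were fused away), so the sketch simply skips them.

```python
import math


def first_divergence(reference, candidate, atol=1e-3, rtol=1e-3):
    """Return (name, max_abs_diff) for the first tensor that differs, else None.

    reference / candidate: dicts mapping tensor name -> flat list of floats,
    inserted in execution order (hypothetical format, not Polygraphy's own).
    """
    for name, ref_values in reference.items():
        if name not in candidate:
            continue  # tensor absent in the candidate run (e.g. fused away)
        cand_values = candidate[name]
        if len(ref_values) != len(cand_values):
            return name, math.inf  # shape mismatch counts as a divergence
        worst = 0.0
        mismatch = False
        for r, c in zip(ref_values, cand_values):
            diff = abs(r - c)
            worst = max(worst, diff)
            if diff > atol + rtol * abs(r):
                mismatch = True
        if mismatch:
            return name, worst
    return None


# Toy data standing in for per-layer outputs dumped from two runs.
fp32_outputs = {"conv1": [0.10, 0.20], "relu1": [0.10, 0.20], "fc": [1.00, 2.00]}
int8_outputs = {"conv1": [0.10, 0.20], "relu1": [0.10, 0.26], "fc": [1.40, 2.50]}

# Reports the earliest tensor whose values exceed the tolerance.
print(first_divergence(fp32_outputs, int8_outputs))
```

With the toy data above, the first tensor reported is `relu1`, because `conv1` matches within tolerance and `fc` comes later in execution order.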
Hi @lix19937 thanks for your response!
So is this CLI option supposed to skip all the optimizations TRT performs?
Also, I think --best is not the correct option. I tried using --int8 instead, but I got this error:
[E] 2: Assertion static_cast<size_t>(c) < mSet.size() failed.
[E] 2: [cgraph.h::assertIsValidSubscript::161] Error Code 2: Internal Error (Assertion static_cast<size_t>(c) < mSet.size() failed. )
The corresponding ONNX file builds successfully with the TRT Python API. I also ran polygraphy surgeon sanitize --fold-constants on the ONNX file before building it.
Thanks in advance!
Question
Because TRT performs so many optimizations, it is sometimes very hard to isolate the issue when we see a regression in model accuracy. I know we have the builder_optimization_level flag, but it seems to only control which kernels are used when executing the model. Is there more fine-grained control? For example, I want to prevent fusions, or prevent dead-code removal.
To give more context: in my specific use case, I want to isolate whether resolving Q/DQ nodes causes a regression in the model. What I would like to achieve is to enable only Q/DQ resolution and disable all other optimizations. Is this achievable?
Thanks in advance