pytorch / TensorRT

PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT
https://pytorch.org/TensorRT
BSD 3-Clause "New" or "Revised" License

✨[Feature] add working unified example INT8 quantization and compilation #2961

Open lebionick opened 3 months ago

lebionick commented 3 months ago

I would like to quantize my model to INT8 precision and then compile it using torch_tensorrt. Unfortunately, it is a transformer-based vision model, and the default way of doing this does not work.

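# Imports inferred from usage. DataLoaderCalibrator and CalibrationAlgo come from
# Torch-TensorRT's PTQ API; CalibDataset, create_model, create_inputs, and bench
# are helpers from the reporter's script and are not shown here.
import torch
import torch_tensorrt
from torch.utils import data

def test_torch_tensorrt_int_dynamic_compile():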
    jit_path = "effvit.jit.pt"
    max_batch_size = 4
    cd = CalibDataset()
    calib_dataloader = data.DataLoader(cd, batch_size=max_batch_size, shuffle=False, drop_last=True)
    print("Tracing...")
    model = create_model()
    with torch.no_grad():
        jit_model = torch.jit.trace(model, torch.empty([1, 3, 512, 512], dtype=torch.float).to("cuda"))
        torch.jit.save(jit_model, jit_path)
    baseline_model = torch.jit.load(jit_path).eval()
    calibrator = DataLoaderCalibrator(
        calib_dataloader,
        algo_type=CalibrationAlgo.ENTROPY_CALIBRATION_2,
        device=torch.device("cuda:0"),
        use_cache=False,
    )
    print("Compiling...")
    opt_model = torch_tensorrt.compile(
        baseline_model,
        inputs=[torch_tensorrt.Input([max_batch_size, 3, 512, 512])],
        enabled_precisions={torch.int8, torch.float32, torch.float16},
        truncate_long_and_double=True,
        calibrator=calibrator,
    )
    opt_model(create_inputs(max_batch_size))
    print("Benchmark dynamic INT8 torch_tensorrt Compile")
    bench(opt_model, create_inputs(max_batch_size))
    torch.jit.save(opt_model, "backbone_dyn.ts")

This produces the following output:

WARNING:root:Given dtype that does not have direct mapping to torch (dtype.unknown), defaulting to torch.float
WARNING:torch_tensorrt._compile:Input is a torchscript module but the ir was not specified (default=dynamo), please set ir=torchscript to suppress the warning.
WARNING:root:Given dtype that does not have direct mapping to torch (dtype.unknown), defaulting to torch.float
WARNING: [Torch-TensorRT] - Truncating weight (constant in the graph) from Float64 to Float32
WARNING: [Torch-TensorRT] - Truncating weight (constant in the graph) from Float64 to Float32
WARNING: [Torch-TensorRT] - Truncating weight (constant in the graph) from Float64 to Float32
WARNING: [Torch-TensorRT] - Truncating weight (constant in the graph) from Float64 to Float32
WARNING: [Torch-TensorRT] - Truncating weight (constant in the graph) from Float64 to Float32
WARNING: [Torch-TensorRT] - Truncating weight (constant in the graph) from Float64 to Float32
WARNING: [Torch-TensorRT TorchScript Conversion Context] - Heuristics has been ignored in this builder run. This feature is only supported on Ampere and beyond.
CalibDataset is in use!
WARNING: [Torch-TensorRT] - Detected this engine is being instantitated in a multi-GPU system with multi-device safe mode disabled. For more on the implications of this as well as workarounds, see the linked documentation (https://pytorch.org/TensorRT/user_guide/runtime.html#multi-device-safe-mode)
WARNING: [Torch-TensorRT TorchScript Conversion Context] - Heuristics has been ignored in this builder run. This feature is only supported on Ampere and beyond.
ERROR: [Torch-TensorRT TorchScript Conversion Context] - 4: [standardEngineBuilder.cpp::initCalibrationParams::1945] Error Code 4: Internal Error (Calibration failure occurred with no scaling factors detected. This could be due to no int8 calibrator or insufficient custom scales for network layers. Please see int8 sample to setup calibration correctly.)
Traceback (most recent call last):
  File "/rep/ros2/src/perception/image_segmenter/image_segmenter/convert_torch_tensorrt.py", line 303, in <module>
    test_torch_tensorrt_int_dynamic_compile()
  File "/rep/ros2/src/perception/image_segmenter/image_segmenter/convert_torch_tensorrt.py", line 199, in test_torch_tensorrt_int_dynamic_compile
    opt_model = torch_tensorrt.compile(
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch_tensorrt/_compile.py", line 208, in compile
    compiled_ts_module: torch.jit.ScriptModule = torchscript_compile(
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch_tensorrt/ts/_compiler.py", line 151, in compile
    compiled_cpp_mod = _C.compile_graph(module._c, _parse_compile_spec(spec))
RuntimeError: [Error thrown at core/conversion/conversionctx/ConversionCtx.cpp:169] Building serialized network failed in TensorRT

I also tried passing the plain torch model to the compile method, which produces the following output:

WARNING:root:Given dtype that does not have direct mapping to torch (dtype.unknown), defaulting to torch.float
INFO:torch_tensorrt._compile:ir was set to default, using dynamo frontend
INFO:torch_tensorrt.dynamo._compiler:Compilation Settings: CompilationSettings(enabled_precisions={<dtype.i8: 3>, <dtype.f16: 6>, <dtype.f32: 7>}, debug=False, workspace_size=0, min_block_size=5, torch_executed_ops=set(), pass_through_build_failures=False, max_aux_streams=None, version_compatible=False, optimization_level=None, use_python_runtime=False, truncate_double=True, use_fast_partitioner=True, enable_experimental_decompositions=False, device=Device(type=DeviceType.GPU, gpu_id=0), require_full_compilation=False, disable_tf32=False, assume_dynamic_shape_support=False, sparse_weights=False, refit=False, engine_capability=<EngineCapability.STANDARD: 1>, num_avg_timing_iters=1, dla_sram_size=1048576, dla_local_dram_size=1073741824, dla_global_dram_size=536870912, dryrun=False, hardware_compatible=False)

WARNING:root:Given dtype that does not have direct mapping to torch (dtype.unknown), defaulting to torch.float
INFO:torch_tensorrt [TensorRT Conversion Context]:[MemUsageChange] Init CUDA: CPU +1, GPU +0, now: CPU 703, GPU 1750 (MiB)
INFO:torch_tensorrt [TensorRT Conversion Context]:[MemUsageChange] Init builder kernel library: CPU +945, GPU +178, now: CPU 1784, GPU 1928 (MiB)
INFO:torch_tensorrt.dynamo.conversion._TRTInterpreter:TRT INetwork construction elapsed time: 0:00:32.841368
INFO:torch_tensorrt [TensorRT Conversion Context]:BuilderFlag::kTF32 is set but hardware does not support TF32. Disabling TF32.
WARNING:torch_tensorrt [TensorRT Conversion Context]:Calibrator is not being used. Users must provide dynamic range for all tensors that are not Int32 or Bool.
ERROR:torch_tensorrt [TensorRT Conversion Context]:4: [standardEngineBuilder.cpp::initCalibrationParams::1945] Error Code 4: Internal Error (Calibration failure occurred with no scaling factors detected. This could be due to no int8 calibrator or insufficient custom scales for network layers. Please see int8 sample to setup calibration correctly.)
Traceback (most recent call last):
  File "/rep/ros2/src/perception/image_segmenter/image_segmenter/convert_torch_tensorrt.py", line 303, in <module>
    test_torch_tensorrt_int_dynamic_compile()
  File "/rep/ros2/src/perception/image_segmenter/image_segmenter/convert_torch_tensorrt.py", line 199, in test_torch_tensorrt_int_dynamic_compile
    opt_model = torch_tensorrt.compile(
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch_tensorrt/_compile.py", line 249, in compile
    trt_graph_module = dynamo_compile(
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch_tensorrt/dynamo/_compiler.py", line 227, in compile
    trt_gm = compile_module(gm, inputs, settings)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch_tensorrt/dynamo/_compiler.py", line 412, in compile_module
    trt_module = convert_module(
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch_tensorrt/dynamo/conversion/_conversion.py", line 106, in convert_module
    interpreter_result = interpret_module_to_result(module, inputs, settings)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch_tensorrt/dynamo/conversion/_conversion.py", line 87, in interpret_module_to_result
    interpreter_result = interpreter.run()
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch_tensorrt/dynamo/conversion/_TRTInterpreter.py", line 323, in run
    assert serialized_engine
AssertionError
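
For context, the "no scaling factors detected" message in both logs refers to the per-tensor scales that INT8 calibration is meant to produce. The sketch below shows what such a scale is, using plain symmetric max calibration in pure Python. It is only an illustration: it is not the ENTROPY_CALIBRATION_2 algorithm used in the script above, it does not call TensorRT, and the function names (`max_calibration_scale`, `quantize`, `dequantize`) are hypothetical.

```python
from typing import Iterable, List

INT8_MAX = 127  # symmetric INT8 range is [-127, 127]


def max_calibration_scale(batches: Iterable[Iterable[float]]) -> float:
    """Derive one per-tensor scale from the largest absolute value observed."""
    amax = 0.0
    for batch in batches:
        for value in batch:
            amax = max(amax, abs(value))
    if amax == 0.0:
        # Without any observed data there is nothing to derive a scale from.
        raise ValueError("no non-zero values observed; cannot derive a scale")
    return amax / INT8_MAX


def quantize(values: Iterable[float], scale: float) -> List[int]:
    """Map floats to INT8 codes: divide by the scale, round, then clamp."""
    return [max(-INT8_MAX, min(INT8_MAX, round(v / scale))) for v in values]


def dequantize(codes: Iterable[int], scale: float) -> List[float]:
    """Map INT8 codes back to approximate float values."""
    return [c * scale for c in codes]


# Two made-up "calibration batches" of activations.
calibration_batches = [[0.5, -1.27, 0.02], [1.0, -0.3, 0.64]]
scale = max_calibration_scale(calibration_batches)

# 2.0 lies outside the calibrated range, so it saturates at the INT8 limit.
codes = quantize([0.5, -1.27, 2.0], scale)
restored = dequantize(codes, scale)
print(scale, codes, restored)
```

If no calibration data reaches the builder, there is no equivalent of `scale` for any tensor, which is consistent with the error text above.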

After this, I found the torch_tensorrt.dynamo API, but it seems to have no INT8 support. I took the current https://github.com/pytorch/TensorRT/blob/main/examples/dynamo/vgg16_fp8_ptq.py example and changed FP8 to INT8 (because, according to the error, my 2080 Ti does not support FP8), and torch.export breaks:

Traceback (most recent call last):
  File "/rep/vgg_int8_ptq.py", line 230, in <module>
    exp_program = torch.export.export(model, (input_tensor,))
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/export/__init__.py", line 174, in export
    return _export(
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/export/_trace.py", line 635, in wrapper
    raise e
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/export/_trace.py", line 618, in wrapper
    ep = fn(*args, **kwargs)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/export/exported_program.py", line 83, in wrapper
    return fn(*args, **kwargs)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/export/_trace.py", line 860, in _export
    gm_torch_level = _export_to_torch_ir(
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/export/_trace.py", line 347, in _export_to_torch_ir
    gm_torch_level, _ = torch._dynamo.export(
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 1311, in inner
    result_traced = opt_f(*args, **kwargs)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 451, in _fn
    return fn(*args, **kwargs)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 921, in catch_errors
    return callback(frame, cache_entry, hooks, frame_state, skip=1)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 400, in _convert_frame_assert
    return _compile(
  File "/usr/lib/python3.10/contextlib.py", line 79, in inner
    return func(*args, **kwds)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 676, in _compile
    guarded_code = compile_inner(code, one_graph, hooks, transform)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/utils.py", line 262, in time_wrapper
    r = func(*args, **kwargs)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 535, in compile_inner
    out_code = transform_code_object(code, transform)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/bytecode_transformation.py", line 1036, in transform_code_object
    transformations(instructions, code_options)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 165, in _fn
    return fn(*args, **kwargs)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 500, in transform
    tracer.run()
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2149, in run
    super().run()
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 810, in run
    and self.step()
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 773, in step
    getattr(self, inst.opname)(inst)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 489, in wrapper
    return inner_fn(self, inst)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 1219, in CALL_FUNCTION
    self.call_function(fn, args, {})
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 674, in call_function
    self.push(fn.call_function(self, args, kwargs))
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/variables/nn_module.py", line 272, in call_function
    tx.call_function(
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 674, in call_function
    self.push(fn.call_function(self, args, kwargs))
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/variables/nn_module.py", line 336, in call_function
    return tx.inline_user_function_return(
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 680, in inline_user_function_return
    return InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2285, in inline_call
    return cls.inline_call_(parent, func, args, kwargs)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2399, in inline_call_
    tracer.run()
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 810, in run
    and self.step()
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 773, in step
    getattr(self, inst.opname)(inst)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 489, in wrapper
    return inner_fn(self, inst)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 1260, in CALL_FUNCTION_EX
    self.call_function(fn, argsvars.items, kwargsvars)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 674, in call_function
    self.push(fn.call_function(self, args, kwargs))
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 335, in call_function
    return super().call_function(tx, args, kwargs)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 289, in call_function
    return super().call_function(tx, args, kwargs)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 90, in call_function
    return tx.inline_user_function_return(
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 680, in inline_user_function_return
    return InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2285, in inline_call
    return cls.inline_call_(parent, func, args, kwargs)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2399, in inline_call_
    tracer.run()
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 810, in run
    and self.step()
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 773, in step
    getattr(self, inst.opname)(inst)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 489, in wrapper
    return inner_fn(self, inst)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 1260, in CALL_FUNCTION_EX
    self.call_function(fn, argsvars.items, kwargsvars)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 674, in call_function
    self.push(fn.call_function(self, args, kwargs))
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/variables/misc.py", line 562, in call_function
    return self.obj.call_method(tx, self.name, args, kwargs)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/variables/misc.py", line 142, in call_method
    ).call_function(tx, [self.objvar] + args, kwargs)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 289, in call_function
    return super().call_function(tx, args, kwargs)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 90, in call_function
    return tx.inline_user_function_return(
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 680, in inline_user_function_return
    return InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2285, in inline_call
    return cls.inline_call_(parent, func, args, kwargs)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2399, in inline_call_
    tracer.run()
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 810, in run
    and self.step()
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 773, in step
    getattr(self, inst.opname)(inst)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 489, in wrapper
    return inner_fn(self, inst)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 1219, in CALL_FUNCTION
    self.call_function(fn, args, {})
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 674, in call_function
    self.push(fn.call_function(self, args, kwargs))
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/variables/nn_module.py", line 336, in call_function
    return tx.inline_user_function_return(
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 680, in inline_user_function_return
    return InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2285, in inline_call
    return cls.inline_call_(parent, func, args, kwargs)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2399, in inline_call_
    tracer.run()
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 810, in run
    and self.step()
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 773, in step
    getattr(self, inst.opname)(inst)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 489, in wrapper
    return inner_fn(self, inst)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 1260, in CALL_FUNCTION_EX
    self.call_function(fn, argsvars.items, kwargsvars)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 674, in call_function
    self.push(fn.call_function(self, args, kwargs))
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 335, in call_function
    return super().call_function(tx, args, kwargs)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 289, in call_function
    return super().call_function(tx, args, kwargs)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 90, in call_function
    return tx.inline_user_function_return(
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 680, in inline_user_function_return
    return InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2285, in inline_call
    return cls.inline_call_(parent, func, args, kwargs)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2399, in inline_call_
    tracer.run()
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 810, in run
    and self.step()
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 773, in step
    getattr(self, inst.opname)(inst)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 489, in wrapper
    return inner_fn(self, inst)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 1219, in CALL_FUNCTION
    self.call_function(fn, args, {})
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 674, in call_function
    self.push(fn.call_function(self, args, kwargs))
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 335, in call_function
    return super().call_function(tx, args, kwargs)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 289, in call_function
    return super().call_function(tx, args, kwargs)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 90, in call_function
    return tx.inline_user_function_return(
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 680, in inline_user_function_return
    return InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2285, in inline_call
    return cls.inline_call_(parent, func, args, kwargs)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2399, in inline_call_
    tracer.run()
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 810, in run
    and self.step()
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 773, in step
    getattr(self, inst.opname)(inst)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 489, in wrapper
    return inner_fn(self, inst)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 1219, in CALL_FUNCTION
    self.call_function(fn, args, {})
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 674, in call_function
    self.push(fn.call_function(self, args, kwargs))
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/variables/misc.py", line 562, in call_function
    return self.obj.call_method(tx, self.name, args, kwargs)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/variables/misc.py", line 420, in call_method
    return self.call_apply(tx, args, kwargs)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/variables/misc.py", line 379, in call_apply
    return variables.UserFunctionVariable(fn, source=source).call_function(
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 289, in call_function
    return super().call_function(tx, args, kwargs)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 90, in call_function
    return tx.inline_user_function_return(
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 680, in inline_user_function_return
    return InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2285, in inline_call
    return cls.inline_call_(parent, func, args, kwargs)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2399, in inline_call_
    tracer.run()
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 810, in run
    and self.step()
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 773, in step
    getattr(self, inst.opname)(inst)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 489, in wrapper
    return inner_fn(self, inst)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 1219, in CALL_FUNCTION
    self.call_function(fn, args, {})
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 674, in call_function
    self.push(fn.call_function(self, args, kwargs))
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 289, in call_function
    return super().call_function(tx, args, kwargs)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 90, in call_function
    return tx.inline_user_function_return(
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 680, in inline_user_function_return
    return InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2285, in inline_call
    return cls.inline_call_(parent, func, args, kwargs)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2399, in inline_call_
    tracer.run()
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 810, in run
    and self.step()
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 773, in step
    getattr(self, inst.opname)(inst)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 242, in impl
    self.push(fn_var.call_function(self, self.popn(nargs), {}))
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/variables/builtin.py", line 679, in call_function
    res = binop_handler(tx, args[0], args[1], {})
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/variables/builtin.py", line 300, in user_defined_handler
    return a.call_method(tx, forward_name, [b], {})
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/variables/user_defined.py", line 576, in call_method
    return UserMethodVariable(method, self, source=source).call_function(
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 335, in call_function
    return super().call_function(tx, args, kwargs)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 289, in call_function
    return super().call_function(tx, args, kwargs)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 90, in call_function
    return tx.inline_user_function_return(
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 680, in inline_user_function_return
    return InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2285, in inline_call
    return cls.inline_call_(parent, func, args, kwargs)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2399, in inline_call_
    tracer.run()
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 810, in run
    and self.step()
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 773, in step
    getattr(self, inst.opname)(inst)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 489, in wrapper
    return inner_fn(self, inst)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 1219, in CALL_FUNCTION
    self.call_function(fn, args, {})
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 674, in call_function
    self.push(fn.call_function(self, args, kwargs))
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/variables/misc.py", line 562, in call_function
    return self.obj.call_method(tx, self.name, args, kwargs)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/variables/user_defined.py", line 576, in call_method
    return UserMethodVariable(method, self, source=source).call_function(
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 335, in call_function
    return super().call_function(tx, args, kwargs)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 289, in call_function
    return super().call_function(tx, args, kwargs)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 90, in call_function
    return tx.inline_user_function_return(
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 680, in inline_user_function_return
    return InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2285, in inline_call
    return cls.inline_call_(parent, func, args, kwargs)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2399, in inline_call_
    tracer.run()
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 810, in run
    and self.step()
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 773, in step
    getattr(self, inst.opname)(inst)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 489, in wrapper
    return inner_fn(self, inst)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 1219, in CALL_FUNCTION
    self.call_function(fn, args, {})
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 674, in call_function
    self.push(fn.call_function(self, args, kwargs))
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/variables/misc.py", line 562, in call_function
    return self.obj.call_method(tx, self.name, args, kwargs)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/variables/user_defined.py", line 584, in call_method
    return super().call_method(tx, name, args, kwargs)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/variables/base.py", line 368, in call_method
    raise unimplemented(f"call_method {self} {name} {args} {kwargs}")
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/_dynamo/exc.py", line 190, in unimplemented
    raise Unsupported(msg)
torch._dynamo.exc.Unsupported: call_method UserDefinedObjectVariable(PosixPath) _parse_args [TupleVariable()] {}

from user code:
   File "/rep/vgg_int8_ptq.py", line 71, in forward
    x = self.features(x)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/nikolay/.local/lib/python3.10/site-packages/modelopt/torch/quantization/nn/modules/quant_module.py", line 83, in forward
    return super().forward(input, *args, **kwargs)
  File "/home/nikolay/.local/lib/python3.10/site-packages/modelopt/torch/quantization/nn/modules/quant_module.py", line 39, in forward
    input = self.input_quantizer(input)
  File "/home/nikolay/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/nikolay/.local/lib/python3.10/site-packages/modelopt/torch/quantization/nn/modules/tensor_quantizer.py", line 639, in forward
    outputs = self._quant_forward(inputs)
  File "/home/nikolay/.local/lib/python3.10/site-packages/modelopt/torch/quantization/nn/modules/tensor_quantizer.py", line 419, in _quant_forward
    outputs = fake_tensor_quant(
  File "/home/nikolay/.local/lib/python3.10/site-packages/modelopt/torch/quantization/tensor_quant.py", line 445, in forward
    cuda_ext = get_cuda_ext()
  File "/home/nikolay/.local/lib/python3.10/site-packages/modelopt/torch/quantization/extensions.py", line 34, in get_cuda_ext
    sources=[path / "src/tensor_quant.cpp", path / "src/tensor_quant_gpu.cu"],
  File "/usr/lib/python3.10/pathlib.py", line 855, in __truediv__
    return self._make_child((key,))
  File "/usr/lib/python3.10/pathlib.py", line 616, in _make_child
    drv, root, parts = self._parse_args(args)

Set TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information

I would like to see a recommended, complete example of quantizing a model and converting it to INT8 precision with torch_tensorrt.

I am using:

CUDA 12.1
tensorrt==10.0.1
tensorrt-cu12==10.0.1
tensorrt-cu12-bindings==10.0.1
tensorrt-cu12-libs==10.0.1
torch_tensorrt==2.3.0
torch==2.3
nvidia-modelopt==0.13.0
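
A version list like the one above can be collected programmatically, which helps keep bug reports accurate. The following is a minimal sketch using only the standard library; the helper name `installed_versions` is hypothetical, and the package names are simply the ones mentioned in this report. It covers pip-installed distributions only, so the CUDA toolkit version still has to be checked separately.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional

# Distribution names taken from the environment listed in this report.
PACKAGES = (
    "torch",
    "torch_tensorrt",
    "tensorrt",
    "tensorrt-cu12",
    "tensorrt-cu12-bindings",
    "tensorrt-cu12-libs",
    "nvidia-modelopt",
)


def installed_versions(names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Return {distribution name: version string, or None if not installed}."""
    versions: Dict[str, Optional[str]] = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name}=={version}" if version else f"{name}: not installed")
```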

peri044 commented 3 months ago

Hello @lebionick. In the example you've provided, it looks like you're using a TorchScript model with our dynamo backend. We currently do not support INT8 in dynamo and plan to work on this feature in the coming weeks. The workflow will be very similar to the FP8 workflow you've tried, but it involves more changes than just the dtypes.