Open YixuanSeanZhou opened 2 weeks ago
For a repro, I believe this should error out on any quantization flow if you use a layer with config:
```python
{'num_bits': 8, 'axis': None, 'learn_amax': False, "calibrator": "histogram"}
```
histogram is not supported as a calibration algorithm iiuc, @realAsma can you check if the calibrator field options look correct.
@riyadshairi979 Thank you so much for your response:
> histogram is not supported as a calibration algorithm iiuc, @realAsma can you check if the calibrator field options look correct.
This is unexpected. Based on the ModelOpt documentation, histogram is listed as a calibration option. The same calibration options seem to exist in the original pytorch-quantization tooling from TRT. My assumption is that ModelOpt is a more comprehensive tool than its predecessor. Did I miss something 🤔
How did you define the quantization config, share the code snippet please.
This is my config; the reason I want to use histogram is to avoid outliers inflating the amax computed from the calibration examples.
```python
qconfig = deepcopy(mtq.INT8_DEFAULT_CFG)
qconfig["quant_cfg"]["heads_3.*.inputs_quantizer*"] = {'num_bits': 8, 'axis': None, 'learn_amax': False, "calibrator": "histogram"}
```
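As a side note on the motivation: with max calibration a single outlier sets amax, while a percentile over the observed magnitudes ignores the tail. A minimal NumPy sketch to illustrate the difference (illustrative only, not ModelOpt code):

```python
import numpy as np

# Simulated activation magnitudes: mostly moderate values plus a few outliers.
rng = np.random.default_rng(0)
acts = np.abs(rng.normal(0.0, 1.0, size=10_000))
acts[:5] = 50.0  # inject a handful of extreme outliers

amax_max = acts.max()                  # "max" calibration: dominated by the outliers
amax_p999 = np.percentile(acts, 99.9)  # percentile: tracks the bulk of the distribution

print(amax_max, amax_p999)
```

With max calibration nearly the whole INT8 range would be wasted on the outlier tail, which is exactly what the histogram calibrator is meant to avoid.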
What is your modelopt version?
Upgraded to the latest, 0.15.1, but I'm still having the same issue.
Thanks! Looking forward to your response!
Hi @YixuanSeanZhou ,
Your usage is correct. There is a bug in ModelOpt when the histogram calibrator is used.
> Also, could you please provide some examples where histogram is being used as the calibrator? It also has 3 ways to calculate amax, but I wonder how that is specified.
I am sorry, I do not have any examples currently using the histogram calibrator.
For now, could you please modify your script to perform calibration manually instead of via mtq.quantize?
Here is an example:

```python
import torch

from modelopt.torch.quantization.model_calib import enable_stats_collection, finish_stats_collection
from modelopt.torch.quantization.model_quant import apply_mode

model = torch.nn.Linear(1024, 2048).cuda()

# An example config
config = {"quant_cfg": {"*": {"calibrator": "histogram"}}, "algorithm": "max"}

def calibrate_loop(model):
    # A method which simply forwards data through the model
    return model(torch.randn(1, 1024).cuda())

# config is the same config that was passed previously to mtq.quantize
model = apply_mode(model, mode=[("quantize", config)])
enable_stats_collection(model)
calibrate_loop(model)
finish_stats_collection(model, method="mse")

# Manually move the model to cuda after calibration
model.cuda()
print(model)

# Get simulation quantized output
output = calibrate_loop(model)

# Do ONNX export now
...
```
Thanks for your response @realAsma, and sorry for my delayed reply (I was away last week). This script looks helpful and promising.
I have two questions I want to follow up:
1. model_calib does not seem to be open sourced. By any chance could it be open sourced, or at least could documentation be provided for the methods in that file?
2. For finish_stats_collection, what are all the available methods? Will they be max, entropy, percentile, and mse? If using percentile, how do I specify which percentile to use?

Thanks so much!
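On the percentile question: I'm not sure how ModelOpt exposes the percentile value, but conceptually a histogram calibrator can derive amax from the cumulative distribution of the collected magnitudes. A hypothetical sketch of that idea (the function name and logic here are my own, not ModelOpt's API):

```python
import numpy as np

def amax_from_histogram(values, percentile=99.9, num_bins=2048):
    # Hypothetical: pick amax as the right edge of the first histogram bin
    # whose cumulative mass reaches the requested percentile.
    mags = np.abs(np.asarray(values))
    counts, edges = np.histogram(mags, bins=num_bins)
    cdf = np.cumsum(counts) / counts.sum()
    idx = np.searchsorted(cdf, percentile / 100.0)
    return edges[min(idx + 1, num_bins)]

rng = np.random.default_rng(0)
data = rng.normal(0.0, 1.0, size=100_000)
print(amax_from_histogram(data, percentile=99.9))  # near the 99.9th pct of |N(0,1)|
```

If ModelOpt follows a similar scheme, the percentile would presumably be a parameter of the calibrator or of finish_stats_collection, but that is exactly what I'd like confirmed.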
Hi,
When running mtq.quantize with "calibrator": "histogram" in my config, I got the following assert error. Tracing a bit higher, the assert error comes from:

Could you please take a look at what I did wrong?
Also, could you please provide some examples where histogram is being used as the calibrator? It also has 3 ways to calculate amax, but I wonder how that is specified.
Thanks in advance,