Adding more information: the loss does not seem to converge well, as shown below.
Same problem. What's the 4-bit accuracy of your model with advanced PTQ (aptq)?
When I use the AdaRound and QDrop quantization methods for 4-bit quantization, I find that the accuracy is not as good as with plain MSE calibration. The data and the model are the same; the only difference is that in the code below, QDrop takes the advanced_ptq branch while MSE takes the naive_ptq branch. I want to know if there is a problem with my parameter configuration. Thanks.
```python
import argparse

import torch

# parse_config, seed_all, load_calibrate_data, get_quantize_model,
# ptq_reconstruction and deploy come from the MQBench PTQ example;
# dla, load_eye_data and evaluate_eye are from my own project.
if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='ImageNet Solver')
    parser.add_argument('--config', required=True, type=str)
    args = parser.parse_args()
    config = parse_config(args.config)
    # seed first
    seed_all(config.process.seed)

    # load model
    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    checkpoint = torch.load(config.model.path, map_location="cpu")
    model = dla.dla34()
    model = model.to(device)
    model.load_state_dict(checkpoint['state_dict'])
    # model = load_model(config.model)
    if hasattr(config, 'quantize'):
        model = get_quantize_model(model, config)
    model.to(device)

    # load data
    train_loader, val_loader = load_eye_data(config.data.path, device)

    # evaluate
    if not hasattr(config, 'quantize'):
        evaluate(val_loader, model)
    elif config.quantize.quantize_type == 'advanced_ptq':
        print('begin calibration now!')
        cali_data = load_calibrate_data(train_loader,
                                        cali_batchsize=config.quantize.cali_batchsize)
        from mqbench.utils.state import enable_quantization, enable_calibration_woquantization
        # do activation and weight calibration separately,
        # with quick MSE per-channel for the weight one
        model.eval()
        with torch.no_grad():
            enable_calibration_woquantization(model, quantizer_type='act_fake_quant')
            for batch in cali_data:
                model(batch.to(device))
            enable_calibration_woquantization(model, quantizer_type='weight_fake_quant')
            model(cali_data[0].to(device))
        print('begin advanced PTQ now!')
        if hasattr(config.quantize, 'reconstruction'):
            model = ptq_reconstruction(model, cali_data, config.quantize.reconstruction)
        enable_quantization(model)
        evaluate_eye(config.data.path, model)
        if hasattr(config.quantize, 'deploy'):
            deploy(model, config)
    elif config.quantize.quantize_type == 'naive_ptq':
        print('begin calibration now!')
        cali_data = load_calibrate_data(train_loader,
                                        cali_batchsize=config.quantize.cali_batchsize)
        from mqbench.utils.state import enable_quantization, enable_calibration_woquantization
        # do activation and weight calibration separately,
        # with quick MSE per-channel for the weight one
        model.eval()
        enable_calibration_woquantization(model, quantizer_type='act_fake_quant')
        for batch in cali_data:
            # inference: calculate activation scale and zero point from float32 [min, max]
            model(batch.to(device))
        enable_calibration_woquantization(model, quantizer_type='weight_fake_quant')
        # calculate weight scales and zero points from float32 [min, max]
        model(cali_data[0].to(device))
        print('begin quantization now!')
        enable_quantization(model)
        evaluate_eye(config.data.path, model)
        if hasattr(config.quantize, 'deploy'):
            deploy(model, config)
    else:
        print("The quantize_type must be 'naive_ptq' or 'advanced_ptq',")
        print("and 'advanced_ptq' needs a reconstruction configuration.")
```
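For clarity, this is how the script above is invoked for the two branches (a sketch; the script and config filenames are placeholders, since they are not given in the issue):

```bash
# advanced_ptq run (AdaRound/QDrop config)
python ptq_solver.py --config qdrop_w4a4.yaml

# naive_ptq run (plain MSE config)
python ptq_solver.py --config mse_w4a4.yaml
```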
In the meantime, their YAML files are shown below. QDrop:
```yaml
extra_prepare_dict:
  extra_qconfig_dict:
    w_observer: MSEObserver
    a_observer: EMAMSEObserver
    w_fakequantize: AdaRoundFakeQuantize
    a_fakequantize: QDropFakeQuantize
    w_qscheme:
      bit: 4
      symmetry: True
      per_channel: True
      pot_scale: True
      p: 2.4
    a_qscheme:
      bit: 4
      symmetry: True
      per_channel: True
      pot_scale: True
      p: 2.4
quantize:
  quantize_type: advanced_ptq  # support naive_ptq or advanced_ptq
  cali_batchsize: 16
  reconstruction:
    pattern: block
    scale_lr: 4.0e-5
    warm_up: 0.2
    weight: 0.01
    max_count: 20000
    b_range: [20, 2]
    keep_gpu: False
    round_mode: learned_hard_sigmoid
    prob: 0.5
model:  # architecture details
  type: eye_tracking  # model name
  kwargs:
    num_classes: 1000
  path: D:\research\MQBench\application\model\model\model_best_0607.pth.tar
data:
  path: D:\research\MQBench\dataset\data
  batch_size: 64
  num_workers: 4
  pin_memory: True
  input_size: 224
  test_resize: 256
process:
  seed: 1005
```
MSE:
```yaml
extra_prepare_dict:
  extra_qconfig_dict:
    w_observer: MSEObserver
    a_observer: EMAMSEObserver
    w_fakequantize: FixedFakeQuantize
    a_fakequantize: FixedFakeQuantize
    w_qscheme:
      bit: 4
      symmetry: True
      per_channel: True
      pot_scale: True
    a_qscheme:
      bit: 4
      symmetry: True
      per_channel: True
      pot_scale: True
quantize:
  quantize_type: naive_ptq  # support naive_ptq or advanced_ptq
  cali_batchsize: 16
  deploy:
    output_path: D:\research\MQBench\application\model\eye_tracking
    model_name: 'eye_naive_mse_4_8'
    deploy_to_qlinear: False
model:  # architecture details
  type: eye_tracking  # model name
  kwargs:
    num_classes: 1000
  path: D:\research\MQBench\application\model\model\model_best_0607.pth.tar
data:
  path: D:\research\MQBench\dataset\data
  batch_size: 64
  num_workers: 4
  pin_memory: True
  input_size: 224
  test_resize: 256
process:
  seed: 1005
```
About 64% for AdaRound and QDrop, but MSE can reach 78%.
Which model?
Our custom model.
First, there is something wrong with your configs, both the advanced PTQ one and the naive one. Regarding the activation config (a_qscheme): we cannot use per-channel quantization for activations, as it prohibits low-bit inference acceleration; please set it to False. Also, pot_scale (power-of-two scale) is not a usual setting; it turns on non-uniform quantization. Please consider the config files in application/imagenet_example/PTQ/configs/qdrop. Thank you.
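For reference, a minimal sketch of the suggested activation scheme (an illustration based on the advice above, not a config taken from the repo; the surrounding keys stay unchanged):

```yaml
a_qscheme:
  bit: 4
  symmetry: True
  per_channel: False  # per-tensor quantization for activations
  pot_scale: False    # ordinary uniform scales instead of power-of-two
```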
As you suggested, I set per_channel to False in a_qscheme, and the result got even worse, only 44%.
This issue has not received any updates in 120 days. Please reply to this issue if it is still unresolved!