Zhen-Dong / HAWQ

Quantization library for PyTorch. Support low-precision and mixed-precision quantization, with hardware implementation through TVM.
MIT License

Fail to Run TVM tests #18

Open sqPoseidon opened 2 years ago

sqPoseidon commented 2 years ago

Dear authors,

I tried to run the TVM test hawq_utils_resnet50.py, but it failed because the provided .pth files (https://drive.google.com/file/d/1Ldo51ZPx6_2Eq60JgbL6hdPdQf5WbRf9/view?usp=sharing) do not match the dictionaries in hawq_utils_resnet50.py. This issue has already been reported in https://github.com/Zhen-Dong/HAWQ/issues/10#issue-834706092. I have modified hawq_utils_resnet50.py to support resnet18. For consistency of experimental results, I would appreciate it if you could provide the quantized resnet50 model file.

In addition, the input_image_batch_1.npy file is not provided. I generated the image .npy file myself, but test_resnet_inference.py failed to run with it as well. The error message is below:

File "test_resnet_inference.py", line 75, in <module>
    input_image = np.trunc(input_image)

TypeError: type numpy.ndarray doesn't define __trunc__ method

I'm not sure what causes this problem.

Would you please provide these necessary files?

Thank you in advance.

zachzzc commented 2 years ago

Hi,

1) Can you share your error messages for the dictionary? We will check again and upload the correct version.

2) The input_image_batch_1.npy is here https://github.com/Zhen-Dong/HAWQ/tree/main/tvm_benchmark/models

3) The numpy error is probably caused by the numpy version. Which version of numpy are you using? As far as I can see, 1.2.1 should still have this function defined.
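To help narrow down point 3, a small diagnostic along these lines may be useful. It is only a sketch: `describe_npy` is a hypothetical helper, not part of this repo. It reports the installed numpy version, the dtype and shape of the saved array, and whether `np.trunc` accepts that array.

```python
import numpy as np


def describe_npy(path):
    """Report basic facts about a saved .npy file (hypothetical helper)."""
    # allow_pickle=True lets object-dtype files be inspected too;
    # only use it on files you generated yourself.
    arr = np.load(path, allow_pickle=True)
    info = {
        "numpy_version": np.__version__,
        "dtype": str(arr.dtype),
        "shape": arr.shape,
    }
    try:
        # Same call that fails in test_resnet_inference.py.
        np.trunc(arr)
        info["trunc_ok"] = True
    except TypeError as exc:
        info["trunc_ok"] = False
        info["error"] = str(exc)
    return info
```

Calling `describe_npy` on both the self-generated file and the provided input_image_batch_1.npy and comparing the two results should show whether the arrays differ in dtype or shape.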

sqPoseidon commented 2 years ago

Hi, thank you for the quick reply. Maybe I ran the wrong commands or used the wrong files. I would appreciate your help in figuring out the problems.

  1. I ran python hawq_utils_resnet50.py --model-dir ./data. I tested two models: ResNet50 W8A8 (https://drive.google.com/file/d/1Ldo51ZPx6_2Eq60JgbL6hdPdQf5WbRf9/view?usp=sharing) and ResNet50 W4A4 (https://drive.google.com/file/d/1DDis-8C-EupCRj-ExH58ldSv-tG2RXyf/view?usp=sharing). Both tests report the same error:
    File "hawq_utils_resnet50.py", line 479, in <module>
      model = torch.load(file_name)
    File "xxx/site-packages/torch/serialization.py", line 579, in load
      with _open_file_like(f, 'rb') as opened_file:
    File "xxx/site-packages/torch/serialization.py", line 230, in _open_file_like
      return _open_file(name_or_buffer, mode)
    File "xxx/site-packages/torch/serialization.py", line 211, in __init__
      super(_open_file, self).__init__(open(name, mode))
    FileNotFoundError: [Errno 2] No such file or directory: 'xxx/quantized_checkpoint.pth.tar'

    There is no quantized_checkpoint.pth.tar file in the provided download links. I then tried the ResNet18 model (https://drive.google.com/file/d/1CLAd3LhiRVYwiBZRuUJgrzrrPFfLvfWG/view?usp=sharing), but that model contains no qconfig modules. I'm not sure whether these items are important, or whether we can reproduce the results while ignoring them (https://github.com/Zhen-Dong/HAWQ/issues/10#issuecomment-808662788).

Or should I follow the guides to generate the quantized models myself?
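For reference, one way to check what a downloaded checkpoint actually contains is sketched below. `summarize_checkpoint` is a hypothetical helper, not part of this repo. It assumes the weights sit either under a "state_dict" key or directly at the top level of the loaded dict, and that any qconfig entries would show up with "qconfig" in their key names. Neither assumption is confirmed by the repo.

```python
def summarize_checkpoint(ckpt, needle="qconfig"):
    """Summarize a loaded checkpoint dict (hypothetical helper).

    Assumes the weights are stored under a "state_dict" key, or that
    ckpt itself is the state dict; the repo does not confirm either.
    """
    state = ckpt.get("state_dict", ckpt)
    keys = list(state.keys())
    return {
        "top_level_keys": list(ckpt.keys()),
        "num_entries": len(keys),
        # Keys whose names mention the search string, e.g. "qconfig".
        "matching_keys": [k for k in keys if needle in k],
    }


# Usage (requires PyTorch and a downloaded file; not executed here):
# import torch
# ckpt = torch.load("checkpoint.pth.tar", map_location="cpu")
# print(summarize_checkpoint(ckpt))
```

An empty "matching_keys" list for the ResNet18 file would be consistent with the missing qconfig modules described above.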

  1. My numpy version is 1.19.1. Do I need to downgrade numpy?

zachzzc commented 2 years ago

  1. I think the quantized_checkpoint file got renamed for some reason. You can extract the zipped file and rename checkpoint.pth.tar to quantized_checkpoint.pth.tar.

  2. I realized the problem may be caused by the image data rather than the numpy version. Can you try the image I provided? If the image.npy you generated has an integer dtype, it may cause an error.
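Both suggestions can be sketched roughly as follows. The helper names are hypothetical and not part of this repo. The sketch assumes checkpoint.pth.tar sits directly inside the directory passed via --model-dir after extraction, and that float32 is an acceptable dtype for the input image, which the repo does not confirm.

```python
import shutil
from pathlib import Path

import numpy as np


def ensure_quantized_checkpoint(model_dir):
    """Copy checkpoint.pth.tar to quantized_checkpoint.pth.tar if needed."""
    model_dir = Path(model_dir)
    src = model_dir / "checkpoint.pth.tar"
    dst = model_dir / "quantized_checkpoint.pth.tar"
    if dst.exists():
        return dst
    if not src.exists():
        raise FileNotFoundError(f"{src} not found; extract the archive first")
    # Copy rather than move so the original download stays untouched.
    shutil.copyfile(src, dst)
    return dst


def save_float_input(array, path):
    """Save an image batch as float32 (assumed dtype, not confirmed)."""
    arr = np.asarray(array, dtype=np.float32)
    np.save(path, arr)
    return arr
```

For example, `ensure_quantized_checkpoint("./data")` would match the --model-dir ./data command used above.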

mu94-csl commented 1 year ago

@zachzzc no, the quantized_checkpoint files are not present for most models in the zoo.

You have to extract the quantized weights yourself using torch (https://github.com/Zhen-Dong/HAWQ/issues/13#issuecomment-820224642).

For how to do this, please see the validate function in the quant train file