megvii-research / FQ-ViT

[IJCAI 2022] FQ-ViT: Post-Training Quantization for Fully Quantized Vision Transformer
Apache License 2.0
301 stars 48 forks

The file structure of COCO dataset #5

Closed yifu-ding closed 2 years ago

yifu-ding commented 2 years ago

Hi, I am wondering how to specify the data directory and the file structure of the COCO dataset for quantizing DeiT models. I tried to arrange it like the ImageNet dataset described in the README, but I get an error:

RuntimeError: Found 0 files in subfolders of: /xxx/xxx/coco/val
Supported extensions are: .jpg,.jpeg,.png,.ppm,.bmp,.pgm,.tif,.tiff,.webp

And if I add another level of folder inside the original one (./val -> ./val/1/), there is no error, but the run ends almost immediately and the model is not trained at all. It would be helpful if the authors could show the contents of the COCO dataset directory in a tree-like format. Many thanks!

Sincerely, Yifu

yifu-ding commented 2 years ago

The complete error log is:

Traceback (most recent call last):
  File "test_quant.py", line 272, in <module>
    main()
  File "test_quant.py", line 106, in main
    val_dataset = datasets.ImageFolder(valdir, val_transform)
  File "/xxx/anaconda3/envs/vit/lib/python3.7/site-packages/torchvision/datasets/folder.py", line 229, in __init__
    is_valid_file=is_valid_file)
  File "/xxx/anaconda3/envs/vit/lib/python3.7/site-packages/torchvision/datasets/folder.py", line 114, in __init__
    raise RuntimeError(msg)
RuntimeError: Found 0 files in subfolders of: /xxx/xxx/coco/val
Supported extensions are: .jpg,.jpeg,.png,.ppm,.bmp,.pgm,.tif,.tiff,.webp
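
As a side note (not from the thread's authors): torchvision's ImageFolder raises exactly this error when the root directory contains no class subdirectories, because it only collects images one level down, under root/<class_name>/. A stdlib-only sketch of the folder scan it performs, using a throwaway temp directory with hypothetical names:

```python
import os
import tempfile

# Extensions accepted by torchvision's ImageFolder (from the error message above)
IMG_EXTENSIONS = (".jpg", ".jpeg", ".png", ".ppm", ".bmp",
                  ".pgm", ".tif", ".tiff", ".webp")

def find_images(root):
    """Mimic ImageFolder's discovery: images must live in
    root/<class_name>/, never directly under root."""
    samples = []
    for cls in sorted(d for d in os.listdir(root)
                      if os.path.isdir(os.path.join(root, d))):
        for fname in sorted(os.listdir(os.path.join(root, cls))):
            if fname.lower().endswith(IMG_EXTENSIONS):
                samples.append((os.path.join(root, cls, fname), cls))
    return samples

root = tempfile.mkdtemp()

# A flat layout (images directly in root) is NOT picked up:
open(os.path.join(root, "direct.jpg"), "w").close()
print(len(find_images(root)))   # 0 -> "Found 0 files in subfolders"

# One subfolder per class fixes the discovery:
os.makedirs(os.path.join(root, "class_0"))
open(os.path.join(root, "class_0", "img.jpg"), "w").close()
print(len(find_images(root)))   # 1
```

This is why wrapping the images in ./val/1/ silences the error: the single subfolder is treated as one class, which is meaningless for COCO but syntactically valid for ImageFolder.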

linyang-zhh commented 2 years ago

Hi @yifu-ding Sorry, this repo only contains the code for the classification task on the ImageNet dataset.

As far as I know, COCO is usually regarded as a dataset for object detection, while DeiT is a backbone for classification, so I don't think they go well together.

Are you planning to implement post-training quantization on detectors with the COCO dataset? If so, I can provide further help.

yifu-ding commented 2 years ago

Thanks for your kind reply @linyang-zhh. Yes, I was trying to reproduce the experiment on COCO, and I see that the detector used in the paper is Mask R-CNN. Will you release the code for PTQ on Mask R-CNN with the COCO dataset?

linyang-zhh commented 2 years ago

@yifu-ding Sorry, we have no plan to release the code of detection at present.

However, I can give you some insights about the implementation. We use the official Swin Transformer code. First, we manually replace the full-precision layers (such as Conv2d, Linear, activation, LayerNorm, Attention, etc.) with their quantized versions (here). Second, we apply the calibration step (just a few forward passes) to that detector in the MMDetection framework, as here. After those steps, we obtain a calibrated detector that can be quantized by PTQ.
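
The calibration idea described above can be sketched in plain Python, independent of the FQ-ViT or MMDetection code (all names here are hypothetical): run a few calibration batches to record each layer's activation range, then derive uniform-quantization parameters from that range.

```python
class QuantObserver:
    """Toy min/max observer: records activation ranges over a few
    calibration forwards, then yields uniform-quantization params."""

    def __init__(self, n_bits=8):
        self.n_bits = n_bits
        self.min_val = float("inf")
        self.max_val = float("-inf")

    def observe(self, values):
        # Called once per calibration batch ("just a few forwards")
        self.min_val = min(self.min_val, min(values))
        self.max_val = max(self.max_val, max(values))

    def quant_params(self):
        qmax = 2 ** self.n_bits - 1
        scale = (self.max_val - self.min_val) / qmax or 1.0
        zero_point = round(-self.min_val / scale)
        return scale, zero_point

    def quantize(self, values):
        scale, zp = self.quant_params()
        qmax = 2 ** self.n_bits - 1
        # Uniform affine quantization, clamped to the integer range
        return [min(max(round(v / scale) + zp, 0), qmax) for v in values]

obs = QuantObserver()
# Feed a handful of calibration batches (stand-ins for detector activations)
for batch in ([0.0, 0.5, 1.0], [-0.2, 0.3], [0.9, 1.1]):
    obs.observe(batch)

print(obs.quantize([-0.2, 0.0, 1.1]))  # -> [0, 39, 255]
```

Real PTQ frameworks attach observers like this to each replaced layer (Conv2d, Linear, LayerNorm, ...) via forward hooks, which is what the calibration forwards in step two accomplish.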

yifu-ding commented 2 years ago

Thank you very much for the instructions. I will open another issue if I run into other problems.