TongkunGuan / CCD

[ICCV2023] Self-supervised Character-to-Character Distillation for Text Recognition
https://openaccess.thecvf.com/content/ICCV2023/papers/Guan_Self-Supervised_Character-to-Character_Distillation_for_Text_Recognition_ICCV_2023_paper.pdf

Runtime error in dino_vision.py file: shape '[16, 8, 32, 512]' is invalid for input of size 131072 #7

Closed truongpnx closed 9 months ago

truongpnx commented 9 months ago

Hello,

I am currently training your model (vit_base) on my own dataset, and this error came up: RuntimeError: shape '[16, 8, 32, 512]' is invalid for input of size 131072. The error is raised at line 56 of Dino/model/dino_vision.py.

The backbone_out shape is torch.Size([16, 16, 512])

region_f = backbone_out.reshape(N, 8, 32, E).permute(0, 3, 1, 2)

Can you please explain why 8 and 32 are used in this line?

The error appeared after I set patch_size=16 and use_fp16=True in the config file; if I keep patch_size=4, it runs normally.

I run with batch_size_per_gpu=16.

TongkunGuan commented 9 months ago

For patch_size=4, the line is:

region_f = backbone_out.reshape(N, image_h//4, image_w//4, E).permute(0, 3, 1, 2)

In our work, image.shape=(32, 128), which is where the 8 and 32 come from (32//4 and 128//4). For patch_size=16, you need to modify it to:

region_f = backbone_out.reshape(N, image_h//16, image_w//16, E).permute(0, 3, 1, 2)
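
The arithmetic behind this answer can be checked with a short sketch. It is not code from the repository: `patch_grid` is a hypothetical helper, and the sketch assumes the 32x128 input and the N=16, E=512 values reported in this thread, with no extra class token in `backbone_out`.

```python
def patch_grid(image_h, image_w, patch_size):
    """Return (grid_h, grid_w): the number of patches along each image axis.

    Hypothetical helper for illustration, not part of the CCD code base.
    """
    if image_h % patch_size or image_w % patch_size:
        raise ValueError("image size must be divisible by patch_size")
    return image_h // patch_size, image_w // patch_size


# Values taken from the thread: image.shape=(32, 128), batch 16, embedding 512.
IMAGE_H, IMAGE_W = 32, 128
N, E = 16, 512

grid_p4 = patch_grid(IMAGE_H, IMAGE_W, 4)    # the hardcoded 8, 32
grid_p16 = patch_grid(IMAGE_H, IMAGE_W, 16)  # what patch_size=16 produces

# Elements actually present in backbone_out of shape [16, 16, 512].
available = N * (grid_p16[0] * grid_p16[1]) * E

# Elements the hardcoded reshape(N, 8, 32, E) asks for.
requested = N * grid_p4[0] * grid_p4[1] * E

print(grid_p4, grid_p16, available, requested)
```

With patch_size=16 the backbone emits 2 x 8 = 16 tokens per image, so the tensor holds 131072 elements, while the hardcoded target shape needs 2097152. That mismatch is what the RuntimeError reports.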

truongpnx commented 9 months ago

Thank you so much, I fixed it.

Although I modified region_f = backbone_out.reshape(N, image_h//16, image_w//16, E).permute(0, 3, 1, 2), I still needed to change the shape in line 238, return x.reshape(x.shape[0], 8, 32, -1).permute(0, 3, 1, 2), in to_2D() for it to run normally.
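
One way to avoid editing every hardcoded 8, 32 is to infer the grid from the token count. The following is only a sketch under stated assumptions: `infer_grid` is a hypothetical helper, the default image size is the 32x128 mentioned above, patches are assumed square, and the token sequence is assumed to contain no class token.

```python
import math


def infer_grid(num_tokens, image_h=32, image_w=128):
    """Infer (grid_h, grid_w) from the number of patch tokens.

    Hypothetical helper: assumes square patches and that num_tokens counts
    patch tokens only, so num_tokens == (image_h / p) * (image_w / p).
    """
    total_pixels = image_h * image_w
    if num_tokens <= 0 or total_pixels % num_tokens:
        raise ValueError("token count does not tile the image evenly")
    patch_area = total_pixels // num_tokens
    patch_size = math.isqrt(patch_area)
    if patch_size * patch_size != patch_area:
        raise ValueError("token count does not correspond to square patches")
    if image_h % patch_size or image_w % patch_size:
        raise ValueError("image size must be divisible by the patch size")
    return image_h // patch_size, image_w // patch_size


# Inside a to_2D()-style function, x.shape[1] would be the token count, and
# the reshape could then read (untested against the repository):
#   grid_h, grid_w = infer_grid(x.shape[1])
#   x.reshape(x.shape[0], grid_h, grid_w, -1).permute(0, 3, 1, 2)
print(infer_grid(256), infer_grid(16))
```

Passing the grid in from the config would work just as well; inferring it only saves threading image_h, image_w, and patch_size through each call site.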