tobiasvanderwerff opened this issue 5 months ago
I think I found a decent solution. The interpolate_patch_14to16.py script can be modified in the following way:

1. Use the `module` key, not `model`, i.e. use `checkpoint['module']` instead of `checkpoint['model']`.
2. Cast `pos_tokens` to float32 before the bicubic interpolation and back to float16 afterwards:

```python
pos_tokens = pos_tokens.float()  # convert to float32 because float16 is not supported for bicubic interpolation
pos_tokens = torch.nn.functional.interpolate(
    pos_tokens, size=(new_size, new_size), mode='bicubic', align_corners=False)
pos_tokens = pos_tokens.half()  # convert back to float16
```
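For context, a minimal end-to-end sketch of what the modified conversion could look like is shown below. The file names, the target grid size, the single class token, and the exact state-dict key names are assumptions, so adjust them to your checkpoint rather than taking them from the script:

```python
import torch

checkpoint = torch.load('eva02_Ti_p14.pt', map_location='cpu')   # placeholder input path
state_dict = checkpoint['module']   # EVA-02 checkpoints store weights under 'module', not 'model'

pos_embed = state_dict['pos_embed']                  # [1, num_extra_tokens + num_patches, embed_dim]
embed_dim = pos_embed.shape[-1]
num_extra_tokens = 1                                 # assumed: a single class token
orig_size = int((pos_embed.shape[-2] - num_extra_tokens) ** 0.5)
new_size = 14                                        # assumed target grid (e.g. 224 px / patch 16); adjust as needed

extra_tokens = pos_embed[:, :num_extra_tokens]
pos_tokens = pos_embed[:, num_extra_tokens:]
pos_tokens = pos_tokens.reshape(-1, orig_size, orig_size, embed_dim).permute(0, 3, 1, 2)

pos_tokens = pos_tokens.float()                      # float16 is not supported for bicubic interpolation
pos_tokens = torch.nn.functional.interpolate(
    pos_tokens, size=(new_size, new_size), mode='bicubic', align_corners=False)
pos_tokens = pos_tokens.half()                       # convert back to float16

pos_tokens = pos_tokens.permute(0, 2, 3, 1).flatten(1, 2)
state_dict['pos_embed'] = torch.cat((extra_tokens, pos_tokens), dim=1)

torch.save(checkpoint, 'eva02_Ti_p14to16.pt')        # placeholder output path
```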
Hi @tobiasvanderwerff, I think the same holds for `patch_embed`:

```python
patch_embed = torch.nn.functional.interpolate(patch_embed.float(), size=(16, 16), mode='bicubic', align_corners=False)
```

The `.float()` is already there, which makes the interpolation work correctly, but the `.half()` to convert back to float16 is missing. By the way, thanks for the hint!
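To make that concrete, here is roughly what I mean; the path and the exact key name (`patch_embed.proj.weight`, as in timm-style ViTs) are assumptions, so verify them against your checkpoint:

```python
import torch

checkpoint = torch.load('eva02_Ti_p14.pt', map_location='cpu')       # placeholder path
state_dict = checkpoint['module']

# key name assumed from timm-style ViTs; verify against your checkpoint
patch_embed = state_dict['patch_embed.proj.weight']                  # [embed_dim, in_chans, 14, 14]
patch_embed = torch.nn.functional.interpolate(
    patch_embed.float(), size=(16, 16), mode='bicubic', align_corners=False)
state_dict['patch_embed.proj.weight'] = patch_embed.half()           # convert back to float16
```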
Hi,
First of all, thank you for the great work you've published. I am trying to train EVA-02 on a custom object detection dataset and noticed that the `*_p14to16` pre-trained models are only available for EVA-B and EVA-L (in this table), but not for the other model sizes. I am trying to use the smaller EVA-S and/or EVA-Ti models instead. As far as I understand, the conversion from `p14` to `p16` involves an interpolation of the `pos_embed` parameters, as mentioned here. This would mean that it could also be applied as a post-processing step on the checkpoint file for the smaller models.

I have tried to do the interpolation myself using the interpolate_patch_14to16.py script. However, this does not seem to work for the EVA-02 checkpoints, because of an error when accessing keys in the checkpoint.
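For reference, a quick way to see which top-level keys a checkpoint actually provides (the file name here is just a placeholder):

```python
import torch

ckpt = torch.load('eva02_Ti_pt_in21k_p14.pt', map_location='cpu')  # placeholder file name
print(list(ckpt.keys()))  # shows whether the weights sit under 'model', 'module', or another key
```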
I am not quite sure if applying the script would be the right approach to take or if another approach is necessary. Could you provide any feedback on this? Thanks in advance!