The difference between the experimental results on VTAB-1K in your two papers.

shachoi commented 10 months ago

First of all, thank you for your great contributions to PETL for ViT. Your work and code are very helpful.

The experimental results on VTAB-1K differ between your two papers, FacT and Binary adapter. This discrepancy seems to depend on whether the images are normalized with the ImageNet mean and standard deviation. With unnormalized images (as in the Binary adapter paper), performance is better than with normalized images (as in the FacT paper). https://github.com/JieShibo/PETL-ViT/blob/026e4f12cfbe1e46bfb4d3ed5ba09f2d9f83e91e/binary_adapter/main.py#L78

Is that correct? I'm asking to make sure I understand.

Best,

JieShibo commented 10 months ago

Yes. VPT and SSF use unnormalized images, and NOAH uses normalized images. In the FacT paper, we reuse the baseline results (including Adapter and LoRA) reported by NOAH, so we also use normalized inputs for FacT. In the Binary adapter paper, by contrast, we implement the baselines in VPT's setting.
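
For readers comparing the two settings, here is a minimal sketch of the preprocessing difference. It is not the repository's code: the helper `preprocess_pixel` is hypothetical, and it assumes that "unnormalized" means pixels are only scaled to [0, 1], while "normalized" additionally applies the standard ImageNet per-channel mean and standard deviation (what `torchvision.transforms.Normalize` would do in a real pipeline).

```python
# Standard ImageNet per-channel statistics (RGB), for inputs scaled to [0, 1].
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def preprocess_pixel(rgb_uint8, normalize):
    """Hypothetical helper: scale one RGB pixel from [0, 255] to [0, 1],
    then optionally apply ImageNet mean/std normalization."""
    scaled = [c / 255.0 for c in rgb_uint8]
    if not normalize:
        # Assumed "unnormalized" setting (VPT, SSF, Binary adapter paper).
        return scaled
    # Assumed "normalized" setting (NOAH, FacT paper).
    return [(c - m) / s for c, m, s in zip(scaled, IMAGENET_MEAN, IMAGENET_STD)]


pixel = (255, 128, 0)
unnormalized = preprocess_pixel(pixel, normalize=False)
normalized = preprocess_pixel(pixel, normalize=True)
```

Because the two settings feed the backbone inputs on different scales, results obtained under one setting are not directly comparable with baselines reported under the other.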

shachoi commented 10 months ago

OK, got it! Thank you for your response.

JieShibo / PETL-ViT

The difference between the experimental results on VTAB-1K in your two papers. #14