Closed shachoi closed 10 months ago
Yes. VPT and SSF use unnormalized images, while NOAH uses normalized images. In the FacT paper, we reused the baseline results (including Adapter and LoRA) reported by NOAH, so we also used normalized inputs for FacT. In the binary adapter paper, by contrast, we implemented the baselines in VPT's setting.
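To make the difference concrete, here is a minimal sketch of the two input settings on a single pixel. It assumes that "unnormalized" means pixels are only scaled to [0, 1], and that "normalized" additionally subtracts the standard ImageNet per-channel mean and divides by the per-channel standard deviation. The helper `preprocess_pixel` is hypothetical and written only for illustration; it is not the code in the repository.

```python
# Standard ImageNet per-channel statistics (RGB order), for inputs in [0, 1].
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def preprocess_pixel(rgb, normalize):
    """Map one 8-bit RGB pixel to model input values.

    Hypothetical helper for illustration only.
    normalize=False: scale to [0, 1] only (assumed "unnormalized" setting).
    normalize=True: scale, then apply (x - mean) / std per channel.
    """
    scaled = [c / 255.0 for c in rgb]
    if not normalize:
        return scaled
    return [(c - m) / s for c, m, s in zip(scaled, IMAGENET_MEAN, IMAGENET_STD)]


pixel = (128, 128, 128)  # mid-gray
unnormalized = preprocess_pixel(pixel, normalize=False)
normalized = preprocess_pixel(pixel, normalize=True)

print("unnormalized:", unnormalized)
print("normalized:  ", normalized)
```

The same image therefore reaches the backbone with a different value range in the two settings, so results obtained under one setting should only be compared with baselines run under the same setting.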
OK, got it! Thank you for your response.
First of all, thank you for your great contributions to PETL for ViT. All of your work and code is very helpful.
The experimental results on VTAB-1K differ between your two papers, FacT and Binary adapter. This discrepancy seems to depend on whether the images are normalized with the ImageNet mean and standard deviation. With unnormalized images (as in the Binary adapter paper), performance is better than with normalized images (as in the FacT paper). https://github.com/JieShibo/PETL-ViT/blob/026e4f12cfbe1e46bfb4d3ed5ba09f2d9f83e91e/binary_adapter/main.py#L78
Is this correct? I'm asking just to clarify.
Best,