alinlab / ifseg

IFSeg: Image-free Semantic Segmentation via Vision-Language Model (CVPR 2023)

Question about the backbone. #2

Closed yexiguafuqihao closed 10 months ago

yexiguafuqihao commented 1 year ago

Was the backbone fine-tuned on the COCO dataset before extracting the features for the cropped image?

kami93 commented 1 year ago

Hi. We do not fine-tune the backbone on the COCO dataset. In fact, the image backbone is kept frozen during our image-free training. We directly load the checkpoint provided by OFA (Wang et al., 2022), available at the following link: https://github.com/OFA-Sys/OFA/blob/main/checkpoints.md. Note that various datasets (including CC12M, CC3M, SBU, COCO, etc.) are used during OFA's pre-training, as detailed here: https://github.com/OFA-Sys/OFA/blob/main/datasets.md
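
For readers unfamiliar with what "frozen" means in practice, here is a minimal PyTorch sketch of the general idea. It is not the IFSeg or OFA code: `ToySegmenter`, `image_backbone`, `head`, and `freeze` are hypothetical names made up for illustration, and the random input tensor is only a dummy used to drive one optimizer step. In a real setup the pre-trained checkpoint would be loaded before freezing.

```python
import torch
from torch import nn


class ToySegmenter(nn.Module):
    """Hypothetical stand-in: a small image backbone plus a trainable head."""

    def __init__(self) -> None:
        super().__init__()
        self.image_backbone = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        self.head = nn.Conv2d(8, 4, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.image_backbone(x))


def freeze(module: nn.Module) -> None:
    """Stop gradient updates for every parameter and switch to eval mode."""
    for param in module.parameters():
        param.requires_grad = False
    module.eval()


model = ToySegmenter()
# Assumption: pre-trained weights would be loaded at this point in a real run.
freeze(model.image_backbone)

# Hand only the still-trainable parameters (the head) to the optimizer.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(trainable, lr=0.1)

# Snapshot the backbone weights, then run one dummy training step.
before = [p.clone() for p in model.image_backbone.parameters()]
loss = model(torch.randn(2, 3, 16, 16)).mean()
loss.backward()
optimizer.step()

backbone_unchanged = all(
    torch.equal(old, new)
    for old, new in zip(before, model.image_backbone.parameters())
)
print("backbone unchanged after step:", backbone_unchanged)
```

Passing only the parameters with `requires_grad=True` to the optimizer keeps the frozen weights out of its state entirely, so they stay identical to the loaded checkpoint throughout training.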