About the ImageNet-pretrained encoder

Thank you very much for your work. I have a question I would like to ask:

The paper states that features are extracted with a backbone pre-trained on ImageNet and then used as input to UniAD. In my application, the industrial dataset differs significantly in distribution from ImageNet, so the features extracted by the ImageNet-pretrained model are very similar across images and almost indistinguishable. What is your view on this issue?
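To make "very similar" concrete, one option is to report the mean pairwise cosine similarity of the extracted features. The snippet below is only a minimal sketch, not code from the UniAD repository: it assumes each image has already been reduced to a single feature vector (for example by average pooling the backbone output), and the function names `mean_pairwise_cosine` and `center` are hypothetical helpers. It also compares the raw score with the score after subtracting the dataset mean, since non-negative (post-ReLU) features can look nearly identical under raw cosine similarity even when they still differ around the mean.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_pairwise_cosine(features):
    """Average cosine similarity over all unordered pairs of feature vectors."""
    n = len(features)
    if n < 2:
        raise ValueError("need at least two feature vectors")
    total = 0.0
    pairs = 0
    for i in range(n):
        for j in range(i + 1, n):
            total += cosine_similarity(features[i], features[j])
            pairs += 1
    return total / pairs


def center(features):
    """Subtract the per-dimension mean over the dataset from every vector."""
    dim = len(features[0])
    count = len(features)
    mean = [sum(f[d] for f in features) / count for d in range(dim)]
    return [[f[d] - mean[d] for d in range(dim)] for f in features]


# Toy stand-in for pooled backbone features (assumed values, not real data).
toy_features = [[10.0, 10.1], [10.1, 10.0]]
raw_score = mean_pairwise_cosine(toy_features)
centered_score = mean_pairwise_cosine(center(toy_features))
print("raw:", raw_score, "centered:", centered_score)
```

If the raw score is close to 1 but the centered score is clearly lower, the features may still carry usable variation that a large shared component is hiding; if both stay close to 1, the features are collapsed in the sense described above.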

