Feature alignment loss - Githubissues

LiheYoung / Depth-Anything

[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation

https://depth-anything.github.io

Apache License 2.0


Feature alignment loss #224

Closed by schatto02 2 months ago

schatto02 commented 3 months ago

Hi, thanks for such great work! I was wondering which features are used for this loss: do we use intermediate features or the final encoder features?

Also, if the student and teacher feature dimensions are different, what kind of projection is used to bring them to a compatible feature space?

LiheYoung commented 2 months ago

We use the final encoder features.
We use the same student-teacher structure (e.g., both ViT-Large) for alignment, so the dimensions are the same. If the dimensions are different, we recommend adding a linear projection layer on top of the student features.
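
For readers who want to try this, here is a minimal PyTorch sketch of the setup described above, not the repository's actual code. It assumes the loss is one minus the mean per-token cosine similarity between student and frozen-teacher final encoder features, which the thread itself does not spell out. The class name `FeatureAlignmentLoss` and the dimensions are hypothetical. A linear projection is applied to the student features only when the widths differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class FeatureAlignmentLoss(nn.Module):
    """Hypothetical sketch: cosine alignment of student to frozen-teacher features."""

    def __init__(self, student_dim: int, teacher_dim: int):
        super().__init__()
        # Project only when the widths differ; otherwise pass features through.
        self.proj = (
            nn.Identity()
            if student_dim == teacher_dim
            else nn.Linear(student_dim, teacher_dim)
        )

    def forward(self, student_feat: torch.Tensor, teacher_feat: torch.Tensor) -> torch.Tensor:
        # Both inputs are final encoder features shaped (batch, tokens, dim).
        s = self.proj(student_feat)
        t = teacher_feat.detach()  # no gradient flows into the teacher
        cos = F.cosine_similarity(s, t, dim=-1)  # shape (batch, tokens)
        # Assumed loss form: 1 - mean cosine similarity over all tokens.
        return 1.0 - cos.mean()
```

For example, `FeatureAlignmentLoss(384, 1024)` would map ViT-Small-width student tokens into a ViT-Large-width teacher space before comparing them. The projection's weights are trained together with the student.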