Adonis-galaxy / DepthCLIP

Official implementation of "Can Language Understand Depth?"

Does this work with other networks? #6

Open jspsiy opened 2 weeks ago

jspsiy commented 2 weeks ago

I'm curious whether I can use this CLIP to replace other CLIP models in a network. I'd also like to replace the Transformer with ViT-H. So far I've tried it without much luck, so I wanted to ask: does this only work for RN50?

Adonis-galaxy commented 1 week ago

Hi, this work was done a long time ago and I can hardly remember what we explored. I do remember that we tried a ViT-based CLIP, but I have forgotten the results. If you could share some details (including implementation details and results), I would be happy to analyze them and offer some intuition.