mbanani / unsupervisedRR

[CVPR 2021 - Oral] UnsupervisedR&R: Unsupervised Point Cloud Registration via Differentiable Rendering
https://mbanani.github.io/unsupervisedrr
MIT License

Shared layer intentional? #5

Closed gleize closed 1 year ago

gleize commented 1 year ago

The ResNet-18 layers used by the encoder do not seem to be deep-copied anywhere, which means that the two layer1 modules share weights.

Since there is no mention of it in the paper, one might wonder whether this is intentional or potentially a bug.
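
For context, a minimal sketch of the suspected issue (the encoder/decoder attribute names here are illustrative, not the exact ones in this repo): if two places are wired up from the same torchvision `resnet18` instance without a deep copy, they hold references to the *same* layer1 and therefore share weights.

```python
import copy

from torchvision.models import resnet18

resnet = resnet18()

# Shared: both attributes point at the same module object.
encoder_layer1 = resnet.layer1
decoder_layer1 = resnet.layer1
assert encoder_layer1 is decoder_layer1  # weights (and gradients) are shared

# Independent: deep-copy the layer before reusing it.
decoder_layer1 = copy.deepcopy(resnet.layer1)
assert decoder_layer1 is not resnet.layer1
assert not any(
    p1 is p2
    for p1, p2 in zip(encoder_layer1.parameters(), decoder_layer1.parameters())
)
```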

mbanani commented 1 year ago

Hi @gleize, great catch! It wasn't intentional; I didn't find it until after both this paper and its follow-up, BYOC, were published, and then I forgot to come back and make a note of it. If I recall correctly, fixing the bug had a small positive impact on performance. Given that this was the code used to get the results in the paper, I think it would be fair to compare against it even with this performance hit.

Also, just a note on the encoder if you are building on this work: I would suggest a few things to boost performance.

- First, as noted in BYOC, you can dispense with the rendering component as long as you use a good correspondence loss.
- Second, as I did in SyncMatch (WACV 2023; code and paper will be released in the next month), you can increase the feature dimensionality beyond 32 for another boost in performance. I was using the kNN implementation from PyTorch3D in this work and in BYOC, which was very slow for features larger than 32, but the kNN from FAISS or PyKeOps lets you use larger feature dimensions while still being fast (see the sketch below).
- Finally, using a ResNet-18 also helped boost performance, as long as you adjust the strides to maintain a high spatial resolution like 128x128 or 256x256 (I experimented with U-Nets and found they did worse).

Hope this helps.
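
Since the SyncMatch code is not out yet, here is only a rough sketch of what swapping in a FAISS flat index for the kNN could look like; the helper name, tensor shapes, and CPU round-trip are assumptions, not the actual SyncMatch implementation.

```python
import faiss
import torch


def knn_faiss(feats_src: torch.Tensor, feats_tgt: torch.Tensor, k: int = 1):
    """Exact brute-force kNN; feats_src is (N, D), feats_tgt is (M, D)."""
    src = feats_src.detach().cpu().numpy().astype("float32")
    tgt = feats_tgt.detach().cpu().numpy().astype("float32")

    index = faiss.IndexFlatL2(tgt.shape[1])  # exact L2 search over D-dim features
    index.add(tgt)
    dists, idx = index.search(src, k)        # (N, k) distances and indices

    return torch.from_numpy(dists), torch.from_numpy(idx)
```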

gleize commented 1 year ago

Thanks for the quick reply. It doesn't need to be fixed (for reproducibility's sake), but at least there is a note about it here now. We've noticed a similar boost when increasing the output feature dimension.

Regarding the kNN, I've also noticed the slowdown. I've made a fix on my branch using matrix multiplication + torch.topk; since I'm only using the cosine similarity, this was good enough. Here is the fix, in case you find it useful.
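
The linked fix itself isn't shown in this thread, but a minimal sketch of the matmul + torch.topk approach described above could look like the following; the function name and tensor shapes are assumptions rather than the code on that branch.

```python
import torch
import torch.nn.functional as F


def cosine_knn(feats_src: torch.Tensor, feats_tgt: torch.Tensor, k: int = 1):
    """feats_src: (N, D), feats_tgt: (M, D); returns (N, k) scores and indices."""
    src = F.normalize(feats_src, dim=-1)
    tgt = F.normalize(feats_tgt, dim=-1)
    sim = src @ tgt.transpose(-2, -1)        # (N, M) cosine similarities
    scores, idx = torch.topk(sim, k, dim=-1)
    return scores, idx
```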

We also found UNet to perform slightly worse; good to see this is consistent with your results.