Hi. I noticed that the paper says you use sinusoidal embeddings (which are non-learnable), as in DETR. In the code, however, I see that you use PositionEmbeddingLearned from modules.py, which consists of 1D convolutions and depends on xyz in the forward pass (and is hence learnable). This class is used with different parameters in both the box encoder and the cross-modal encoding. How are these two statements related, or am I missing the sinusoidal encodings somewhere? I also noticed that PositionEmbeddingLearned is defined in two files with identical code. Is there any reason for duplicating it?
As mentioned above, in 3D we use learned positional embeddings that take XYZ as input, the same as the GroupFree model.
About duplication: no specific reason, that might just be an oversight.
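To make the difference between the two kinds of embedding concrete, here is a minimal, dependency-free sketch. It is not the code from modules.py: the names `sinusoidal_embedding` and `LearnedPointEmbedding` are hypothetical, the feature size and random initialisation are arbitrary, and the normalisation layer that a real implementation would likely include is left out. It relies on the fact that a 1D convolution with kernel size 1 applies the same linear map to every point, so a per-point two-layer MLP stands in for it here.

```python
import math
import random


def sinusoidal_embedding(xyz, num_feats=6, temperature=10000.0):
    """Fixed encoding: sin/cos of each coordinate. No parameters to train."""
    per_axis = num_feats // 3  # features allotted to each of x, y, z
    out = []
    for coord in xyz:
        for i in range(per_axis):
            # Each sin/cos pair shares one frequency.
            freq = temperature ** (2 * (i // 2) / per_axis)
            angle = coord / freq
            out.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return out


class LearnedPointEmbedding:
    """Per-point MLP on xyz (hypothetical stand-in for kernel-size-1 convs).

    The weights are parameters: in a real model they would be updated
    by the optimizer, which is what makes the embedding learnable.
    """

    def __init__(self, in_dim=3, num_feats=6, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)]
                   for _ in range(num_feats)]
        self.b1 = [0.0] * num_feats
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(num_feats)]
                   for _ in range(num_feats)]
        self.b2 = [0.0] * num_feats

    @staticmethod
    def _linear(weights, bias, x):
        return [sum(w * v for w, v in zip(row, x)) + b
                for row, b in zip(weights, bias)]

    def __call__(self, xyz):
        # Linear layer, ReLU, then a second linear layer.
        hidden = [max(0.0, h) for h in self._linear(self.w1, self.b1, xyz)]
        return self._linear(self.w2, self.b2, hidden)
```

The sinusoidal function always returns the same vector for the same point, while the output of `LearnedPointEmbedding` changes whenever its weights change, so the two are not interchangeable even though both map xyz to a feature vector.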