Thanks for the great work!
I have a question about BoxRPB:
In the paper's ablation, when you use the box center as the encoding point instead of the 2 corners, the performance seems very similar to DAB/Condition/SMCA cross attention. Have you also tried using the 2-corner scheme in DAB/Condition/SMCA? I just want to figure out whether the major improvement of BoxRPB over DAB/Condition/SMCA comes from using 2 corners instead of 1 center.
Thanks!