Thanks for sharing the code. How can I use MAE as a backbone for an object detection framework?
Any guidance would be much appreciated.
MAE is only the pre-training method. It is built on the ViT model and has an encoder and a decoder. Remove the decoder, and you can use the ViT encoder as the backbone and add a detection head on top of it to do detection.
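As a rough sketch of the "remove the decoder" step, the decoder-only entries can be filtered out of the pre-trained weights before they are loaded into a plain ViT encoder. The key prefixes (`decoder_`, `mask_token`) and the toy checkpoint below are assumptions for illustration only, so check the actual key names in your checkpoint first.

```python
# Assumed prefixes for decoder-only weights; verify against your own checkpoint keys.
DECODER_PREFIXES = ("decoder_", "mask_token")


def strip_decoder(state_dict, prefixes=DECODER_PREFIXES):
    """Return a copy of state_dict without the decoder-only entries."""
    return {k: v for k, v in state_dict.items() if not k.startswith(prefixes)}


# Toy stand-in for a checkpoint: strings replace the real weight tensors.
checkpoint = {
    "patch_embed.proj.weight": "w0",
    "blocks.0.attn.qkv.weight": "w1",
    "norm.weight": "w2",
    "mask_token": "w3",
    "decoder_embed.weight": "w4",
    "decoder_blocks.0.attn.qkv.weight": "w5",
}

encoder_weights = strip_decoder(checkpoint)
print(sorted(encoder_weights))
```

With real tensors, the filtered dictionary would then be loaded into the ViT encoder that serves as the detector's backbone, and the detection head would be trained on top of its output features.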