Thank you for your outstanding work. I noticed that you use the PVTv2-B4 backbone, which has more parameters than the ResNet50 and PVTv2-B2 backbones used by other methods. Did you run any experiments with PVTv2-B2, and how much does the choice of backbone affect performance? Also, can your proposed method be trained on a 3090 GPU?