I have a few things that I don't understand.
Regarding memory requirements during training, here is what I measured (two GPUs):
PascalVOC2012, training crop_size 321, ResNet50: GPU1 - 15417 MiB / 24268 MiB & GPU2 - 15417 MiB / 24268 MiB
PascalVOC2012, training crop_size 513, ResNet50: GPU1 - 22583 MiB / 24268 MiB & GPU2 - 22583 MiB / 24268 MiB
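For context, this is a minimal sketch of the kind of crop-size vs. peak-memory comparison I am talking about. It uses torchvision's off-the-shelf DeepLabV3 + ResNet50 as a stand-in (not this repo's model), and the batch size of 8 is just an assumption, so the absolute numbers will differ from the ones above:

```python
# Rough sketch: peak GPU memory for one training step at different crop sizes.
# Uses torchvision's DeepLabV3-ResNet50 as a placeholder model; batch size is assumed.
import torch
import torchvision

def peak_memory_mib(crop_size, batch_size=8, num_classes=21):
    torch.cuda.reset_peak_memory_stats()
    model = torchvision.models.segmentation.deeplabv3_resnet50(
        weights=None, num_classes=num_classes
    ).cuda()
    images = torch.randn(batch_size, 3, crop_size, crop_size, device="cuda")
    targets = torch.randint(0, num_classes, (batch_size, crop_size, crop_size), device="cuda")
    out = model(images)["out"]                            # forward pass
    torch.nn.functional.cross_entropy(out, targets).backward()  # backward pass
    return torch.cuda.max_memory_allocated() / 2**20      # bytes -> MiB

for size in (321, 513):
    print(size, f"{peak_memory_mib(size):.0f} MiB")
```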
Here is what I don't understand.
Most other recent SOTA studies also train at an input resolution of 512 or 513.
Most of them do not require this much GPU memory to train (at 513 resolution with ResNet50, usually below 15000 MiB).
In my opinion, your approach should not require much additional computing cost.
So, looking at the memory numbers above, I don't understand why so many resources are needed.
Could you clarify this point?
Thank you for your wonderful research.
Thank you in advance. Good luck.