About the params setting of use_checkpoint

Sense-X / UniFormer

[ICLR2022] official implementation of UniFormer

Apache License 2.0


About the params setting of use_checkpoint #93

Closed go-ahead-maker closed 2 years ago

go-ahead-maker commented 2 years ago

Hi, sorry to bother you again. I noticed that use_checkpoint is used in UniFormer to save GPU memory. Recently I have been running object detection experiments with my custom model, and GPU memory consumption seems to increase continually over time (it grows a little every few iterations). I'm not sure what is causing this, so I wonder whether use_checkpoint could alleviate the problem. Does enabling use_checkpoint lead to slower training?

Best.

go-ahead-maker commented 2 years ago

By the way, if I enable use_checkpoint, does it hurt the final performance?

Andy1621 commented 2 years ago

use_checkpoint will not hurt the final performance, but it slows down training in exchange for lower GPU memory usage.

I did not observe a memory leak in my experiments. There may be some other bug that needs to be fixed.
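To illustrate the trade-off described above, here is a minimal sketch of how a use_checkpoint flag is commonly wired up with PyTorch's `torch.utils.checkpoint`. The `Block` module and the `grads` helper are hypothetical stand-ins, not code from the UniFormer repository. The sketch assumes a PyTorch version that accepts the `use_reentrant` argument. It shows that gradients match with and without checkpointing, since the only difference is that intermediate activations are recomputed during the backward pass instead of being stored.

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint


class Block(nn.Module):
    """Hypothetical residual MLP block standing in for one backbone stage."""

    def __init__(self, dim: int, use_checkpoint: bool = False):
        super().__init__()
        self.use_checkpoint = use_checkpoint
        self.mlp = nn.Sequential(
            nn.Linear(dim, 4 * dim),
            nn.GELU(),
            nn.Linear(4 * dim, dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if self.use_checkpoint and x.requires_grad:
            # Activations inside self.mlp are not kept; they are recomputed
            # in backward, which costs extra compute but saves memory.
            return x + checkpoint(self.mlp, x, use_reentrant=False)
        return x + self.mlp(x)


def grads(use_checkpoint: bool):
    """Run one forward/backward pass and return input and weight gradients."""
    torch.manual_seed(0)  # identical initialization and input for both runs
    block = Block(8, use_checkpoint)
    x = torch.randn(2, 8, requires_grad=True)
    block(x).sum().backward()
    return x.grad.clone(), block.mlp[0].weight.grad.clone()


if __name__ == "__main__":
    gx_plain, gw_plain = grads(False)
    gx_ckpt, gw_ckpt = grads(True)
    print(torch.allclose(gx_plain, gx_ckpt), torch.allclose(gw_plain, gw_ckpt))
```

Because the block has no randomness such as dropout, both runs compute the same function, which is why the final accuracy is not expected to change. The extra forward recomputation is the source of the slower training.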

go-ahead-maker commented 2 years ago

Thanks for your quick reply. I have checked the Swin log file. The memory there also increases during training (from an initial 7212 to 7909), and I also noticed a slight change in your log file (5162 to 5167). So I suppose this behavior may be normal for mmdet. In my experiments, I follow your config, where the batch size is 4×4 (4 samples per GPU). Is this equivalent to 8×2?

Thanks again for your valuable reply!

Andy1621 commented 2 years ago

Thanks for trying it out! If you keep the same total batch size, it should be equivalent.
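The equivalence can be checked with simple arithmetic. The sketch below assumes the two settings in the question are written as number of GPUs × samples per GPU (the usual mmdet convention, but an assumption here). The helper names are hypothetical, and the `scaled_lr` function reflects the common linear scaling heuristic rather than anything stated in this thread.

```python
def total_batch_size(num_gpus: int, samples_per_gpu: int) -> int:
    """Effective batch size per optimizer step in data-parallel training."""
    return num_gpus * samples_per_gpu


def scaled_lr(base_lr: float, base_total_batch: int, new_total_batch: int) -> float:
    """Linear scaling heuristic: adjust the learning rate only if the total
    batch size changes (an assumption, not a rule given in this thread)."""
    return base_lr * new_total_batch / base_total_batch


if __name__ == "__main__":
    a = total_batch_size(num_gpus=4, samples_per_gpu=4)
    b = total_batch_size(num_gpus=8, samples_per_gpu=2)
    print(a, b, a == b)
```

Both settings give the same total batch size, so under this reading the learning rate and schedule can stay unchanged.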

Andy1621 commented 2 years ago

As there is no more activity, I am closing the issue. Don't hesitate to reopen it if necessary.