Open wildbrother opened 3 years ago
7~8GB of VRAM is very high...
Can you set some breakpoints to find out which lines are causing such a large allocation? I've not run into this situation before. It'd be helpful if you could provide this information.
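In case it helps, here is a minimal sketch of how the suspect line could be bracketed with allocation reports instead of breakpoints (the `gb` and `report` helpers and their placement are illustrative, not code from this repo):

```python
import torch

def gb(nbytes):
    """Convert a byte count to gigabytes."""
    return nbytes / 1024 ** 3

def report(tag):
    """Print currently allocated and reserved CUDA memory (no-op without a GPU)."""
    if torch.cuda.is_available():
        print(f"{tag}: allocated={gb(torch.cuda.memory_allocated()):.2f} GB, "
              f"reserved={gb(torch.cuda.memory_reserved()):.2f} GB")

# Hypothetical usage around the suspect call:
# report("before forward")
# features, regression, classification, anchors = model(x)
# report("after forward")
```

`memory_allocated` counts live tensors, while `memory_reserved` also includes blocks the caching allocator is holding on to, which is closer to what nvidia-smi shows.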
It appears at the first `features, regression, classification, anchors = model(x)` statement: VRAM in nvidia-smi goes up to 7~8GB (it showed 4GB, 6GB, 8GB at random, because `watch nvidia-smi` can't catch every single frame).

When the call runs inside the for loop (`_, regression, classification, anchors = model(x)`), the phenomenon disappears.

I found that the variable `x` is built from the image, and I'm concerned about VRAM increasing when I run this inference code on every single different image.

I think this statement has to be re-run every time I change the inference data (image). So will it use up 7~8GB of VRAM on every single loop iteration (for each image)?

I'm sorry to bother you with this question, but can you confirm this situation, or provide some inference code for multiple images, something like `[img1, img2, img3]`? Thank you for your reply.
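For illustration, a multi-image inference loop might look like the sketch below. The `nn.Conv2d` here is a tiny stand-in, not the actual EfficientDet model (which returns `(features, regression, classification, anchors)`), and the image sizes are placeholders:

```python
import torch
import torch.nn as nn

# Tiny stand-in model; replace with the real detector in practice.
model = nn.Conv2d(3, 8, 3, padding=1).eval()

# Hypothetical batch of three images, e.g. [img1, img2, img3]
images = [torch.randn(1, 3, 64, 64) for _ in range(3)]

results = []
with torch.no_grad():          # no autograd buffers kept alive during inference
    for x in images:
        out = model(x)
        results.append(out.shape)

print(results)
```

After the first iteration, PyTorch's caching allocator reuses the same blocks, so VRAM should plateau rather than grow per image, as long as the inputs have the same shape.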
I see, it's the classification output, which has a large shape, that consumes 4GB of VRAM here during the convolution of the second feature output of the BiFPN, which is weird: the input has 5 features, and the first one is the largest, yet the second one causes most of the consumption?

What's your PyTorch version? I'm using 1.7 now, and I don't recall PyTorch 1.4 behaving like that.
I installed the packages you listed here:

torch == 1.4.0, torchvision == 0.5.0

So it doesn't happen only to me? I was concerned this situation was caused by my environment (docker with `--ipc=host`).

So, will this VRAM spike occur every single time I put a different image into the `features, regression, classification, anchors = model(x)` statement?
Hi, I found out it's the cache of this line, because I can clear it by adding `torch.cuda.empty_cache()` right after it, although emptying the cache takes some time.

https://github.com/zylo117/Yet-Another-EfficientDet-Pytorch/blob/c533bc2de65135a6fe1d25ca437765c630943afb/efficientdet/model.py#L44

So it won't allocate again after the first time. But it's still strange, because I haven't come across this situation before.
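A minimal sketch of that pattern (the `forward_and_trim` helper is illustrative, not code from this repo):

```python
import torch

def forward_and_trim(model, x):
    """Run one inference, then release cached blocks back to the driver.

    torch.cuda.empty_cache() only frees *unused* cached blocks; it does not
    touch live tensors. The trade-off is that later allocations must call
    into the CUDA allocator again, which costs time on each iteration.
    """
    with torch.no_grad():
        out = model(x)
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
    return out
```

This trades a small amount of per-iteration latency for a lower VRAM ceiling as reported by nvidia-smi.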
OK, it takes 0.02~0.04 seconds per inference loop, which is worth it! And VRAM maximum is now 4GB (d3), half of the 7~8GB from one week ago. Great approach. Thank you.
I used efficientdet_test.py with the d3 model.

When executing the model, GPU memory first goes up to 7~8GB, then drops to 1.25GB once the code reaches the inference loop.

I think VRAM goes up to 7~8GB while the computer sets up the model structure and weights, but I really don't understand why the memory usage is so high. Is it really just overhead? Is it because of BiFPN, or something else? I really want to know.
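One way to check whether the 7~8GB is a transient activation peak rather than the weights themselves is to compare peak and steady-state allocation. This is only a sketch, and `peak_vs_steady` is a hypothetical helper:

```python
import torch

def peak_vs_steady(model, x):
    """Compare the transient peak allocation of one forward pass with the
    steady-state usage that remains afterwards. Returns None without a GPU."""
    if not torch.cuda.is_available():
        return None
    torch.cuda.reset_peak_memory_stats()
    with torch.no_grad():
        model(x)
    peak = torch.cuda.max_memory_allocated() / 1024 ** 3    # GB, includes activations
    steady = torch.cuda.memory_allocated() / 1024 ** 3      # GB, mostly weights
    return peak, steady
```

If the peak is far above the steady-state number, the spike comes from intermediate activations (e.g. the large classification tensors mentioned above), not from the model weights.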