Closed Tetsujinfr closed 4 years ago
This appears to be a GPU out of memory related error. Even with an evaluation batch size of 1, my GPU required about 7.3GB of memory. It seems like your machine has around 6GB of memory?
Thanks. Yes, correct, my GPU has 6GB of memory. Have you been successful on your side in running the inference on the dataset, Shreyas?
Is there a way to reduce the memory usage somehow?
@Tetsujinfr : I've managed to run the code successfully on a few images. I'm still fighting a few other issues.
I'm not sure how to properly reduce memory usage, but an initial idea would be to reduce the resolution of all the images. This could impact the quality of the results, but might be worth looking into if you don't have access to a larger GPU.
The authors might have a better idea on what to do to reduce memory usage (eg: free-ing up memory that's not being used).
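As a rough back-of-the-envelope check (assuming activation memory scales roughly linearly with pixel count, which is approximately true for fully convolutional segmentation networks), halving both image dimensions should cut activation memory to about a quarter:

```python
def activation_memory_ratio(orig_hw, new_hw):
    """Rough ratio of activation memory after resizing.

    Assumes activation memory scales linearly with the number of
    pixels -- a simplification, but a useful first estimate for
    fully convolutional networks.
    """
    oh, ow = orig_hw
    nh, nw = new_hw
    return (nh * nw) / (oh * ow)

# A Cityscapes-sized input downscaled by 2 in each dimension:
print(activation_memory_ratio((1024, 2048), (512, 1024)))  # 0.25
```

So going from ~7.3GB of usage to under 6GB should only require a modest downscale, if the estimate holds.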
@Tetsujinfr
It appears that your GPU memory is insufficient. The model we've provided as a reference uses WideResNet38 which is quite memory hungry, but you could try it with a lighter encoder network like ResNet50 or VGG16. However, this will require some tweaking of the architecture and the code.
I also imagine that there are probably parts of the code that can be optimized further in terms of memory consumption (by freeing up unused resources), but I haven't done this work yet as I personally didn't have any memory constraints. I can see if I can optimize some parts of the code later when I'm less busy and if there is enough interest.
Reducing the resolution is another option too, if you're okay with reduced resolution / quality segmentations!
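A few generic PyTorch-side savings that might help here (this is a sketch, not code from this repo; `model` and `images` are placeholders for the actual GSCNN network and input batches): run inference under `torch.no_grad()` so activations aren't kept for backprop, move results off the GPU promptly, and release cached blocks back to the driver.

```python
import torch

def run_inference(model, images, device="cuda"):
    """Memory-conscious inference loop (sketch).

    `model` and `images` are placeholders for the actual network
    and an iterable of input batches.
    """
    model.eval()
    outputs = []
    with torch.no_grad():               # don't store activations for backprop
        for batch in images:
            pred = model(batch.to(device))
            outputs.append(pred.cpu())  # move results off the GPU promptly
            del pred                    # drop the GPU reference
            torch.cuda.empty_cache()    # return cached blocks to the driver
    return outputs
```

`empty_cache()` doesn't reduce PyTorch's true allocation, but it can help when memory is fragmented or shared with other processes.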
Ok thanks. I will try the resolution first since that should be the fastest compromise to explore. Then I will have a look at the memory usage across the process to see if I can free up some unused memory, but if you can suggest some places to look at I would definitely welcome them. I was able to run the WideResNet38 backbone for the semantic segmentation repo without issue, but this code is probably more demanding.
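For tracking memory usage across the process, PyTorch's built-in counters can be printed at various points (a generic sketch, not specific to this repo):

```python
import torch

def gpu_memory_report(tag=""):
    """Print current and peak GPU memory in MB, if CUDA is available."""
    if not torch.cuda.is_available():
        return None
    allocated = torch.cuda.memory_allocated() / 1024**2
    peak = torch.cuda.max_memory_allocated() / 1024**2
    print(f"[{tag}] allocated: {allocated:.1f} MB, peak: {peak:.1f} MB")
    return allocated, peak

# Sprinkle calls like gpu_memory_report("after forward pass") around
# the evaluate loop in train.py to spot where usage peaks.
```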
Hi, I am not sure the output below is an error. Looking at the `evaluate` function in train.py, it is supposed to output "threshold" and F-score values, so I do not think the code ran to completion properly. I can see there is an out-of-memory error message before the frames are processed, but since the code continued running I do not know whether it was a critical error. The frame section does not clearly look like an error either, but it could be.
Can you tell me if this is the expected output?
thanks
gscnn_train_eval_run_output.txt