xamyzhao / brainstorm

Implementation of "Data augmentation using learned transforms for one-shot medical image segmentation"
MIT License

question about GPU memory #34

Open bravelyw opened 1 year ago

bravelyw commented 1 year ago

First, thanks for your contribution. In your introduction you say you used a GPU with 12 GB of memory, but when I run the program on a GPU with 24 GB it uses nearly 18 GB. I don't know why; can you help me? (The GPU fan may be broken, but that is not the cause.)

LIKP0 commented 1 year ago

Hello! I'm also following this project, but I can't get the code to run. Would you be willing to share your email with me so we can discuss it?

xamyzhao commented 1 year ago

@bravelyw I'm sorry, I need more context to figure out your memory usage. Which models are you loading? Does this happen after you load the model, or only once training starts?

bravelyw commented 1 year ago

> Hello! I'm also following this project, but I can't get the code to run. Would you be willing to share your email with me so we can discuss it?

OK, share your email and I will send the code to you.

bravelyw commented 1 year ago

> @bravelyw I'm sorry, I need more context to figure out your memory usage. Which models are you loading? Does this happen after you load the model, or only once training starts?

The flow-fwd model, and it happens throughout the training phase.

LIKP0 commented 1 year ago

12232132@mail.sustech.edu.cn. Thank you! Actually, I ran into this problem after switching to the OASIS dataset, and I can't figure it out.

bravelyw commented 1 year ago

> 12232132@mail.sustech.edu.cn. Thank you! Actually, I ran into this problem after switching to the OASIS dataset, and I can't figure it out.

I have sent it to you. I suspect the dataset may not be correct.

bravelyw commented 1 year ago

> @bravelyw I'm sorry, I need more context to figure out your memory usage. Which models are you loading? Does this happen after you load the model, or only once training starts?

Also, I want to manually load the last training result after an OOM error, but there seems to be no init_model_weights method in exp. How should I do this?

xamyzhao commented 1 year ago

@bravelyw what batch size are you using? How large are your volumes?
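Also note that TensorFlow by default reserves most of the available GPU memory up front, so the usage reported by nvidia-smi can be much higher than what the model actually needs. A minimal sketch (assuming a TF 1.x + standalone Keras setup, as this repo targets) for switching to on-demand allocation instead:

```python
# Sketch: enable on-demand GPU memory growth so nvidia-smi reflects
# actual usage rather than TensorFlow's default near-full preallocation.
# Assumes TF 1.x with the standalone Keras package.
import tensorflow as tf
import keras.backend as K

config = tf.ConfigProto()
config.gpu_options.allow_growth = True  # allocate memory as needed
K.set_session(tf.Session(config=config))
```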

You're right, it looks like init_model_weights wasn't included in this repo. You should be able to implement your own init_model_weights function that uses standard Keras model-loading code to restore weights from your desired path (e.g. https://www.tensorflow.org/guide/keras/save_and_serialize).
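For example, a minimal sketch of such a helper (the `exp.models` attribute, the weights directory, and the `.h5` file naming are assumptions here; adapt them to however your experiment class actually stores and saves its Keras models):

```python
import os

def init_model_weights(exp, weights_dir):
    """Hypothetical helper: load the most recently saved weights into each
    model of an experiment before resuming training.

    Assumes exp.models is a dict mapping model names to compiled Keras
    models, and that weights were saved as <name>*.h5 files in weights_dir.
    """
    for name, model in exp.models.items():
        # pick the most recently saved weights file for this model
        candidates = sorted(
            f for f in os.listdir(weights_dir)
            if f.startswith(name) and f.endswith('.h5')
        )
        if candidates:
            weights_file = os.path.join(weights_dir, candidates[-1])
            model.load_weights(weights_file)
            print('Loaded {} weights from {}'.format(name, weights_file))
```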