Closed: zhixwang closed this issue 1 year ago
This seems to be a difficult problem to solve practically.
We also use the method you mentioned when CUDA runs out of memory. We have not tried the following, but one approach is to divide the 3D field into several sections, run a separate simulation for each section, and then accumulate the gradients. However, this requires additional simulation time. I am not sure whether it is possible, but what I think would be more efficient is to keep the gradient graph from the device parameters to the S-parameters alive even after the backward pass from the field, using PyTorch's retain_graph option, and then apply the same idea: divide the 3D field into sections and accumulate the gradients, as sketched below.
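A minimal PyTorch sketch of that second idea, under stated assumptions: run_rcwa, field_section, and merit_on_section are hypothetical stand-ins (the actual torcwa calls and signatures differ), and the per-section merits are assumed to add up to the full figure of merit.

```python
import torch

nx, ny = 64, 64
n_sections, z_max = 4, 1.0

# Hypothetical stand-ins for the actual torcwa calls; names and signatures
# are only illustrative.
def run_rcwa(params):
    # pretend "S-matrix": some differentiable function of the device parameters
    return torch.fft.fft2(params).abs()

def field_section(smat, z0, z1):
    # pretend field reconstruction restricted to the slab between z0 and z1
    return smat * torch.exp(-(z1 - z0))

def merit_on_section(field):
    # the part of the figure of merit contributed by this slice
    return field.pow(2).mean()

layer_params = torch.rand(nx, ny, requires_grad=True)

smat = run_rcwa(layer_params)        # expensive solve, done once per iteration

z_edges = torch.linspace(0.0, z_max, n_sections + 1)
for i in range(n_sections):
    field = field_section(smat, z_edges[i], z_edges[i + 1])
    fom_i = merit_on_section(field)
    # retain_graph keeps the params -> S-matrix part of the graph alive so the
    # next slice can reuse it; the last backward is allowed to free everything.
    fom_i.backward(retain_graph=(i < n_sections - 1))
    del field, fom_i                 # drop the per-slice graph before the next pass

grad = layer_params.grad             # gradient accumulated over all sections
```

Whether this works in practice depends on being able to reconstruct the field for only a sub-volume without redoing the scattering solve; if that is not possible, the first variant (re-simulating per section) is the fallback.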
Thank you.
Thanks for your answer and for maintaining this project.
I am also looking forward to support for extracting the xyz field (https://github.com/kch3782/torcwa/issues/4).
Best wishes
Hi authors, first I want to thank you again for the great work.
One issue I quite often run into is that CUDA easily runs out of memory when solving optimization problems.
In my case, the optimization merit takes the 3D field as input, so I need to compute the 3D field and retain its gradients as well. Since consumer GPUs top out at 24 GB of memory and multi-GPU is not supported yet, CUDA runs out of memory once the mesh gets finer or the number of diffraction orders grows; a minimal illustration of the pattern is sketched below.
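For concreteness, a small sketch of the memory-heavy pattern described above, under stated assumptions: compute_field_3d and merit are hypothetical placeholders, not the actual torcwa API, and the shapes are only illustrative.

```python
import torch

# Hypothetical placeholders, not the actual torcwa API.
def compute_field_3d(params, nz):
    # stand-in for the RCWA solve plus field reconstruction over the full volume
    profile = torch.linspace(0.0, 1.0, nz, device=params.device)
    return torch.einsum('xy,z->xyz', torch.fft.fft2(params).abs(), profile)

def merit(field):
    # figure of merit that depends on the whole 3D field
    return field.pow(2).mean()

device = 'cuda' if torch.cuda.is_available() else 'cpu'
nx, ny, nz = 512, 512, 256                   # finer meshing -> larger field
params = torch.rand(nx, ny, device=device, requires_grad=True)

field = compute_field_3d(params, nz)         # whole 3D field held at once...
fom = merit(field)
fom.backward()                               # ...together with its autograd graph
```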
Do you have any suggestions for how to deal with this issue?
Many thanks!