Open Junang-Wang opened 8 months ago
Hi @Junang-Wang
Thanks for reaching out!
As I understand, you are using a different model architecture to what we used in the paper. I cannot tell you exactly why you are seeing overfitting with your model definition. I suggest that you refer to the TensorFlow implementation of the model at https://github.com/ethz-msrl/deep_fluids and reproduce the results with that. You should be able to reimplement the model more or less exactly in PyTorch.
> Did you use an advanced network, such as a generative adversarial network, to improve performance?
No, we didn't use any GANs in this research.
Best of luck in your research endeavors!
Hi @samlcharreyron, Thank you for your response.
> As I understand, you are using a different model architecture to what we used in the paper.
I am seeking clarification regarding the architecture outlined in the paper "Modeling Electromagnetic Navigation Systems." As I understand it, the paper describes a sequential architecture consisting of a projection layer, four conv3d layers with skip connections, an upsample layer, another four conv3d layers with skip connections, a second upsample layer, and a final conv3d layer, as depicted in the provided figure.
Did I misunderstand something? I would appreciate it if you could help me identify the differences between this architecture and the one I am currently using. I have been stuck on this for some time.
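To make the comparison concrete, the layer sequence described above can be written as a minimal PyTorch sketch. This is one reading of the description, not the reference implementation from the deep_fluids repo: the names `ConvBlock` and `FieldGenerator`, the 8 inputs, 32 channels, 4x4x4 base grid, LeakyReLU activation, nearest-neighbour upsampling, and the additive form of the skip connection are all assumptions chosen for illustration.

```python
import torch
import torch.nn as nn


class ConvBlock(nn.Module):
    """Stack of Conv3d layers whose output is added back to the block input."""

    def __init__(self, channels: int, n_layers: int = 4):
        super().__init__()
        layers = []
        for _ in range(n_layers):
            # padding=1 with kernel_size=3 keeps the grid size unchanged
            layers += [nn.Conv3d(channels, channels, kernel_size=3, padding=1),
                       nn.LeakyReLU(0.2)]
        self.body = nn.Sequential(*layers)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Assumed skip connection: residual addition around the whole block
        return x + self.body(x)


class FieldGenerator(nn.Module):
    """Projection -> (4 conv + skip, upsample) x 2 -> final conv."""

    def __init__(self, n_inputs: int = 8, channels: int = 32, base: int = 4):
        super().__init__()
        self.channels, self.base = channels, base
        # Projection layer: input vector -> small 3D feature volume
        self.project = nn.Linear(n_inputs, channels * base ** 3)
        self.stages = nn.Sequential(
            ConvBlock(channels), nn.Upsample(scale_factor=2, mode="nearest"),
            ConvBlock(channels), nn.Upsample(scale_factor=2, mode="nearest"),
        )
        # Final conv maps features to the 3 vector-field components
        self.head = nn.Conv3d(channels, 3, kernel_size=3, padding=1)

    def forward(self, inputs: torch.Tensor) -> torch.Tensor:
        x = self.project(inputs)
        x = x.view(-1, self.channels, self.base, self.base, self.base)
        return self.head(self.stages(x))


if __name__ == "__main__":
    model = FieldGenerator()
    print(model(torch.zeros(2, 8)).shape)
```

With these assumed sizes, the two upsample layers take the 4x4x4 base grid to 16x16x16. Comparing such a sketch layer by layer against the code in the repo is one way to locate where the two architectures diverge.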
Best, WJA
The best help I can offer is to point you to the repo I posted in my comment above. There you will find the direct implementation in code, which is much more detailed than the description in the paper. Best of luck!
Ok, thank you for your help.
Hi Dr. Samuel Charreyron,
I hope this message finds you well. My name is JunAng Wang, and I am a postdoc at the School of Physics at Peking University. Our research group has recently focused on training generative networks to produce physical fields, such as magnetic fields and force fields. We have taken an interest in your paper titled "Modeling Electromagnetic Navigation Systems" and aim to replicate your results.
We have developed a convolutional neural network (CNN) and loss function identical to those detailed in your paper (please refer to the attachment network.png). Our training data is sourced from the magnetic data available in the ETH Research Collection (https://www.research-collection.ethz.ch/handle/20.500.11850/408738), and we have implemented a cosine decay learning rate schedule ranging from 1e-3 to 1e-7. However, our results are not as good as those in the paper: the root mean square error (RMSE) is stuck at 40 mT after 500 epochs, and we are observing signs of overfitting (please see output.png and loss.png in the attachments).
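For reference, a cosine decay schedule of the kind mentioned above can be sketched as follows. This is a minimal illustration, assuming a plain half-cosine anneal from 1e-3 to 1e-7 over the 500 epochs with no warm-up or restarts; the function name `cosine_decay_lr` and its parameters are hypothetical and not taken from the paper or the repo.

```python
import math


def cosine_decay_lr(epoch: int, total_epochs: int = 500,
                    lr_max: float = 1e-3, lr_min: float = 1e-7) -> float:
    """Learning rate at `epoch`, annealed from lr_max to lr_min along a half cosine."""
    # Fraction of training completed, clamped to [0, 1]
    progress = min(max(epoch / total_epochs, 0.0), 1.0)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for epoch in (0, 125, 250, 375, 500):
        print(f"epoch {epoch:3d}: lr = {cosine_decay_lr(epoch):.3e}")
```

In PyTorch, `torch.optim.lr_scheduler.CosineAnnealingLR` with `T_max` set to the number of epochs and `eta_min` set to the final rate follows the same curve when stepped once per epoch.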
I am therefore seeking your expertise on the following questions:
Have we potentially made any errors in our approach?
Is there a possibility of misunderstanding key aspects of your paper?
Did you use an advanced network, such as a generative adversarial network, to improve performance?
Could you offer insights on how to resolve these issues?
Thank you for any help you can offer, JunAng Wang