AndreWeiner / ml-cfd-lecture

Lecture material for machine learning applied to computational fluid mechanics
GNU General Public License v3.0

Exercise 7/8 #29

Closed AndreWeiner closed 1 year ago

AndreWeiner commented 1 year ago

Hi @JanisGeise, I finally managed to update the exercise notebook for the upcoming two exercises. I also updated the code in the lecture notebook, but I didn't have time to add explanations yet. Please let me know if the exercises work for you. Thanks! Andre

JanisGeise commented 1 year ago

Hi @AndreWeiner,

Exercise 7 worked without any problems (I used the provided data set instead of running a simulation). I will take a look at exercise 8 tomorrow.

Regards, Janis

JanisGeise commented 1 year ago

Hi @AndreWeiner,

Exercise 8 worked up to the point of running the single-phase simulation in OpenFOAM. After exporting the models and compiling the solver, I get the following error message from scalarPimpleFoam:


Traceback of TorchScript, original code (most recent call last):
/home/janis/.local/lib/python3.8/site-packages/torch/nn/modules/linear.py(114): forward
/home/janis/.local/lib/python3.8/site-packages/torch/nn/modules/module.py(1118): _slow_forward
/home/janis/.local/lib/python3.8/site-packages/torch/nn/modules/module.py(1130): _call_impl
/home/janis/.local/lib/python3.8/site-packages/torch/nn/modules/container.py(139): forward
/home/janis/.local/lib/python3.8/site-packages/torch/nn/modules/module.py(1118): _slow_forward
/home/janis/.local/lib/python3.8/site-packages/torch/nn/modules/module.py(1130): _call_impl
/home/janis/.../ml_in_cfd_ue7_8.py(84): forward
/home/janis/.local/lib/python3.8/site-packages/torch/nn/modules/module.py(1118): _slow_forward
/home/janis/.local/lib/python3.8/site-packages/torch/nn/modules/module.py(1130): _call_impl
/home/janis/.../ml_in_cfd_ue7_8.py(98): forward
/home/janis/.local/lib/python3.8/site-packages/torch/nn/modules/module.py(1118): _slow_forward
/home/janis/.local/lib/python3.8/site-packages/torch/nn/modules/module.py(1130): _call_impl
/home/janis/.local/lib/python3.8/site-packages/torch/jit/_trace.py(967): trace_module
/home/janis/.local/lib/python3.8/site-packages/torch/jit/_trace.py(750): trace
/home/janis/.../ml_in_cfd_ue7_8.py(365): <module>
RuntimeError: expected scalar type Float but found Double

The error message comes from the forward method of the rv_model. In the script, the trace is created at line /home/.../ml_in_cfd_ue7_8.py(365): <module> as:

rv_trace = pt.jit.trace(rv_model.eval(), example_inputs=pt.rand(2))

/home/janis/.../ml_in_cfd_ue7_8.py(98): forward is the forward method of the RVModel class,

/home/janis/.../ml_in_cfd_ue7_8.py(84): forward is the forward method of the RVModelNorm class.

All models are saved to ../notebooks/output/ as:

    tv_model = TVVelocity(best_tv_model_norm, tv_min[0], tv_max[0], tv_min[1], tv_max[1])
    rad_model = Rad(best_rad_model_norm, rad_min[0], rad_max[0], rad_min[1], rad_max[1])
    rv_model = RVModel(best_rv_model_norm, rv_min, rv_max, wt_min, wt_max)

    with pt.no_grad():
        rv_trace = pt.jit.trace(rv_model.eval(), example_inputs=pt.rand(2))

    rv_trace.save(join(save_path, "rv_model.pt"))
    tv_trace = pt.jit.trace(tv_model, example_inputs=pt.rand((3, 2)))
    tv_trace.save(join(save_path, "tv_model.pt"))
    rad_trace = pt.jit.trace(rad_model, example_inputs=pt.rand((3, 2)))
    rad_trace.save(join(save_path, "rad_model.pt"))

I'm using the code implemented in the lecture notebook. Switching to the virtual environment created in the first exercise leads to the same error message, so I assume this is unrelated to the version of PyTorch used. Changing

with pt.no_grad():
    rv_trace = pt.jit.trace(rv_model.eval(), example_inputs=pt.rand(2))

to

rv_trace = pt.jit.trace(rv_model, example_inputs=pt.rand(2))

as for the other models didn't have any effect.

Regards, Janis

AndreWeiner commented 1 year ago

Hi Janis, thanks for the feedback. I guess you copied the code snippets from the lecture notebook to a different script/notebook, correct? I assume the line

# set default dtype to double precision
pt.set_default_dtype(pt.float64)

might be missing. The PyTorch default is 32-bit (single precision), and if the models are traced with single precision, the C++ code will throw an error because it expects doubles from the model. If this change resolves the issue, I will add a comment to the exercise notebook. Best, Andre
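
For illustration, the mismatch and two possible fixes can be sketched as follows. This is a minimal sketch, not the lecture code: `make_model` is a hypothetical stand-in for the lecture models, and the double-precision input only mimics what the C++ solver is assumed to pass to the traced model.

```python
import torch as pt


def make_model():
    # hypothetical stand-in for the lecture models: 2 inputs, 1 output
    return pt.nn.Sequential(
        pt.nn.Linear(2, 8), pt.nn.Tanh(), pt.nn.Linear(8, 1)
    )


# PyTorch default: parameters are created in single precision
pt.set_default_dtype(pt.float32)
trace_f32 = pt.jit.trace(make_model().eval(), example_inputs=pt.rand(2))

# a double-precision input (as assumed for the C++ side) does not match
# the single-precision weights stored in the trace
x_double = pt.rand(2, dtype=pt.float64)
try:
    trace_f32(x_double)
    mismatch = False
except RuntimeError:
    mismatch = True

# option 1: convert an existing single-precision model before tracing
model_conv = make_model().double()
trace_conv = pt.jit.trace(
    model_conv.eval(), example_inputs=pt.rand(2, dtype=pt.float64)
)
out_conv = trace_conv(x_double)

# option 2: set the default dtype BEFORE the model is created, so that
# both the parameters and pt.rand(2) are double precision
pt.set_default_dtype(pt.float64)
trace_f64 = pt.jit.trace(make_model().eval(), example_inputs=pt.rand(2))
out_f64 = trace_f64(x_double)
```

Note that the default dtype only affects tensors created after the call, so the line has to come before the models are constructed or loaded.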

JanisGeise commented 1 year ago

Yes, I forgot to set the correct dtype. Now everything works without problems. Thanks for the fast reply.

Regards, Janis
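
To catch this kind of problem before exporting, one could check the precision of a model right before tracing. The helper below is only a sketch; `all_double` is a hypothetical name and not part of the lecture material, and the `Linear` layer stands in for the actual models.

```python
import torch as pt


def all_double(module: pt.nn.Module) -> bool:
    """Return True if all parameters and buffers are double precision."""
    tensors = list(module.parameters()) + list(module.buffers())
    return all(t.dtype == pt.float64 for t in tensors)


# stand-in layer, explicitly converted to single and double precision
layer_single = pt.nn.Linear(2, 1).float()
layer_double = pt.nn.Linear(2, 1).double()

single_ok = all_double(layer_single)
double_ok = all_double(layer_double)
```

Asserting `all_double(model)` before each `pt.jit.trace` call would make the script fail in Python instead of later in the solver.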