cholla-hydro / cholla

A GPU-based hydro code
https://github.com/cholla-hydro/cholla/wiki
MIT License

Thread crashes in PPMC simulations with cooling #312

Open helenarichie opened 1 year ago

helenarichie commented 1 year ago

The recent changes in the VL integrator (i.e. beyond this commit) are causing a large number of thread crashes in the wind tunnel simulations that I'm running. The crashes originate early in the simulation at the boundary of the cloud-wind shock and propagate outward until the simulation volume is filled with NaNs. After trying various simulation setups to see what reproduces the issue, I found that a hydro-only simulation (i.e. with build flags -DCUDA -DMPI_CHOLLA -DPRECISION=2 -DPPMC -DHLLC -DVL -DOUTPUT -DHDF5) works fine: it had ~15 thread crashes per timestep at the beginning, but those stopped after ~50 timesteps. However, when I turned on CIE cooling (which I use in the wind tunnel simulations), I was able to reproduce the large number of thread crashes that I was originally seeing. I also found that the hydro-only simulation runs with no thread crashes at all when I switch to PPMP.

helenarichie commented 1 year ago

I should also note that I tried using the simple integrator with a PPMC reconstruction, which also led to a large number of thread crashes.

I'm attaching a parameter file and the lines of code in the Clouds() initial conditions function and Wind_Boundary_kernel() boundary condition function that can be used to recreate this issue.

Clouds():
Line 1320: Real R_cl = 0.005;
Line 1335: cl_pos[nn][0] = 0.1 * H.xdglobal;
Line 1341: n_bg = 1e-2;
Line 1342: n_cl = 1;
Line 1345: vx_bg = 1000*TIME_UNIT/KPC;

Wind_Boundary_kernel():
Line 321: vx = 1000 * TIME_UNIT / KPC;

Build flags: -DCUDA -DMPI_CHOLLA -DPRECISION=2 -DPPMC -DHLLC -DVL -DCOOLING_GPU -DOUTPUT -DHDF5

cloud-wind.txt
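For readers unfamiliar with the `1000*TIME_UNIT/KPC` form used for `vx_bg` and `vx` above, here is a minimal sketch of the unit conversion it appears to perform. The constant values are assumptions chosen to match the names (one kyr in seconds, one kpc in kilometers) and are not copied from the Cholla source; `kms_to_kpc_per_kyr` is a hypothetical helper.

```cpp
#include <cassert>
#include <cmath>

// Assumed values, chosen to match the names; not copied from the Cholla source.
constexpr double TIME_UNIT = 3.15569e10;  // one kyr in seconds
constexpr double KPC       = 3.086e16;    // one kpc in kilometers

// Hypothetical helper: convert a speed from km/s to kpc/kyr,
// mirroring the form 1000 * TIME_UNIT / KPC used in the issue.
inline double kms_to_kpc_per_kyr(double v_kms) {
  assert(std::isfinite(v_kms));  // reject NaN or infinite input early
  return v_kms * TIME_UNIT / KPC;
}
```

Under these assumed constants, a 1000 km/s wind comes out at roughly 1e-3 kpc/kyr.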

evaneschneider commented 1 year ago

Did the previous version have thread crashes when cooling was turned off (for pure hydro)? Or equivalently, does the simulation run with PPMP have thread crashes? If not, it seems likely that the cause is in the updated PPMC methods, not in cooling, and that cooling is just exacerbating the problem that is leading to the thread crashes in the pure hydro case.

evaneschneider commented 1 year ago

Also, are you running with density and temperature floors?

evaneschneider commented 1 year ago

I am not sure where the bug is, but I have determined that an asymmetry appears in the Einfeldt strong rarefaction test (123.txt in the 1D examples directory, although I had to run it in 3D because of the other bug I identified) with the current version of PPMC that is NOT present in the earlier commit that Helena identified. This is running with plain hydro, i.e. build flags -DPPMC -DHLLC -DVL.
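One way to quantify an asymmetry like this is to compare each cell of a 1D strip with its mirror image about the midpoint. In the Einfeldt strong rarefaction setup the density and pressure should be even about the center and the velocity odd. The sketch below is illustrative only; `max_asymmetry` is a hypothetical helper, not a Cholla function.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: largest mismatch between a profile and its mirror image.
// parity = +1 for fields expected to be even (density, pressure),
// parity = -1 for fields expected to be odd (velocity along the strip).
inline double max_asymmetry(const std::vector<double>& f, double parity) {
  assert(parity == 1.0 || parity == -1.0);
  double worst = 0.0;
  const std::size_t n = f.size();
  for (std::size_t i = 0; i < n / 2; ++i) {
    // Compare cell i with its mirror cell n-1-i.
    worst = std::max(worst, std::abs(f[i] - parity * f[n - 1 - i]));
  }
  return worst;
}
```

A result at the level of round-off suggests the strip is symmetric; tracking this value per time step shows when an asymmetry first appears.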

helenarichie commented 1 year ago

> Did the previous version have thread crashes when cooling was turned off (for pure hydro)? Or equivalently, does the simulation run with PPMP have thread crashes? If not, it seems likely that the cause is in the updated PPMC methods, not in cooling, and that cooling is just exacerbating the problem that is leading to the thread crashes in the pure hydro case.

There were no thread crashes in the simulation runs with the previous version or the current version with PPMP.

helenarichie commented 1 year ago

> Also, are you running with density and temperature floors?

I don't think I tried a density floor at any point, but I did use a temperature floor, and that did not prevent thread crashes.
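To make the floors under discussion concrete, here is a rough sketch of applying a density floor and an internal-energy floor to one cell of conserved variables. It is not Cholla's implementation: the `Cell` layout and `apply_floors` are hypothetical, and the temperature floor is represented by a minimum specific internal energy `u_floor` in code units.

```cpp
#include <cassert>

// Hypothetical cell of conserved variables (not Cholla's data layout).
struct Cell {
  double density;     // mass density
  double energy;      // total energy density
  double mx, my, mz;  // momentum density components
};

// Hypothetical helper: enforce a density floor and a minimum specific
// internal energy. Returns true if the cell was modified.
inline bool apply_floors(Cell& c, double rho_floor, double u_floor) {
  assert(rho_floor > 0.0 && u_floor > 0.0);
  bool changed = false;

  // The negated comparison is also true when the density is NaN.
  if (!(c.density >= rho_floor)) {
    if (c.density > 0.0) {
      // Keep the velocity unchanged by rescaling the momentum.
      const double scale = rho_floor / c.density;
      c.mx *= scale; c.my *= scale; c.mz *= scale;
    } else {
      // No meaningful velocity for a non-positive density: set it to zero.
      c.mx = c.my = c.mz = 0.0;
    }
    c.density = rho_floor;
    changed = true;
  }

  // Raise the total energy if the internal part is below the floor.
  const double kinetic = 0.5 * (c.mx * c.mx + c.my * c.my + c.mz * c.mz) / c.density;
  const double e_int_min = c.density * u_floor;
  if (!(c.energy - kinetic >= e_int_min)) {
    c.energy = kinetic + e_int_min;
    changed = true;
  }
  return changed;
}
```

Floors like these hide the symptom rather than the cause, which is consistent with the temperature floor alone not preventing the crashes.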

bcaddy commented 1 year ago

I'm not sure I'm seeing the same asymmetry in the hydro Einfeldt strong rarefaction. Here's a per-time-step animation of a central strip, the parameter file I used, and the make.type.hydro file, all from a run on commit 51aba26c.

123.txt make.type.hydro.txt

(Ignore the magnetic fields. I ran this in hydro mode, so the B fields don't actually exist; I just didn't remove them from the plotting script.)

https://github.com/cholla-hydro/cholla/assets/41171425/c7831f31-cdfc-483b-8fe7-3e92ddc76fa5

evaneschneider commented 1 year ago

You're right, I forgot that I had modified the CFL number. Helena, does the issue start behind the cloud? Sometimes the cloud-wind setups can be quite sensitive to the initial vacuum created behind the cloud, so it could be that the slight modification Bob made to the reconstruction is triggering a problem that was getting smoothed over in the past. Are the thread crashes from a negative density?

helenarichie commented 1 year ago

The thread crashes have negative densities and energies. And it's hard to tell exactly where they're originating from, but here's a slice from after the first timestep. It sort of looks like they're coming from the sides of the cloud, not behind it.
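To pin down where the bad cells first appear, one option is to scan a snapshot for cells with non-positive or non-finite density or energy and record their indices. The sketch below assumes a flat array with the x index varying fastest; that layout, `BadCell`, and `find_bad_cells` are all assumptions for illustration, not Cholla code.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Grid indices of a cell that failed the check.
struct BadCell {
  std::size_t i, j, k;
};

// Hypothetical helper: list cells whose density or energy is non-positive,
// NaN, or infinite. Assumes flat arrays with index i + nx * (j + ny * k).
inline std::vector<BadCell> find_bad_cells(const std::vector<double>& density,
                                           const std::vector<double>& energy,
                                           std::size_t nx, std::size_t ny,
                                           std::size_t nz) {
  assert(density.size() == nx * ny * nz);
  assert(energy.size() == density.size());
  std::vector<BadCell> bad;
  for (std::size_t k = 0; k < nz; ++k) {
    for (std::size_t j = 0; j < ny; ++j) {
      for (std::size_t i = 0; i < nx; ++i) {
        const std::size_t id = i + nx * (j + ny * k);
        // NaN fails the > 0 comparison; isfinite additionally rejects +inf.
        const bool ok = density[id] > 0.0 && energy[id] > 0.0 &&
                        std::isfinite(density[id]) && std::isfinite(energy[id]);
        if (!ok) bad.push_back({i, j, k});
      }
    }
  }
  return bad;
}
```

Running this on the output of the first timestep and plotting the returned indices against the cloud position would show whether the crashes start at the sides of the cloud or behind it.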

[Image: 1_slice]

bcaddy commented 9 months ago

Resolved in #361

bcaddy commented 9 months ago

Partially resolved in #361. Still some thread crashes in PPMC

bcaddy commented 8 months ago

@helenarichie, has this been fixed with the changes in #361?