SWIFTSIM / SWIFT

Modern astrophysics and cosmology particle-based code. Mirror of gitlab developments at https://gitlab.cosma.dur.ac.uk/swift/swiftsim
http://www.swiftsim.com
GNU Lesser General Public License v3.0

Cooling issue running EAGLE 6-100 Mpc box examples #14

Closed FilipHusko closed 3 years ago

FilipHusko commented 3 years ago

Hi,

When I run any of the EAGLE cosmological examples (in the '/examples/EAGLE_ICs' folder) with EAGLE cooling, I get the following error: 'cooling/EAGLE/cooling.c:cooling_get_subgrid_temperature():722: This cooling model does not use subgrid quantities!'.

The EAGLE runs (e.g. 6 Mpc box) work fine up until time-step 376 (redshift 16.33). This is after the first few FoF searches, and after the first snapshot is written out. I tried the same EAGLE cooling configuration with the 'FeedbackEvent_3D' cooling example, as well as the 'IsolatedGalaxy_feedback' one. They both worked fine. I also tried the EAGLE cosmological box examples with Colibre cooling; those complete successfully.

As a note, I am using the standard modules and options listed here: https://gitlab.cosma.dur.ac.uk/swift/swiftsim/-/wikis/COSMA-build. I'm running the examples on cosma7 nodes.

JBorrow commented 3 years ago

Hi Filip, you need to run with the EAGLE-XL model (./configure ...... --with-subgrid=EAGLE-XL). We should indeed clear this up, though. You will also need the COLIBRE cooling tables to run the current model.

FilipHusko commented 3 years ago

Hi Josh, thanks for the reply. Running with the options you mentioned does indeed work. However, I'm trying to run the older model, using --with-subgrid=EAGLE and the EAGLE cooling tables (which I download using the scripts in the provided examples). That's when I get the error. Is the older model not runnable with the current version of SWIFT?

JBorrow commented 3 years ago

Is there a reason you are trying to run that older model?

You may be able to revert to the older behaviour by setting this parameter to 0:

https://github.com/SWIFTSIM/swiftsim/blob/master/examples/parameter_example.yml#L587

FilipHusko commented 3 years ago

I'm primarily trying to use it to reduce computational time, since I'm under the impression it might be a few times quicker. Is this not the case?

The 'use_subgrid_Bondi' parameter is already set to 0 in the parameter file. I'm going to try setting the 'SF_threshold' parameter to something other than 'Subgrid'.

JBorrow commented 3 years ago

No, there is no reason to believe that the older cooling tables would make the simulations faster. You should use the new cooling tables, as we will not use the old ones ever again in a production run. The sub-grid properties are also key in the physics.

FilipHusko commented 3 years ago

Oh, okay then. This is then a separate issue, but do you have an idea why an EAGLE 25 box is taking 80-ish hours to run on 2-4 cosma7 nodes?

I'm using the following configure options:

./configure --with-subgrid=EAGLE-XL --with-hydro=sphenix --with-kernel=wendland-C2 --enable-task-debugging --with-tbbmalloc --with-parmetis --enable-ipo

and the following run command (e.g. for 4 tasks, with N=2 cores and --tasks-per-node=2):

mpirun -np 4 /.../swift_mpi --cosmology --eagle --threads=14 --pin eagle_25.yml

I'm using the following modules:

module load intel_comp/2020-update2 intel_mpi/2020-update2 ucx/1.8.1 parmetis/4.0.3-64bit parallel_hdf5/1.10.6 fftw/3.3.8cosma7 gsl/2.5 llvm/10.0.1

This is all in line with the recommended options as far as I have been able to find them. I have also tried setting the following, without any effect:

FI_OFI_RXM_RX_SIZE=4096 FI_OFI_RXM_TX_SIZE=4096 FI_UNIVERSE_SIZE=2048
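For reference, the layout implied by the run command above can be worked out with a little shell arithmetic. This is only a sketch of the numbers quoted in this comment (4 MPI tasks, 14 threads each, 2 tasks per node); the variable names are illustrative and are not SWIFT or scheduler options.

```shell
#!/bin/sh
# Values quoted in the comment above; the variable names are illustrative only.
ranks=4              # mpirun -np 4
threads_per_rank=14  # --threads=14
tasks_per_node=2     # --tasks-per-node=2

# Derived layout.
nodes=$((ranks / tasks_per_node))
threads_per_node=$((tasks_per_node * threads_per_rank))
total_threads=$((ranks * threads_per_rank))

echo "nodes=$nodes threads_per_node=$threads_per_node total_threads=$total_threads"
```

With these inputs the script reports 2 nodes, 28 threads per node and 56 threads in total.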

MatthieuSchaller commented 3 years ago

If you are using a star formation model that makes use of subgrid quantities then you need to use the new Ploeckinger+20 tables (COLIBRE cooling, EAGLE-XL subgrid model).

If you use a SF model based on the density and metallicity (as in the older models) then you can use either the Wiersma+09 (EAGLE cooling) or Ploeckinger+20 (EAGLE-XL cooling) model.

The COLIBRE cooling is marginally slower than the EAGLE one but that will lead to <1% difference in the overall runtime of a simulation.
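As a rough illustration, the compatibility rule described above can be written as a small shell check. The function name `check_cooling_sf` is hypothetical and not part of SWIFT; the only names taken from this thread are the 'Subgrid' value of the 'SF_threshold' parameter and the EAGLE and COLIBRE cooling models. Any threshold other than 'Subgrid' is assumed here to be a density- and metallicity-based one.

```shell
#!/bin/sh
# Hypothetical helper (not part of SWIFT) encoding the rule from this thread:
# a star formation threshold that uses subgrid quantities needs the
# Ploeckinger+20 (COLIBRE) cooling; any other threshold works with either model.
#   $1 = cooling model: "EAGLE" (Wiersma+09) or "COLIBRE" (Ploeckinger+20)
#   $2 = SF threshold: "Subgrid", or anything else for a density/metallicity one
check_cooling_sf() {
    if [ "$2" = "Subgrid" ] && [ "$1" != "COLIBRE" ]; then
        echo "incompatible"
        return 1
    fi
    echo "ok"
}

# Example: the combination that works according to the thread.
check_cooling_sf COLIBRE Subgrid
```

Calling `check_cooling_sf EAGLE Subgrid` reports the combination that triggered the error at the top of this issue.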

MatthieuSchaller commented 3 years ago

Is that the 25Mpc box starting from the ICs? Then 80 hours is great. :)

FilipHusko commented 3 years ago

Yes, that's the one. Great to hear that that's the usual runtime! I was looking at some older discussions online where the same box with the full model apparently took 20 or so hours (https://gitlab.cosma.dur.ac.uk/swift/swiftsim/-/issues/584). That example used 8 cosma7 nodes, but I seem to find that using more than 2 leads to no speedup of the run. I was also somewhat confused by my run having ~10^6 timesteps, which seemed excessive.

MatthieuSchaller commented 3 years ago

~800k time-steps is reasonable for a not very well calibrated EAGLE-like model in a 25Mpc box.

Also, running on more than 1 node is a bit of a waste of resources since you will then be using <5% of the memory.

FilipHusko commented 3 years ago

Alright, that clears things up. Thanks for the help!