svenreiche / Genesis-1.3-Version4

Time-dependent, 3D Code to simulate the amplification process of a Free-electron Laser.
GNU General Public License v3.0

[BUG] with wakes: crash in '&track' command following harmonic upconversion #120

Closed ZeugAusHH closed 9 months ago

ZeugAusHH commented 10 months ago

Verified with current status of 'dev', that is git commit ID 9b788bb316f1f62a9a234e28d954fcbd137332dc (2023-10-09, 09:25:13).

The cause is that the arrays in class Collective are allocated for the number of slices at the fundamental. Harmonic upconversion increases the number of beam slices, but the data structures keep the size for the original number of slices. If there is no additional &wake block between &alter_setup and &track, the result is a crash in Collective::update during the first steps of &track. The symptoms varied: sometimes heap corruption, sometimes memory corruption (in the core dumps), sometimes just crashes deep in the MPI libraries, all likely caused by the MPI_Allgather operation writing beyond the end of the undersized arrays.
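To illustrate the mismatch described above, here is a minimal sketch without MPI. The names `WakeBuffers`, `gathered_count` and `gather_fits` are hypothetical and not taken from the Genesis sources, and the factor of 3 in the usage note is only an assumed example of a harmonic upconversion. The sketch only shows how a buffer allocated for the old slice count becomes too small for an all-gather over the new slice count.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the per-slice arrays held by class Collective.
struct WakeBuffers {
    std::vector<double> current;  // one entry per slice, summed over all ranks

    // Allocate for the slice count that is valid when the wake is defined.
    void allocate(std::size_t slices_per_rank, std::size_t mpi_size) {
        assert(mpi_size > 0);
        current.assign(slices_per_rank * mpi_size, 0.0);
    }
};

// Number of entries an all-gather writes into the receive buffer when each
// of mpi_size ranks contributes slices_per_rank values.
std::size_t gathered_count(std::size_t slices_per_rank, std::size_t mpi_size) {
    return slices_per_rank * mpi_size;
}

// True if the gather stays inside the buffer, false if it would overrun it.
bool gather_fits(const WakeBuffers& buf,
                 std::size_t slices_per_rank,
                 std::size_t mpi_size) {
    return gathered_count(slices_per_rank, mpi_size) <= buf.current.size();
}
```

With a buffer allocated via `allocate(4, 32)`, `gather_fits(buf, 4, 32)` holds, while `gather_fits(buf, 12, 32)` (three times as many slices per rank) does not. A check of this kind before the gather turns silent memory corruption into a detectable error.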

Please see the attached minimum working example demonstrating the issue (I ran it with mpisize=32). In my branch https://github.com/ZeugAusHH/Genesis-1.3-Version4/tree/cl_20231107__wakememissue (please do NOT merge for the time being), I added some diagnostics code that calls abort() before the MPI_Allgather operation would write beyond the end of the data array. After a verification simulation I would create a pull request. Update (2023-11-07T1600): This code is now available and ready for merging in the new pull request https://github.com/svenreiche/Genesis-1.3-Version4/pull/121 .

My proposal would be the following:

Please note that I have not tested other operations that modify the slice count.

ZeugAusHH commented 10 months ago

Here is the minimum working example:

20231107__wakesissue_mwe.tar.gz

svenreiche commented 9 months ago

If wakes are defined and the electron distribution is resampled, the code now issues a warning and discards the current wake definition. Since the resampling is done with an alter_setup command, the wake command can be executed right after it.
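The behaviour described above can be sketched roughly as follows. This is not the actual Genesis code: `WakeDefinition`, `define` and `discard_on_resample` are hypothetical names, and the warning text is invented for illustration. The idea is that a change in slice count invalidates the stored wake, so stale arrays can never be used by a later tracking step.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <ostream>
#include <vector>

// Hypothetical holder for a wake definition that depends on the slice count.
struct WakeDefinition {
    std::vector<double> wake;  // sized for the slice count at definition time
    bool defined = false;

    // Corresponds to executing a wake command for the current slice count.
    void define(std::size_t total_slices) {
        assert(total_slices > 0);
        wake.assign(total_slices, 0.0);
        defined = true;
    }

    // To be called whenever the beam is resampled. Returns true if an
    // existing definition was discarded, false if there was none.
    bool discard_on_resample(std::ostream& log) {
        if (!defined) {
            return false;
        }
        log << "Warning: beam was resampled, discarding wake definition\n";
        wake.clear();
        defined = false;
        return true;
    }
};
```

After `discard_on_resample` returns true, the caller has to call `define` again with the new slice count, which mirrors issuing the wake command right after alter_setup.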