FluidityProject / fluidity

Segfault in square-convection-parallel-trivial #317

Closed stephankramer closed 3 years ago

stephankramer commented 3 years ago

This test segfaults in the last adapt when trying to pack data for detector migration in Zoltan. Backtrace:

#0  0x000055b2ee6ed510 in detector_tools::pack_detector (detector=0x55b2f0ac7460, buff=..., ndims=<optimized out>, 
    nstages=<error reading variable: Cannot access memory at address 0x0>, attribute_size=...) at Detector_Tools.F90:452
#1  0x000055b2eed4a365 in zoltan_callbacks::zoltan_cb_pack_fields (data=..., num_gid_entries=<optimized out>, num_lid_entries=<optimized out>, num_ids=<optimized out>, 
    global_ids=..., local_ids=..., dest=..., sizes=..., idx=..., buf=..., ierr=0) at Zoltan_callbacks.F90:1101

which is

buff(ndims+det_params+1:ndims+det_params+attribute_size(1)) = detector%attributes

Looks like the lhs is fine - it's actually detector%attributes that is unallocated (print *, allocated(detector%attributes) prints F). This is also flagged by valgrind, first at line 56 of Zoltan_detectors.F90:

   ! loop through all registered detector lists
    call get_registered_detector_lists(detector_list_array)
    do det_list = 1, size(detector_list_array)

       ! search through all the local detectors in this list
       detector => detector_list_array(det_list)%ptr%first
       !Set up particle attribute parameters
       total_attributes=0
       if (associated(detector)) then
          if (size(detector%attributes)>=1) then

So the detector appears associated, but asking for the size of detector%attributes is flagged by valgrind as using uninitialised memory.
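To illustrate the kind of guard that would avoid querying an unallocated array, here is a minimal sketch. It assumes %attributes is an allocatable component (as the allocated() check above suggests); toy_detector and attribute_count are hypothetical names for illustration, not part of Fluidity.

```fortran
module detector_guard_sketch
  implicit none

  ! Hypothetical stand-in for the real detector type; only the
  ! component relevant to this issue is modelled.
  type :: toy_detector
     real, allocatable :: attributes(:)
  end type toy_detector

contains

  ! Number of attributes, treating an unallocated array as empty.
  ! size() is only queried once allocated() has confirmed it is safe.
  pure function attribute_count(det) result(n)
    type(toy_detector), intent(in) :: det
    integer :: n

    if (allocated(det%attributes)) then
       n = size(det%attributes)
    else
       n = 0
    end if
  end function attribute_count

end module detector_guard_sketch
```

A guard like this only hides the symptom at the call site; whether that is preferable to making sure the array is always allocated is a design choice.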

stephankramer commented 3 years ago

@angus-g : after discussing this with @drhodrid this morning and having another look, I think the issue is that if a detector (so not a particle) is moved between processes (due to parallel mesh adaptivity), its %attributes array doesn't always get allocated. This array is allocated to zero size in create_single_detector, but at, for instance, lines 2049-2050 of Zoltan_integration.F90 a detector is created without its %attributes being allocated, either in the allocate or in the unpack_detector call. Then, when this detector needs to move again, zoltan_cb_pack_fields expects every detector (and particle) to have an allocated %attributes, because on line 1084 of Zoltan_callbacks.F90 it checks the size of the array, which of course is undefined if it is not allocated.
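As a rough sketch of the invariant described here (every detector leaves creation or unpacking with %attributes allocated, possibly to zero size), something along these lines might work. All names (toy_detector, ensure_attributes, toy_unpack) and the buffer layout are assumptions for illustration; the real unpack_detector has a different interface.

```fortran
module detector_alloc_sketch
  implicit none

  ! Hypothetical stand-in for the real detector type.
  type :: toy_detector
     real, allocatable :: position(:)
     real, allocatable :: attributes(:)
  end type toy_detector

contains

  ! Give a detector a zero-size attributes array if it has none yet,
  ! mirroring what create_single_detector is described as doing.
  subroutine ensure_attributes(det)
    type(toy_detector), intent(inout) :: det
    if (.not. allocated(det%attributes)) allocate(det%attributes(0))
  end subroutine ensure_attributes

  ! Toy unpack: buff holds ndims position values followed by any
  ! attribute values. Assumes size(buff) >= ndims.
  subroutine toy_unpack(det, buff, ndims)
    type(toy_detector), intent(inout) :: det
    real, intent(in) :: buff(:)
    integer, intent(in) :: ndims
    integer :: nattr

    nattr = size(buff) - ndims
    if (allocated(det%position)) deallocate(det%position)
    if (allocated(det%attributes)) deallocate(det%attributes)
    ! Always allocate, even when nattr == 0, so that later calls to
    ! size(det%attributes) are well defined.
    allocate(det%position(ndims), det%attributes(nattr))
    det%position = buff(1:ndims)
    det%attributes = buff(ndims+1:ndims+nattr)
  end subroutine toy_unpack

end module detector_alloc_sketch
```

With this approach the pack side can keep calling size(detector%attributes) unchanged, since a plain detector simply carries an empty array.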

I'm not entirely sure where the most logical place to put the allocate is - since you're more familiar with the code, would you mind having a look? To reproduce, you can run the test in the fluidity/actions:bionic-9c94ec79a1e4dcc68f1c82b481b36571f9f11d1d docker container.