Open jodavies opened 1 month ago
I added some print statements to `PF_UnpackRedefinedPreVars` to try to work out what happens. There are many successful redefines, and then we have:
0 i = 0
0 trying to redefine ik1 (35) to 1
0 loop j = 0; j < 2
0 AC.pfirstnum[0] (ik1c) (37), index 35
0 AC.pfirstnum[1] (adj) (38), index 35
So it fails to find the variable it is trying to redefine in `pfirstnum`. It then evaluates `if ( AC.inputnumbers[j] < inputnumber ) {` for `j=2`, causing the `Conditional jump or move depends on uninitialised value(s)` error.
So, why is `ik1` no longer in the `AC.pfirstnum` array? Earlier in the program it was there (and had index 35).
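The failure mode described above (a search loop that runs off the end, followed by a read at the one-past-the-end index) can be illustrated with a minimal sketch. This is not FORM's actual code: `find_index` and `found_and_below` are hypothetical helpers, and the arrays only stand in for `AC.pfirstnum` and `AC.inputnumbers`. The point is that the companion array is read only when the search succeeded.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not FORM's code: linear search of the first n
 * entries of table for key. Returns the position, or -1 if absent. */
int find_index(const int *table, int n, int key)
{
    assert(table != NULL || n == 0);
    for (int j = 0; j < n; j++) {
        if (table[j] == key) return j;
    }
    return -1;
}

/* Returns 1 if key was found and its companion value is below limit,
 * 0 otherwise. The companion array is read only after a successful
 * search, so index n (one past the end) is never touched. */
int found_and_below(const int *table, const int *companion, int n,
                    int key, int limit)
{
    int j = find_index(table, n, key);
    if (j < 0) return 0;   /* not found: report it instead of reading [n] */
    return companion[j] < limit;
}
```

With a two-entry table holding 37 and 38, as in the output above, a lookup of 35 returns "not found" rather than falling through to a comparison at `j=2`. This only avoids the uninitialised read; it does not explain why `ik1` is missing from the array.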
Thanks for the investigation. I will look into the ParFORM issue. (You know, in programming, the person you were a month ago is a stranger. Then, the person you were more than 10 years ago is...)
By the way, maybe this (the code that gives the Valgrind error) should be broken up into small unit tests.
It seems the problem with valgrind and `tvorm -w2` is due to the load balancing. The same issue happens with `-w3`. If I run this test under callgrind, we see that `ThreadsProcessor` makes hundreds of millions of calls to `LoadReadjusted` (which steals terms from the working thread and distributes them among the idle threads), and these calls also involve locks. With `-w4`, there are only ~400 calls to `LoadReadjusted`.
Edit: it seems to be some kind of race condition though: if I add a `MesPrint` in `LoadReadjusted` it prints only ~400 times, even when running under valgrind.
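Since adding a print changes the observed call count, a lighter way to count calls might be an atomic counter that is read once at the end of the run. The sketch below is hypothetical instrumentation, not part of FORM: `count_call` and `calls_seen` are made-up names, and it assumes a C11 compiler with `<stdatomic.h>`. It still perturbs timing slightly, so it may not reproduce the callgrind numbers either.

```c
#include <stdatomic.h>

/* Hypothetical instrumentation, not part of FORM. A file-scope atomic
 * starts at zero and can be incremented from any thread without a lock. */
atomic_ulong readjust_calls;

/* Call at the top of the function being counted. */
void count_call(void)
{
    atomic_fetch_add_explicit(&readjust_calls, 1, memory_order_relaxed);
}

/* Read the total, e.g. once just before the program exits. */
unsigned long calls_seen(void)
{
    return atomic_load_explicit(&readjust_calls, memory_order_relaxed);
}
```

Relaxed ordering is enough here because only the final total matters, not the order of increments relative to other memory operations.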
The easiest solution is to just disable valgrind for this test...
Once #525 is merged we can rebase this on top and the parform tests will run successfully also.
This test seems trickier than expected. Currently:
- `mpirun -np {1,2} parform`: OK
- `mpirun -np {3,4} parform`: hang?
- `mpirun -np {5,6} parform`: crash
- `valgrind vorm`: OK
- `valgrind tvorm -w2`: mostly hangs (takes 100-200s), sometimes finishes in 2s
- `valgrind tvorm -w4`: OK, finishes in 2s

The CI sees valgrind errors in `vorm`, `tvorm` that I can't reproduce locally. Edit: I can reproduce them on ubuntu 20.04 (as the runners are using) but not in 22.04.