daschaich / susy

Codes for supersymmetric lattice gauge theories
GNU General Public License v3.0
15 stars 6 forks source link

Bug report: Three Segmentation Fault Bugs #15

Closed westwind2013 closed 6 years ago

westwind2013 commented 6 years ago

Hi there!

In my recent use of this program, I found three segmentation fault bugs due to the inappropriate use of memory allocation routine malloc(). Below I list a verified fix for each:

  1. Line 131 in susy/4d_Q16/susy/update_o.c _TwistFermion *src = malloc(Nroot sizeof(**src)); should be changed to _Twist_Fermion *src = malloc(Nroot sizeof(TwistFermion *));

  2. Line 13 in susy/4d_Q16/susy/update_o.c _TwistFermion *psim = malloc(Nroot sizeof(psim)); should be changed to _Twist_Fermion psim = malloc(Nroot sizeof(Twist_Fermion ));_

  3. Line 44 in susy/4d_Q16/susy/congrad_multi.c _TwistFermion *pm = malloc(Norder sizeof(**pm)); should be changed to _Twist_Fermion *pm = malloc(Norder sizeof(TwistFermion *));

Thank you!

westwind2013 commented 6 years ago

A further checking indicates other similar bugs relating to the use of malloc.

daschaich commented 6 years ago

Thanks very much for the report! I may not have a chance to address it until I get back from a conference on 15 January.

In the meantime, I'm curious what compiler you're using, as obviously this isn't an issue we have encountered with any of those we use (mainly various versions of gcc, with either mpich2 and mvapich MPI implementations).

In a standards-compliant compiler I would expect the two possibilities you write in each case to be handled identically. The style currently in the code (dereferencing the pointer rather than repeating the type) is typically advocated in order to improve the maintainability of the code. See here for one example, which also suggests moving the sizeof to the front. That may be worth testing, especially considering the apparent integer arithmetic issue suggested by your other report. This should look something like Twist_Fermion **src = malloc(sizeof **src * Norder); and so on.

westwind2013 commented 6 years ago

Thank you for your reply. Moving the sizeof to the front is good. But I am worried that the coding style as below

_TwistFermion src = malloc(sizeof src * Norder);

still has a bug in it. As src is a pointer to data type _"TwistFermion *". When we do the malloc(), we hope to malloc a memory space of size "sizeof (src) Norder" instead of "sizeof (*src) Norder" --- the former indicates Norder elements of data type _"TwistFermion *" while the latter indicates Norder elements of data type _"TwistFermion".

Btw, I am actually working on a project that aims to provide an automated testing tool for MPI programs. I use this program as a target to do some analysis and happen to find these bugs. I don't know well enough the program actually.

daschaich commented 6 years ago

_Norder elements of data type "TwistFermion *" is exactly what we want for the RHMC algorithm, which involves Norder source vectors, each of which is malloc'd with sites_on_node elements a few lines further down. In the end src is an array of Norder pointers to arrays of sites_on_node Twist_Fermions, with the individual structs accessed via src[N][i] for the components of source vector N associated with lattice site i.

Anyway, it sounds like moving the sizeof to the front resolves the problem you encountered, so I quickly did that in commit e2d120b and will close this issue as well. Feel free to reopen if problems remain.

westwind2013 commented 6 years ago

If we want to add "Twist_Fermion *", should we use

Twist_Fermion *src = malloc(sizeof src * Norder);

instead of

Twist_Fermion src = malloc(sizeof src * Norder);

?

daschaich commented 6 years ago

Okay, I'm finding the dereferencing hard to parse, so I scrapped it for the sake of readability. Commit 53adcdc essentially implements your original suggestion. Valgrind seems happy, at least in serial.

westwind2013 commented 6 years ago

That's great. To be honest, I don't know much about physics, and thus just look the program from the perspective of the programmer.

Thank you for your work:)