matthewwolak / nadiv

R package that constructs (non)additive genetic relationship matrices, and their inverses, from a pedigree to be used in linear mixed effect models (A.K.A. the 'animal model').
17 stars 7 forks source link

minv use of =- #32

Closed matthewberryman closed 3 years ago

matthewberryman commented 3 years ago

Clang throws a warning here: https://github.com/matthewwolak/nadiv/blob/71a843ae5f61bf3f420c500387fe69be9645e144/src/minv.cc#L276 From a read through the papers describing the algorithms (specifically https://www.genetics.org/content/genetics/early/2008/07/27/genetics.108.088070.full.pdf p24) it seems to me it is a correct warning, i.e. should be -= in order to iterate through j properly.

In looking for that I also came across this other =-: https://github.com/matthewwolak/nadiv/blob/71a843ae5f61bf3f420c500387fe69be9645e144/src/minv.cc#L226 which is meant to be, though again spacing would help, and though I haven't yet delved into that part of the maths, just noting that the code (-I) differs from the comment (0).

@lmmazur @esscargot

matthewwolak commented 3 years ago

Thanks @matthewberryman! Sorry for the delay - you are right the spacing is making a mess of everything and the comments are not as precise as they could have been.

Everything was working correctly despite what you have noted, but more out of luck than anything. I've resolved this issue by adding spaces, adding more detailed/accurate comments, and standardizing these lines across the several different files that implement this algorithm (e.g., for the M^-1, the "regular" A^-1, the sex-linked S^-1, etc.). There are several changes that fix these issues, but probably the most concise/pertinent are commits: c4d49d718d9484930a800a03561d84861011fae5 and 1e3324b3e1760f7150015ba390f996d407ca4e65, though with a bunch of boring spaces added.

In the first instance you noted above, interpreting as j = -1 is what is needed for correctly assigning j's value as essentially "empty". Same thing with the second instance for AN[k]. Basically, all individuals are represented by positive integers, with 0 as the lowest identity number. The algorithm moves through AN to determine if entries placed in there represent individuals from a more recent generation than the pedigree base. So, by assigning the "empty" values as negative numbers the logical:

https://github.com/matthewwolak/nadiv/blob/71a843ae5f61bf3f420c500387fe69be9645e144/src/minv.cc#L279

will begin by assigning to j any real identity (0 or greater) and continue to replace this as more recently born individuals are found in AN.

Thanks for noting this issue and "making" me improve the code!

Matthew