Closed. janxkoci closed this issue 6 months ago.
I found the problem and fixed it. I also merged the devlp branch into the master branch, so it's fixed in both places.
The old code was reading outside the bounds of an array, and the resulting garbage values triggered the assertion failure. Thanks for bringing this to my attention.
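For readers unfamiliar with this class of bug, here is a minimal sketch of how a flattened square matrix can be read safely. It is not the actual resid.c code: `mat_get` and `count_far_from_zero` are hypothetical helpers, and the assumption that the matrix is n x n and stored row-major is inferred only from the `mat[i * nDataFiles + j]` expression in the assertion message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not taken from resid.c: read entry (i, j) of an
 * n x n matrix stored row-major in a flat array of n*n doubles. */
static double mat_get(const double *mat, size_t n, size_t i, size_t j) {
    /* Both indices must be below n. Otherwise i*n + j can land on a
     * different row or past the end of the array, and the value read
     * there is garbage. */
    assert(i < n);
    assert(j < n);
    return mat[i * n + j];
}

/* Hypothetical helper: count entries whose magnitude exceeds tol.
 * The loop bounds keep every access inside the n*n allocation. */
static size_t count_far_from_zero(const double *mat, size_t n, double tol) {
    size_t count = 0;
    for (size_t i = 0; i < n; ++i) {
        for (size_t j = 0; j < n; ++j) {
            double x = mat_get(mat, n, i, j);
            if (x > tol || x < -tol)
                ++count;
        }
    }
    return count;
}
```

Because an out-of-bounds read returns whatever happens to be in memory, a near-zero check on that value can pass for most inputs and fail for a few, which would be consistent with only one pair of files triggering the crash.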
On Tue, May 21, 2024 at 2:17 AM Jeňa Kočí @.***> wrote:
Hi Alan,
I'm running resid on simulated data (from msprime), to get relative site pattern frequencies of simulations under our models for comparison with observed data, but I'm getting the following error for two of the opf files:
$ resid Cs_data_opf.txt Cs_4_data_opf.txt > /dev/null # get only stderr
resid: resid.c:654: main: Assertion `Dbl_near(mat[i * nDataFiles + j], 0.0)' failed.
Aborted (core dumped)
Removing either file lets resid finish fine, and each file also works on its own. resid crashes only when the two are used together, whether with just those two files or within a larger batch of all simulated bootstraps for a given model.
I cannot spot anything obviously wrong with the files.
One of the files is simulated using values of point estimates from legofit, while the other file is the 4th simulation replicate, where I sampled values from confidence intervals returned by legofit (to emulate bootstrap replicates). All other such replicates are fine, as well as other models and previous runs of the same pipeline.
Cs_data_opf.txt https://github.com/alanrogers/legofit/files/15387340/Cs_data_opf.txt (simulated using values of point estimates)
Cs_4_data_opf.txt https://github.com/alanrogers/legofit/files/15387338/Cs_4_data_opf.txt (simulated using values sampled from confidence intervals, 4th replicate)
My resid is version 2.3.21-16-gc04221f6 - it may not be the latest, but I use this version for the entire project, and it had no issues until now. Also, the OS on this particular cluster was recently updated to Debian 12 - I don't know if this is relevant, especially since my other runs of resid are fine.
Can you please help me figure out where the problem is?
In the meantime, is it okay to produce the relative frequencies separately for these files and then combine them into one file for downstream analysis, or do the frequencies change based on which files are included? (From a brief eyeballing of the outputs this does not seem to be the case, so my hunch is I can combine them.)
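The hunch in the last paragraph can be illustrated with a small sketch. It assumes that a relative frequency is a site pattern's count divided by the sum of counts in the same file; that normalisation is an assumption made for illustration, not a confirmed description of what resid does, and `rel_freq` is a hypothetical helper.

```c
#include <stddef.h>

/* Hypothetical helper: convert the site pattern counts of ONE file into
 * relative frequencies that sum to 1. Only this file's counts are used,
 * so the result cannot depend on which other files are processed.
 * Returns 0 on success, -1 if the total count is not positive. */
static int rel_freq(const double *count, double *freq, size_t npat) {
    double total = 0.0;
    for (size_t k = 0; k < npat; ++k)
        total += count[k];
    if (!(total > 0.0))
        return -1;
    for (size_t k = 0; k < npat; ++k)
        freq[k] = count[k] / total;
    return 0;
}
```

Under that assumption, running each file separately and concatenating the results would give the same numbers as a joint run.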
Reply to this email directly or view it on GitHub: https://github.com/alanrogers/legofit/issues/17
Thanks Alan, I'll test it later today and will let you know how it works.
OK it works well now, thanks again! ☺️
I just made a few tweaks to the devlp branch, merged that into master, and bumped the version number to 2.3.22. I forgot to mention yesterday that the Makefile now specifies the clang compiler rather than gcc.
On Wed, May 22, 2024 at 12:42 AM Jeňa Kočí @.***> wrote:
OK it works well now, thanks again! ☺️
Oh yeah, I noticed I had to run module add clang; I thought it was a Debian 12 thing.