Closed lemieuxl closed 2 years ago
Thank you for pointing out the issue. We have fixed it and pushed the changes to the dev branch, and we will release a new version soon. Please let us know if there are any further issues.
Version 1.4.4 has been released.
Awesome software!
We found an issue when we simultaneously include a list of variants (`--include-snp-file`) and filter on MAF (`--maf 0.01`, for example); genotypes are in BGEN format. When reading the BGEN file, as soon as a variant gets filtered out because of MAF, no more variants are processed. Hence, if the fourth variant (from a list of 300k) has a MAF lower than the threshold, there are only three variants in the results file.
Once a variant has been filtered out on MAF, the following condition will always be `true` because of the second part (i.e. `keepVariants[keepIndex] + 1 != snploop`): https://github.com/large-scale-gxe-methods/GEM/blob/fc773b60f7bf60eee4d15ae8959e08d641392555/src/ReadBGEN.cpp#L927
We think that the fix could be to increment the `keepIndex` counter in the following block: https://github.com/large-scale-gxe-methods/GEM/blob/fc773b60f7bf60eee4d15ae8959e08d641392555/src/ReadBGEN.cpp#L1112
I could do a pull request if you want, but I'm unsure if other counters should also be incremented (e.g. `stream_i`).
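To make the behaviour we describe concrete, here is a minimal, self-contained sketch. It is not the code from `ReadBGEN.cpp`; it is a simplified model that assumes the include list is a sorted vector of 0-based variant indices and that the include-list check compares `keepVariants[keepIndex] + 1` against a 1-based `snploop`. The names `processVariants`, `maf`, `mafThreshold` and `advanceOnMafFilter` are hypothetical and exist only for illustration. Whether `stream_i` or other counters also need updating is not modelled here.

```cpp
#include <cstddef>
#include <vector>

// Simplified, hypothetical model of the variant loop (NOT the GEM source).
// Returns the 0-based indices of the variants that end up being processed.
std::vector<std::size_t> processVariants(
    const std::vector<std::size_t>& keepVariants,  // sorted 0-based indices to include
    const std::vector<double>& maf,                // MAF of every variant in the file
    double mafThreshold,
    bool advanceOnMafFilter)                       // true = proposed fix applied
{
    std::vector<std::size_t> processed;
    std::size_t keepIndex = 0;

    for (std::size_t snploop = 1; snploop <= maf.size(); ++snploop) {
        if (keepIndex >= keepVariants.size()) {
            break;  // every variant on the include list has been handled
        }

        // Include-list check: skip anything that is not the next expected variant.
        if (keepVariants[keepIndex] + 1 != snploop) {
            continue;
        }

        // MAF filter: without advancing keepIndex here, the check above keeps
        // pointing at the filtered variant and rejects every later variant.
        if (maf[snploop - 1] < mafThreshold) {
            if (advanceOnMafFilter) {
                ++keepIndex;  // proposed fix: move on to the next included variant
            }
            continue;
        }

        processed.push_back(snploop - 1);
        ++keepIndex;
    }
    return processed;
}
```

In this model, with six included variants of which the fourth falls below the threshold, calling `processVariants` with `advanceOnMafFilter = false` returns only the first three variants, which matches what we see in the results file. With `advanceOnMafFilter = true`, the fourth variant is dropped and the remaining ones are still processed.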