Closed apoliakov closed 2 years ago
Sorry for the late reply! We have just released a new version, 1.0.0. It brings substantial computational efficiency improvements to both Step 1 and Step 2, for single-variant and set-based tests, as well as clearer log output. We have created a new GitHub page for the program at https://github.com/saigegit/SAIGE and the documentation is at https://saigegit.github.io/SAIGE-doc/ The program will be maintained there by multiple SAIGE developers. The Docker image has been updated. Please feel free to try version 1.0.0 and report any issues.
Thanks! Wei
Hey guys!
When running SAIGE-Gene on about 400K samples we are getting very big log files. Mostly it's these lines: https://github.com/weizhouUMICH/SAIGE/blob/a3fc49cfb651bb19878a99457abbc1e2261443d5/src/SAIGE_readDosage_bgen.cpp#L536 https://github.com/weizhouUMICH/SAIGE/blob/a3fc49cfb651bb19878a99457abbc1e2261443d5/src/SAIGE_readDosage_bgen.cpp#L695
The same lines seem to be printed over and over. Running just a handful of genes can generate 1.4 GB of log output. The volume probably grows with the number of samples analyzed and with high missingness. We save all the log files, so if we run a whole set of genes we have to worry about shuffling around gigabytes of logs. Writing that much output probably hurts performance too: it could make SAIGE-Gene slower even if the output isn't saved to a file.
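As a stopgap on the user side, the log stream could be filtered before it is saved. Below is a minimal sketch, not part of SAIGE, that collapses runs of identical consecutive lines into one line with a repeat count. The script name `collapse_log.py` and the function `collapse_repeats` are hypothetical, and it assumes the repeated messages are byte-for-byte identical and adjacent, which may not hold for the lines linked above.

```python
import sys
from typing import Iterable, Iterator, Optional


def collapse_repeats(lines: Iterable[str]) -> Iterator[str]:
    """Yield each run of identical consecutive lines once, with a repeat count."""
    previous: Optional[str] = None
    count = 0
    for raw in lines:
        line = raw.rstrip("\n")
        if line == previous:
            # Same message as the last one: just count it.
            count += 1
            continue
        if previous is not None:
            # A new message starts, so flush the finished run.
            yield previous if count == 1 else f"{previous} [repeated {count} times]"
        previous, count = line, 1
    if previous is not None:
        # Flush the final run at end of input.
        yield previous if count == 1 else f"{previous} [repeated {count} times]"


if __name__ == "__main__":
    # Read the raw log from stdin and write the collapsed log to stdout.
    for out in collapse_repeats(sys.stdin):
        print(out)
```

It would be used by piping the job's combined stdout and stderr through the script, e.g. `<your SAIGE command> 2>&1 | python collapse_log.py > run.log`. This only shrinks what gets stored; the program still produces the output, so any slowdown from printing would remain.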