jianyangqt / gcta

GCTA software
GNU General Public License v3.0

--fastGWA-mlm-binary: segmentation fault at sparse grm reading step #79

Open WeiCSong opened 1 month ago

WeiCSong commented 1 month ago

Hi, I'm running GCTA with the following options:

~/gcta/gcta64 --bfile intput --grm-sparse ~/gcta/sczarray --fastGWA-mlm-binary --pheno ~/SCZ/phen --qcovar ~/SCZ/covar --thread-num 10 --memory 80000 --out ~/pacbio/gctares/$basename

and I got a segmentation fault when GCTA tried to read the sparse GRM file. According to the log file, this sparse GRM was generated with no errors, and the size is normal (~5m for 26k samples). Adding threads and memory (up to 200G) does not help. Do you have any suggestions on this issue? Thank you very much for your help.

longmanz commented 1 month ago

Hi, do you mean that there are ~5m lines in your .grm.sp file? This is not expected for a 26k dataset. Even for the UK Biobank, based on our calculation, the number of lines in the .grm.sp file is ~600k (restricted to European-ancestry participants only).

Could you please check the following to make sure the sparse GRM was calculated correctly?

  1. Are all the individuals of the same or similar genetic ancestry? (Individuals of a different ancestry and admixed individuals should be removed.)
  2. Are you using HapMap3 common SNPs (minor allele frequency >= 0.01) to generate the GRM? (Rare SNPs should not be used; also make sure the HapMap3 SNP list matches the ancestry of your data.)
  3. Are you using a sparse GRM cutoff of 0.05? (Setting the cutoff to a value lower than 0.05 will increase the number of related pairs, i.e. rows, in your .grm.sp file.)
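
A quick way to check the row count and cutoff mentioned above is to summarise the .grm.sp file directly. The following is a minimal Python sketch, not part of GCTA. It assumes the file has three whitespace-separated columns (index of individual 1, index of individual 2, relatedness estimate); the function name `summarize_sparse_grm` and the example file name are hypothetical.

```python
def summarize_sparse_grm(lines, cutoff=0.05):
    """Summarise sparse-GRM rows given as an iterable of text lines.

    Assumed layout (not verified against any specific GCTA version):
    three whitespace-separated columns per row: index_i index_j relatedness.
    """
    n_rows = 0          # all well-formed rows, diagonal included
    n_offdiag = 0       # rows describing a pair of distinct individuals
    n_below = 0         # off-diagonal rows with relatedness below the cutoff
    individuals = set() # distinct indices seen in either column

    for line in lines:
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        i, j, value = int(fields[0]), int(fields[1]), float(fields[2])
        n_rows += 1
        individuals.update((i, j))
        if i != j:
            n_offdiag += 1
            if value < cutoff:
                n_below += 1

    return {
        "rows": n_rows,
        "related_pairs": n_offdiag,
        "pairs_below_cutoff": n_below,
        "individuals": len(individuals),
    }


# Example usage with a hypothetical file name:
# with open("sczarray.grm.sp") as fh:
#     print(summarize_sparse_grm(fh, cutoff=0.05))
```

If `related_pairs` is far larger than the number of individuals, or `pairs_below_cutoff` is non-zero, that would suggest the GRM was built with a lower cutoff than 0.05 or on a sample with mixed ancestry, which are the cases listed in the checks above.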