choishingwan / PRS-Tutorial

A tutorial on how to run basic polygenic risk score analysis
MIT License

Error in calculating LD #10

Open Qiaolan opened 3 years ago

Qiaolan commented 3 years ago

I got the error below while calculating LD. I tried to figure it out, but I can't even tell what the error message means. My R code for calculating LD is the same as the code in your tutorial.

    corr0 <- snp_cor(
      genotype,
      ind.col = ind.chr,
      ncores = NCORES,
      infos.pos = POS2[ind.chr],
      size = 3 / 1000
    )
    corr <- bigsparser::as_SFBM(as(corr0, "dgCMatrix"))

Thank you!

    Error in validityMethod(as(object, superClass)) :
      long vectors not supported yet: ../../src/include/Rinlinedfuns.h:522
    Calls: snp_cor ... validObject -> anyStrings -> isTRUE -> validityMethod
    Execution halted

Qiaolan commented 3 years ago

After I filtered the genotypes with PLINK (--geno 0), the error disappeared, but I then got "'h2' should have only positive values".

I have two questions:

  1. Can an individual have missing genotypes?
  2. For the PLINK bed file, is it possible to read the data one chromosome at a time instead of as one big bed file? For example, the way PRSice handles this situation by adding "chr#".

Thanks for your help!

choishingwan commented 3 years ago

Which software are you using? Which part of the tutorial?

Sam


-- Dr Shing Wan Choi Postdoctoral Fellow Genetics and Genomic Sciences Icahn School of Medicine, Mount Sinai, NYC

Qiaolan commented 3 years ago


Hi Dr. Choi!

Sorry about the ambiguity. I am using LDpred2. I followed the code in the LDpred2 tutorial (https://choishingwan.github.io/PRS-Tutorial/ldpred/), in the section "4. Start running LDpred 2", but I used my own dataset.

Thanks!

best,

Qiaolan

choishingwan commented 3 years ago

Hi,

Unfortunately, I am not the author of LDpred so I can only answer some of the questions. You might need to ask the author here: https://github.com/privefl/bigsnpr

  1. You might want to use more stringent --geno filtering. It is always a good idea to do QC before any PRS analysis.
  2. What's your sample size? If your sample size is small, and if there's a slight population mismatch, then it is possible for you to get an h2 estimate less than 0.
  3. I am not sure how LDpred2 handles missing genotypes.
  4. You can work with per-chromosome files by putting

    snp_readBed("EUR.QC.bed")
    # now attach the genotype object
    obj.bigSNP <- snp_attach("EUR.QC.rds")

and any code that uses obj.bigSNP inside the loop.
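As a rough illustration of the per-chromosome loop described above, here is a minimal sketch. It assumes one bed/bim/fam set per chromosome, named with a hypothetical "EUR.QC.chr" prefix (EUR.QC.chr1.bed, EUR.QC.chr2.bed, ...). The helpers `chr_prefix`, `attach_chr` and `run_per_chr` are made up for this example and are not part of bigsnpr; only `snp_readBed()` and `snp_attach()` are bigsnpr functions.

```r
# Hypothetical helper: file prefix for one chromosome, e.g. "EUR.QC.chr1".
# The "EUR.QC.chr" stem is an assumed naming scheme; change it to match your files.
chr_prefix <- function(chr, stem = "EUR.QC.chr") {
  paste0(stem, chr)
}

# Hypothetical helper: convert the bed file once, then attach the genotype object.
# By default snp_readBed() writes <prefix>.rds and <prefix>.bk next to the bed file
# and stops if the backing file already exists, so the conversion is skipped on reruns.
attach_chr <- function(chr) {
  prefix <- chr_prefix(chr)
  rds <- paste0(prefix, ".rds")
  if (!file.exists(rds)) {
    bigsnpr::snp_readBed(paste0(prefix, ".bed"))
  }
  bigsnpr::snp_attach(rds)
}

# Hypothetical driver: loop over chromosomes and call `fun` on each attached object.
# Any code that uses obj.bigSNP belongs inside `fun`.
run_per_chr <- function(fun, chrs = 1:22) {
  results <- vector("list", length(chrs))
  for (i in seq_along(chrs)) {
    obj.bigSNP <- attach_chr(chrs[i])
    results[[i]] <- fun(obj.bigSNP, chrs[i])
  }
  results
}
```

For example, `run_per_chr(function(obj, chr) ncol(obj$genotypes))` would return the number of variants per chromosome, provided the per-chromosome files exist.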

Sam

Qiaolan commented 3 years ago


Hi Dr. Choi,

I contacted the author of bigsnpr and had a better understanding of the error. By the way, the tutorial in your github (https://choishingwan.github.io/PRS-Tutorial/ldpred/) is kinda different from the one in his github (https://privefl.github.io/bigsnpr/articles/LDpred2.html). And it seems that yours is based on bigsnpr version 1.4.x while the latest version is 1.5.0.

Thanks!

best,

Qiaolan