Closed zillurbmb51 closed 1 year ago
Sometimes susie can take a long time and not converge. This can happen when there are many true signals, when there are many snps or (most common in my experience) when there is a mismatch between the summary data and the LD matrix supplied - sometimes only a small mismatch. Myself, I do not run runsusie in an lapply because of this. When there is a failure to converge, I tend to fall back on using coloc under the single causal variant assumption.
From: zillurbmb51 @.> Sent: 02 June 2022 00:20 To: chr1swallace/coloc @.> Cc: Subscribed @.***> Subject: [chr1swallace/coloc] runsusie: taking too much iterations to reach convegence (Issue #90)
Hi, I am running runsusie for multiple loci: baso_susie=lapply(1:length(baso7), function(i) runsusie(baso7[[i]])) # Here baso7 is a list of lists and each list contains all the required variables for runsusie . It works fine in most cases but for some loci it is taking forever to reach convergence.
Is there any way that I could delete those loci from my datasets?
I do not know the differences between loci. Only thing I am sure that the locus which is taking too much time, has slightly higher number of SNPs. That particular locus has 2236 SNPs, but the same locus with 1947 SNPs in another trait reached convergence after few hundreds iterations.
I am lost and can not think what I need to do.
The objective of my analysis is to use runsusie for a list of loci in some trait then use coloc.susie for the same loci in pair of traits as described in here: https://chr1swallace.github.io/coloc/articles/a06_SuSiE.htmlhttps://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fchr1swallace.github.io%2Fcoloc%2Farticles%2Fa06_SuSiE.html&data=05%7C01%7Ccew54%40universityofcambridgecloud.onmicrosoft.com%7C68672d8476d4461ce3a408da44255924%7C49a50445bdfa4b79ade3547b4f3986e9%7C0%7C0%7C637897224436009392%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=v%2FLsCeDYJxlXkfPjkD%2FJ4Vr9rd9tV64gl8G4%2FIIum3w%3D&reserved=0
Now, if locus X reached convergence for trait1 but did not for trait2, what should I do? [runsusie_taking_too_much_time]https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fuser-images.githubusercontent.com%2F12833907%2F171517094-277b7497-8a55-4414-b2f6-70c409b15320.png&data=05%7C01%7Ccew54%40universityofcambridgecloud.onmicrosoft.com%7C68672d8476d4461ce3a408da44255924%7C49a50445bdfa4b79ade3547b4f3986e9%7C0%7C0%7C637897224436009392%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=pLJQhfIBjanZVc%2BHSkgXifIX7ufZREtBmc%2FLor%2Bq6Vw%3D&reserved=0
Attached is a screenshot of susie running.
— Reply to this email directly, view it on GitHubhttps://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fchr1swallace%2Fcoloc%2Fissues%2F90&data=05%7C01%7Ccew54%40universityofcambridgecloud.onmicrosoft.com%7C68672d8476d4461ce3a408da44255924%7C49a50445bdfa4b79ade3547b4f3986e9%7C0%7C0%7C637897224436009392%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=G6mDlY8hiEnSR6%2F57ea85tw4lH06sbnSpZB7%2FNc%2FlYc%3D&reserved=0, or unsubscribehttps://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FAAQWR2FSVUXSPGSTST26RKDVM7V2VANCNFSM5XTCXHPA&data=05%7C01%7Ccew54%40universityofcambridgecloud.onmicrosoft.com%7C68672d8476d4461ce3a408da44255924%7C49a50445bdfa4b79ade3547b4f3986e9%7C0%7C0%7C637897224436009392%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=jJYdvBLY5gtXfKdo5QkTlQYkTEcIh9%2Bt5q%2BQuV7fHMs%3D&reserved=0. You are receiving this because you are subscribed to this thread.Message ID: @.***>
@chr1swallace Thanks a lot.
Is there any way to skip a runsusie after a certain number of iteration and then start the next run?
For example, I have 100 loci, I want to run them in loop. If the 5th locus does not reach convergence in 10000 iterations, it would through a message and start 6th locus.
you can control this through the two arguments: maxit
maximum number of iterations for the first run of susie_rss(). If susie_rss() does not report convergence, runs will be extended assuming repeat_until_convergence=TRUE. Most users will not need to change this default.
repeat_until_convergence
keep running until susie_rss() indicates convergence. Default TRUE. If FALSE, susie_rss() will run with maxit iterations, and if not converged, runsusie() will error. Most users will not need to change this default.
see https://chr1swallace.github.io/coloc/reference/runsusie.html for other arguments [https://chr1swallace.github.io/coloc/logo.png]https://chr1swallace.github.io/coloc/reference/runsusie.html Run susie on a single coloc-structured dataset — runsusiehttps://chr1swallace.github.io/coloc/reference/runsusie.html run susie_rss storing some additional information for coloc chr1swallace.github.io
From: zillurbmb51 @.> Sent: 06 June 2022 11:54 To: chr1swallace/coloc @.> Cc: Chris Wallace @.>; Mention @.> Subject: Re: [chr1swallace/coloc] runsusie: taking too much iterations to reach convegence (Issue #90)
Is there any way to skip a runsusie after a certain number of iteration and then start the next run?
For example, I have 100 loci, I want to run them in loop. If the 5th locus does not reach convergence in 10000 iterations, it would through a message and start 6th locus.
— Reply to this email directly, view it on GitHubhttps://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fchr1swallace%2Fcoloc%2Fissues%2F90%23issuecomment-1147318963&data=05%7C01%7Ccew54%40universityofcambridgecloud.onmicrosoft.com%7C83fe00cc62374701032508da47aafc78%7C49a50445bdfa4b79ade3547b4f3986e9%7C0%7C0%7C637901096967104152%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=hlCr0q%2B6i3aNscE7vzIP95ld4RWP1PNHJmm0tgqAs88%3D&reserved=0, or unsubscribehttps://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FAAQWR2ELSZVBB7FHY2PEI3TVNXKF3ANCNFSM5XTCXHPA&data=05%7C01%7Ccew54%40universityofcambridgecloud.onmicrosoft.com%7C83fe00cc62374701032508da47aafc78%7C49a50445bdfa4b79ade3547b4f3986e9%7C0%7C0%7C637901096967104152%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=LPotI8ZXiJfisQsvzyqV9xE%2B3FNRa1cs1ldn05d33HI%3D&reserved=0. You are receiving this because you were mentioned.Message ID: @.***>
Thank you. I have tried these options maxit=10000, repeat_until_convergence=F
. With these options, if a locus fails to reach convergence, runsusie
does not start the next locus, it throw an error and exit. Below is the error log.
How could I use runsusie
for multiple loci in a way that if convergence fails in one locus it gives the message and starts to run the next locus?
scz_baso_susie=lapply(1:length(scz_baso7), function(i) runsusie(scz_baso7[[i]],check_prior=F,maxit = 10000,repeat_until_convergence = F))
running max iterations: 10000
converged: TRUE
running max iterations: 10000
converged: TRUE
running max iterations: 10000
converged: FALSE
Error in runsusie(scz_baso7[[i]], check_prior = F, maxit = 10000, repeat_until_convergence = F) :
susie_rss() did not converge in 10000 iterations. Try running with run_until_convergence=TRUE
In addition: Warning messages:
1: In check_dataset(d, suffix, req = c("beta", "varbeta", "LD", "snp")) :
minimum p value is: 0.00019787
If this is what you expected, this is not a problem.
If this is not as small as you expected, please check the 02_data vignette.
2: In check_dataset(d, suffix, req = c("beta", "varbeta", "LD", "snp")) :
minimum p value is: 7.6246e-05
If this is what you expected, this is not a problem.
If this is not as small as you expected, please check the 02_data vignette.
3: In susie_suff_stat(XtX = R, Xty = z, n = 2, yty = 1, scaled_prior_variance = prior_variance, :
Error in runsusie(scz_baso7[[i]], check_prior = F, maxit = 10000, repeat_until_convergence = F) :
susie_rss() did not converge in 10000 iterations. Try running with run_until_convergence=TRUE
you will need to use an R mechanism for catching errors, such as tryCatch(). I do believe that failure to converge needs to be an error, because otherwise people will base inference on the non-converged runs.
From: zillurbmb51 @.> Sent: 06 June 2022 14:09 To: chr1swallace/coloc @.> Cc: Chris Wallace @.>; Mention @.> Subject: Re: [chr1swallace/coloc] runsusie: taking too much iterations to reach convegence (Issue #90)
Thank you. I have tried these options maxit=10000, repeat_until_convergence=F . With these options, if a locus fails to reach convergence, runsusie does not start the next locus, it throw an error and exit. Below is the error log.
How could I use runsusie for multiple loci in a way that if convergence fails in one locus it gives the message and starts to run the next locus?
scz_baso_susie=lapply(1:length(scz_baso7), function(i) runsusie(scz_baso7[[i]],check_prior=F,maxit = 10000,repeat_until_convergence = F)) running max iterations: 10000 converged: TRUE running max iterations: 10000 converged: TRUE running max iterations: 10000 converged: FALSE Error in runsusie(scz_baso7[[i]], check_prior = F, maxit = 10000, repeat_until_convergence = F) : susie_rss() did not converge in 10000 iterations. Try running with run_until_convergence=TRUE In addition: Warning messages: 1: In check_dataset(d, suffix, req = c("beta", "varbeta", "LD", "snp")) : minimum p value is: 0.00019787 If this is what you expected, this is not a problem. If this is not as small as you expected, please check the 02_data vignette. 2: In check_dataset(d, suffix, req = c("beta", "varbeta", "LD", "snp")) : minimum p value is: 7.6246e-05 If this is what you expected, this is not a problem. If this is not as small as you expected, please check the 02_data vignette. 3: In susie_suff_stat(XtX = R, Xty = z, n = 2, yty = 1, scaled_prior_variance = prior_variance, :
Error in runsusie(scz_baso7[[i]], check_prior = F, maxit = 10000, repeat_until_convergence = F) : susie_rss() did not converge in 10000 iterations. Try running with run_until_convergence=TRUE
— Reply to this email directly, view it on GitHubhttps://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fchr1swallace%2Fcoloc%2Fissues%2F90%23issuecomment-1147429673&data=05%7C01%7Ccew54%40universityofcambridgecloud.onmicrosoft.com%7C570cbe1b9ad14db54ae408da47bdf071%7C49a50445bdfa4b79ade3547b4f3986e9%7C0%7C0%7C637901178345646369%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=YSCFoG6pDhqWNHztk9YNjj5pIVTMaCvmgIgvNUu4bGA%3D&reserved=0, or unsubscribehttps://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FAAQWR2GS4UBVLKGMN47NMHDVNX2CPANCNFSM5XTCXHPA&data=05%7C01%7Ccew54%40universityofcambridgecloud.onmicrosoft.com%7C570cbe1b9ad14db54ae408da47bdf071%7C49a50445bdfa4b79ade3547b4f3986e9%7C0%7C0%7C637901178345646369%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=5vvzC%2B3hVqOoodHBiGuvh6CJRGxL%2BdeErDQtLeibU9g%3D&reserved=0. You are receiving this because you were mentioned.Message ID: @.***>
Another idea: you could try to use the diagnostic functions provided by susieR
For example, you could assess the mismatch between your summary statistics and LD matrix using estimate_s_rss
. If you run it on loci you know converge and those you know don't converge, you could determine an empirical value of lambda to filter loci that are too mismatched prior to fine-mapping
Another idea: you could try to use the diagnostic functions provided by susieR
For example, you could assess the mismatch between your summary statistics and LD matrix using
estimate_s_rss
. If you run it on loci you know converge and those you know don't converge, you could determine an empirical value of lambda to filter loci that are too mismatched prior to fine-mapping
Would it be okay if I determine the empirical value of lambda to be 0.05?
Would it be okay if I determine the empirical value of lambda to be 0.05?
@cwnag-c I think the susieR authors would be best able to answer that question: https://github.com/stephenslab/susieR/issues
OK!Thank you!
Hi, I am running
runsusie
for multiple loci:baso_susie=lapply(1:length(baso7), function(i) runsusie(baso7[[i]])) # Here baso7 is a list of lists and each list contains all the required variables for runsusie
. It works fine in most cases but for some loci it is taking forever to reach convergence.Is there any way that I could delete those loci from my datasets?
I do not know the differences between loci. Only thing I am sure that the locus which is taking too much time, has slightly higher number of SNPs. That particular locus has 2236 SNPs, but the same locus with 1947 SNPs in another trait reached convergence after few hundreds iterations.
I am lost and can not think what I need to do.
The objective of my analysis is to use
runsusie
for a list of loci in some trait then usecoloc.susie
for the same loci in pair of traits as described in here: https://chr1swallace.github.io/coloc/articles/a06_SuSiE.htmlNow, if locus X reached convergence for trait1 but did not for trait2, what should I do?![runsusie_taking_too_much_time](https://user-images.githubusercontent.com/12833907/171517094-277b7497-8a55-4414-b2f6-70c409b15320.png)
Attached is a screenshot of susie running.