choishingwan / PRS-Tutorial

A tutorial on how to run basic polygenic risk score analysis
MIT License

No 'SNP matching' in the Lassosum #40

Closed fmadani closed 1 year ago

fmadani commented 2 years ago

Hi Sam @choishingwan ,

When we compare LDpred and Lassosum, we see that SNP matching is used for LDpred but not for Lassosum. Could you please explain whether there is a reason for that?

Regards, Farshad

choishingwan commented 2 years ago

You should do matching, though lassosum.pipeline does, to some extent, try to match the variants automatically (using chromosome and coordinate information).
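To make the idea of matching on chromosome and coordinate concrete, here is a minimal Python sketch. It is not the lassosum code: the `Variant` type and the `match_by_position` function are hypothetical names made up for illustration, and the allele handling (keep identical alleles, flip the sign when A1/A2 are swapped, drop everything else) is an assumption about what a reasonable matcher would do.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    """Hypothetical record for one SNP (illustration only)."""
    chrom: str
    pos: int
    a1: str  # effect allele
    a2: str  # other allele


def match_by_position(sumstats, reference):
    """Match summary statistics to a reference panel by (chrom, pos).

    sumstats:  list of (Variant, effect) pairs
    reference: list of Variant
    Returns a list of (reference_index, effect), with the effect sign
    aligned to the reference A1 allele.
    """
    # If two reference variants share a position, the later one wins.
    ref_index = {(v.chrom, v.pos): i for i, v in enumerate(reference)}
    matched = []
    for variant, effect in sumstats:
        i = ref_index.get((variant.chrom, variant.pos))
        if i is None:
            continue  # position absent from the reference panel
        ref = reference[i]
        if (variant.a1, variant.a2) == (ref.a1, ref.a2):
            matched.append((i, effect))    # same allele coding
        elif (variant.a1, variant.a2) == (ref.a2, ref.a1):
            matched.append((i, -effect))   # alleles swapped: flip the sign
        # any other allele combination is treated as a mismatch and dropped
    return matched
```

A sketch like this ignores strand flips and ambiguous SNPs, which is one reason to do explicit matching and QC yourself before calling the pipeline.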

fmadani commented 2 years ago

Thank you Sam. I tried to understand lassosum.pipeline more deeply, but reviewing the code here and here didn't help, so it is still a black box to me. Are you aware of any other source that would help me better understand the algorithm used in the lassosum library? For instance, you mentioned that lassosum.pipeline does some matching automatically; what about the other steps taken in the lassosum algorithm? Farshad

fmadani commented 2 years ago

@choishingwan Here is a diagram that I drew for the LDpred.

PRS-LDPred-2 drawio

and here is the diagram of the Lassosum:

PRS-Lassosum drawio

As you can see, the LDpred diagram is clear, and a reader can understand the overall logic of its algorithm. In the Lassosum diagram, 'Lassosum modelling' is a black box: a reader cannot figure out the main steps taken in its algorithm. Please note that when I say algorithm, I don't mean the math formulas explained in the published papers.

Farshad

choishingwan commented 2 years ago

The algorithm was clearly presented in the lassosum paper

On Tue, 1 Mar 2022 at 5:47 PM, Farshad Madani @.***> wrote:

@choishingwan https://github.com/choishingwan Here is a diagram that I drew for the LDpred.

[image: PRS-LDPred-2 drawio] https://user-images.githubusercontent.com/7520134/156260614-90c266be-0b5b-43d2-9ac5-643f97008b79.png

and here is the diagram of the Lassosum:

[image: PRS-Lassosum drawio] https://user-images.githubusercontent.com/7520134/156260721-b2251094-ddd2-4deb-9153-c6744c234eba.png

I tagged Florian @privefl https://github.com/privefl and Timothy @tshmak https://github.com/tshmak to let them to participate in this post if they'd like.

Farshad

-- Dr Shing Wan Choi Instructor Genetics and Genomic Sciences Icahn School of Medicine, Mount Sinai, NYC

fmadani commented 2 years ago

Sam, do you mean the paper 'Polygenic scores via penalized regression on summary statistics'? As far as I understand, that paper explains and proves the math behind the algorithm. My request is more general: I'd like to know more about the steps taken in the Lassosum algorithm as implemented in the lassosum library. I do understand that the math formulas and the implementation correspond to each other, but I'd like to know the steps beyond the math, such as the SNP matching you referred to. Can you provide a list of the steps used in the algorithm as implemented in the lassosum library?

choishingwan commented 2 years ago

The main pipeline is here: https://github.com/tshmak/lassosum/blob/master/R/lassosum.pipeline.R

tshmak commented 2 years ago

Matching between the summary stats, ref.bfile, and test.bfile is done in lines 209 to 267 of lassosum.pipeline.R: https://github.com/tshmak/lassosum/blob/0e44b530c77e862f1b88f9e1fad88fdab44e7827/R/lassosum.pipeline.R#L209 - https://github.com/tshmak/lassosum/blob/0e44b530c77e862f1b88f9e1fad88fdab44e7827/R/lassosum.pipeline.R#L267. You can also take a look at https://github.com/tshmak/lassosum/blob/master/R/parse.pheno.covar.R for matching the phenotype and covariate files.
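For readers who want a feel for what matching across the summary stats, ref.bfile, and test.bfile amounts to, the following Python sketch keeps only the positions present in every dataset and returns, for each dataset, the row indices of those shared positions. It is an illustration under simplifying assumptions, not a translation of the linked R code: `shared_positions` is a hypothetical name, each dataset is reduced to a list of `(chrom, pos)` tuples, and allele checks are left out.

```python
def shared_positions(*datasets):
    """Find positions common to all datasets (illustration only).

    Each dataset is a list of (chrom, pos) tuples, e.g. one list for the
    summary statistics, one for the reference panel, one for the test data.
    Returns (order, indices): `order` is the sorted list of shared keys and
    `indices[k][j]` is the row of `order[j]` in dataset k.
    """
    common = set(datasets[0])
    for data in datasets[1:]:
        common &= set(data)  # keep only keys seen in every dataset
    order = sorted(common)
    indices = []
    for data in datasets:
        # If a key is duplicated within a dataset, the last row wins.
        lookup = {key: i for i, key in enumerate(data)}
        indices.append([lookup[key] for key in order])
    return order, indices
```

The per-dataset index lists are what let you subset effect sizes, reference genotypes, and test genotypes so that they all refer to the same variants in the same order.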