JonJala / mama

MIT License

run mama.py at individual chr level #28

Open arkyl opened 2 years ago

arkyl commented 2 years ago

Hi, Thanks much for the new software.

  1. I am planning to use 1000G EUR/AFR to construct the reference panel at the individual chromosome level. Can I also run the subsequent mama.py meta-GWAS at the individual chromosome level? I suppose running the chromosomes in parallel would be faster than a genome-wide run. I know the omega and sigma values are estimated genome-wide, so I am not sure whether it is OK to run the meta-GWAS per chromosome.
  2. The bioRxiv paper says "The LD scores were constructed using the --std-geno-ldsc units option, which assumes that the genotypes in the model are in allele-count units". However, the help message from mama_ldscores.py says "--std-geno-ldsc Generate LD scores from standardized genotypes (default is allele counts)." I would like to use allele-count units, so I guess I should follow the mama_ldscores.py help message and treat allele counts as the default? Thanks a lot!

Yue

paturley commented 2 years ago

Hi Yue,

Thanks for your interest in MAMA. A couple of responses:

  1. You are exactly right that the one concern with running MAMA at the chromosome level is that it will estimate different Omega and Sigma matrices for each chromosome, and the LD score regression estimates used to construct those matrices may be less precise than in a genome-wide run. That said, MAMA is pretty fast (other than the creation of the LD scores), so I don't expect that you would need to parallelize. If the LD score creation step is too slow for you, that can be parallelized without any problems.

  2. For now, I'd recommend you use the standardized genotype option. We've recently been noticing some strange behavior in MAMA when the allele count option is used, so I'm a little nervous that there might be a bug. We are trying to resolve this as fast as possible. In practice, using the standardized genotype option should produce very similar results to the allele count option, except for SNPs where there are very large allele frequency differences between the ancestries.
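
To illustrate the last point in item 2, here is a minimal sketch, not part of MAMA and not necessarily how MAMA computes anything internally. It assumes Hardy-Weinberg equilibrium, under which an allele-count genotype with allele frequency p has standard deviation sqrt(2p(1-p)), so moving between allele-count and standardized units rescales a SNP by that factor. The function names `std_scale` and `scale_ratio` and the example frequencies are hypothetical. When the two ancestries have similar frequencies the rescaling is nearly the same in both (ratio close to 1), and when the frequencies differ a lot the ratio moves far from 1.

```python
import math


def std_scale(p: float) -> float:
    """Standard deviation of an allele-count genotype under
    Hardy-Weinberg equilibrium: sqrt(2 * p * (1 - p))."""
    if not 0.0 < p < 1.0:
        raise ValueError("allele frequency must be strictly between 0 and 1")
    return math.sqrt(2.0 * p * (1.0 - p))


def scale_ratio(p_a: float, p_b: float) -> float:
    """Ratio of the standardization factors of one SNP in two ancestries.

    A ratio near 1 means allele-count and standardized units differ by
    (almost) the same constant in both ancestries for this SNP.
    """
    return std_scale(p_a) / std_scale(p_b)


if __name__ == "__main__":
    # Hypothetical allele frequency pairs (ancestry A, ancestry B).
    for p_a, p_b in [(0.30, 0.32), (0.50, 0.45), (0.05, 0.40)]:
        print(f"p_a={p_a:.2f}  p_b={p_b:.2f}  ratio={scale_ratio(p_a, p_b):.3f}")
```

Only the last pair, with very different frequencies, gives a ratio far from 1, which is the kind of SNP where the two unit choices would be expected to disagree.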

Best, Patrick


arkyl commented 2 years ago

Thanks a lot for the detailed reply! Yue