bulik / ldsc

LD Score Regression (LDSC)
GNU General Public License v3.0
601 stars 331 forks

Update munge_sumstats.py #391

Open evlim opened 11 months ago

evlim commented 11 months ago

Fixed the inconsistent use of tabs and spaces in indentation

laleoarrow commented 4 months ago

(2024/01/24) I used munge_sumstats.py to process my GWAS data from the IEU or GWAS Catalog (millions of SNPs), but it got stuck indefinitely; the last message was "Reading sumstats from ldsc_formatted.tsv into memory 5000000 SNPs at a time." A summary file of ~100K SNPs finishes in about ten seconds.

The server used for running ldsc has 512 GB of RAM and two 24-thread CPUs (Intel(R) Xeon(R) Gold 6126 CPU @ 2.60GHz). The munge process used only one thread while running. I definitely need some help! Where could I have gone wrong?

chirrie commented 2 months ago

Did you get any help? munge_sumstats.py is also taking forever on my end.

laleoarrow commented 2 months ago

You can try adding a chunksize setting to the command. It will help.

chirrie commented 2 months ago

For the chunk size, do you provide a size or chromosomes? I cannot find more information in the README.

laleoarrow commented 2 months ago

It depends on the machine you use for computing. Try reading the source code of munge_sumstats.py on GitHub. Some things are not explained clearly enough in the README, which is unfortunate.
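
On the size-or-chromosomes question: the log line quoted earlier in the thread ("... into memory 5000000 SNPs at a time") suggests the chunk size is a number of SNPs (rows) read per pass, not a set of chromosomes. Below is a minimal sketch of that kind of chunked reading with pandas, assuming tab-separated input. The toy data and the helper `count_rows_per_chunk` are hypothetical illustrations and are not taken from munge_sumstats.py.

```python
import io

import pandas as pd

# Hypothetical toy summary statistics (10 SNPs); a real run would read a
# file such as ldsc_formatted.tsv instead of an in-memory string.
TOY_SUMSTATS = "SNP\tA1\tA2\tP\n" + "".join(
    f"rs{i}\tA\tG\t0.5\n" for i in range(10)
)


def count_rows_per_chunk(handle, chunksize):
    """Read a tab-separated table in chunks and return each chunk's row count."""
    # With chunksize set, read_csv returns an iterator of DataFrames,
    # each holding at most `chunksize` rows.
    reader = pd.read_csv(handle, sep="\t", chunksize=chunksize)
    return [len(chunk) for chunk in reader]


sizes = count_rows_per_chunk(io.StringIO(TOY_SUMSTATS), chunksize=4)
print(sizes)  # 10 rows arrive as chunks of 4, 4, and 2
```

A smaller chunk size lowers peak memory per pass at the cost of more passes; which value works best depends on the machine, as noted above.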
