ChristopherWilks / megadepth

BigWig and BAM utilities
Other

it does not support bw files of wgbs #12

Closed nihaonewworld closed 2 years ago

nihaonewworld commented 2 years ago

Hello, does megadepth not support bw (BigWig) files from WGBS? Thank you.

ChristopherWilks commented 2 years ago

Hi @nihaonewworld, could you provide an example WGBS BW file and the exact megadepth command you're running?

Thanks, Chris

nihaonewworld commented 2 years ago

Hi Chris, the data I use is https://zwdzwd.s3.amazonaws.com/trackHubs/TCGA_WGBS/hg38/bw_mindepth5/TCGA_BLCA_A13J.vcf.gz_cg_cov5.bw

The command I run is:

megadepth $id --annotation ucsc_cgi.bed --op sum >$resultPath/$(basename $id ".bw")_cgi.bed

Thanks.

ChristopherWilks commented 2 years ago

Thanks @nihaonewworld. My small test case runs on that BW. Can you please post your annotation file (what you pass to --annotation, ucsc_cgi.bed)?

Also, what error are you receiving?

nihaonewworld commented 2 years ago

This is my annotation file, which is relatively large.

This is the error I get when running on a MacBook. When I later ran it on the server, the problem did not occur.

zsh: segmentation fault  ~/Downloads/megadepth_macos TCGA_STAD_N6452.vcf.gz_cg_cov5.bw --annotation

Thanks

Large attachment sent from QQ Mail:

hg38_repeatmasker.bed (207.83M, expires 2021-11-30 14:35). Download page: http://mail.qq.com/cgi-bin/ftnExs_download?k=7a3261371cf43796a357c8084733074a0606020f530356001d53020f581e015051034c55575251485204000e52025006550156046126350d570159681356450051460c56125850171e500453610e&t=exs_ftn_download&code=02a7a35e

ChristopherWilks commented 2 years ago

Thanks for the additional info.

Converting the BW file back to a BedGraph file, I don't see anything unexpected about it, so there shouldn't be a problem with Megadepth supporting it. I don't usually develop on macOS, so its Megadepth binary gets less testing (by me at least); that may be a separate issue.

That said, even on Linux, I am noticing that Megadepth runs for a very long time on the combination of that BW and the hg38 repeats BED file.

My suspicion is that the recent critical bug fix in Megadepth v1.1.1 has slowed the BW processing way down from what it was before and I need to rethink that approach.

A short-term workaround I'd suggest would be to run bedtools intersect -sorted -wo -a <BedGraph version of that BW> -b hg38_repeatmasker.sorted.bed (after sorting the hg38 repeats BED file), and then do the summing over the intervals as a post-processing step in your favorite scripting language. It'll be much faster than Megadepth currently is (and probably faster than the other tools that do BW vs. BED files).
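For anyone trying this workaround, here is a minimal sketch of the post-processing step in Python. It rests on a few assumptions that are not confirmed in this thread: the -a file is a 4-column BedGraph (chrom, start, end, value), the -wo output appends the number of overlapping bases as the last column, and the "sum" for an interval means the per-base sum (value times overlapping bases). The function name sum_over_intervals and the a_cols parameter are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Interval = Tuple[str, int, int]


def sum_over_intervals(lines: Iterable[str], a_cols: int = 4) -> Dict[Interval, float]:
    """Sum BedGraph values per annotation interval from `bedtools intersect -wo` output.

    Assumed layout of each tab-separated line:
      first `a_cols` columns : the BedGraph record (value in the last of them)
      following columns      : the annotation BED record (chrom, start, end, ...)
      final column           : number of overlapping bases reported by -wo
    """
    totals: Dict[Interval, float] = defaultdict(float)
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        fields = line.split("\t")
        value = float(fields[a_cols - 1])   # BedGraph value column
        overlap = int(fields[-1])           # overlapping bases (from -wo)
        key = (fields[a_cols], int(fields[a_cols + 1]), int(fields[a_cols + 2]))
        # Per-base sum: every overlapping base contributes `value` once.
        totals[key] += value * overlap
    return dict(totals)
```

The function accepts any iterable of lines, so an open file handle or sys.stdin can be passed directly. Annotation intervals with no overlapping BedGraph record do not appear in -wo output, so they will be absent from the result rather than reported as 0.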

Also, a side note: do you know if that BW file is actually hg38 or hg19 (GRCh37)? I ask because in the past TCGA has been on GRCh37, though that may have changed more recently.

ChristopherWilks commented 2 years ago

Just to update you: I finished a run with the latest Megadepth version (which includes the critical bug fix introduced in v1.1.1). It took 1h 31m on a fairly fast CPU, using the Linux binary, a locally downloaded copy of the BW file, and the hg38 repeats BED. Not sure if that's too long for your use case, but I wanted to say that it does appear to work, at least with the Linux version.

nihaonewworld commented 2 years ago

I appreciate your reply. The bw file is based on hg38. I don't mind the issue of running time. I really appreciate your help.
