FelixKrueger / Bismark

A tool to map bisulfite converted sequence reads and determine cytosine methylation states
http://felixkrueger.github.io/Bismark/
GNU General Public License v3.0

feature request: gzip CpG_report.txt from bismark_methylation_extractor #7

Closed: avilella closed this issue 8 years ago

avilella commented 8 years ago

This is a feature request to lower the disk footprint of a bismark_methylation_extractor run.

I ran the master branch with the options --cytosine_report, --gzip and --bedGraph, and I was glad to see that most of the large output files are compressed. The one exception is the CpG_report.txt file.

Would it be possible to also gzip that one?

[        256 Dec  1 14:01]  ./TEST_Run001_meth-12345678
[      14382 Dec  1 15:19]  ./TEST_Run001_meth-12345678/TST36_80_7_oxBS-12345678
[ 1387611650 Dec  1 16:26]  ./TEST_Run001_meth-12345678/TST36_80_7_oxBS-12345678/TST36-80-7_S1_L001_R1_001.cutB.sorted.mrgd.dedupled.CpG_report.txt
[    5314308 Dec  1 15:18]  ./TEST_Run001_meth-12345678/TST36_80_7_oxBS-12345678/TST36-80-7_S1_L001_R1_001.cutB.sorted.mrgd.dedupled.bedGraph.gz
[    5135404 Dec  1 15:18]  ./TEST_Run001_meth-12345678/TST36_80_7_oxBS-12345678/TST36-80-7_S1_L001_R1_001.cutB.sorted.mrgd.dedupled.bismark.cov.gz
[        825 Dec  1 15:17]  ./TEST_Run001_meth-12345678/TST36_80_7_oxBS-12345678/TST36-80-7_S1_L001_R1_001.cutB.sorted.mrgd.dedupled.bam_splitting_report.txt
[       9204 Dec  1 15:17]  ./TEST_Run001_meth-12345678/TST36_80_7_oxBS-12345678/TST36-80-7_S1_L001_R1_001.cutB.sorted.mrgd.dedupled.M-bias_R1.png
[      12211 Dec  1 15:17]  ./TEST_Run001_meth-12345678/TST36_80_7_oxBS-12345678/TST36-80-7_S1_L001_R1_001.cutB.sorted.mrgd.dedupled.M-bias.txt
[   57078031 Dec  1 15:17]  ./TEST_Run001_meth-12345678/TST36_80_7_oxBS-12345678/CHG_CTOB_TST36-80-7_S1_L001_R1_001.cutB.sorted.mrgd.dedupled.txt.gz
[   59273925 Dec  1 15:17]  ./TEST_Run001_meth-12345678/TST36_80_7_oxBS-12345678/CHG_CTOT_TST36-80-7_S1_L001_R1_001.cutB.sorted.mrgd.dedupled.txt.gz
[  118532452 Dec  1 15:17]  ./TEST_Run001_meth-12345678/TST36_80_7_oxBS-12345678/CHH_CTOB_TST36-80-7_S1_L001_R1_001.cutB.sorted.mrgd.dedupled.txt.gz
[  122569601 Dec  1 15:17]  ./TEST_Run001_meth-12345678/TST36_80_7_oxBS-12345678/CHH_CTOT_TST36-80-7_S1_L001_R1_001.cutB.sorted.mrgd.dedupled.txt.gz
[   22190643 Dec  1 15:17]  ./TEST_Run001_meth-12345678/TST36_80_7_oxBS-12345678/CpG_CTOB_TST36-80-7_S1_L001_R1_001.cutB.sorted.mrgd.dedupled.txt.gz
[   22963327 Dec  1 15:17]  ./TEST_Run001_meth-12345678/TST36_80_7_oxBS-12345678/CpG_CTOT_TST36-80-7_S1_L001_R1_001.cutB.sorted.mrgd.dedupled.txt.gz

Thx
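To put the request in numbers, here is a minimal sketch that adds up the file sizes from the listing above and reports how much of the run directory the uncompressed report takes up. The byte counts are copied from the listing; the function name `report_share` is made up for illustration, and the two directory entries are left out of the total.

```python
# Byte sizes copied from the directory listing above (files only;
# the two directory entries are skipped).
REPORT_BYTES = 1_387_611_650  # ...dedupled.CpG_report.txt (uncompressed)

OTHER_BYTES = [
    5_314_308,    # ...dedupled.bedGraph.gz
    5_135_404,    # ...dedupled.bismark.cov.gz
    825,          # ...dedupled.bam_splitting_report.txt
    9_204,        # ...dedupled.M-bias_R1.png
    12_211,       # ...dedupled.M-bias.txt
    57_078_031,   # CHG_CTOB_...txt.gz
    59_273_925,   # CHG_CTOT_...txt.gz
    118_532_452,  # CHH_CTOB_...txt.gz
    122_569_601,  # CHH_CTOT_...txt.gz
    22_190_643,   # CpG_CTOB_...txt.gz
    22_963_327,   # CpG_CTOT_...txt.gz
]


def report_share(report_bytes, other_bytes):
    """Return the fraction of the total footprint taken by the report file."""
    total = report_bytes + sum(other_bytes)
    return report_bytes / total


if __name__ == "__main__":
    share = report_share(REPORT_BYTES, OTHER_BYTES)
    print(f"CpG_report.txt: {REPORT_BYTES / 1e9:.2f} GB, "
          f"{share:.0%} of the files in the run directory")
```

On these numbers the single uncompressed report accounts for a little over three quarters of the bytes written by the run.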

FelixKrueger commented 8 years ago

Yes that would certainly be possible. I'll add it to the list.

avilella commented 8 years ago

Brill!

FelixKrueger commented 8 years ago

Added a new option --gzip to coverage2cytosine. The option --gzip is now also passed on from the methylation extractor, so I hope that this will do exactly what you needed. Closes 40fb13cef23ec75db24e51f254721ecb20ba783c.
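For report files that were already written uncompressed by an earlier run, a small post-processing step gives the same disk saving. The sketch below is not part of Bismark: the helper name `gzip_report` and the example file name are hypothetical, and the sample row only mimics a tab-separated report line. It uses nothing beyond the Python standard library.

```python
import gzip
import shutil
import tempfile
from pathlib import Path


def gzip_report(report_path, keep_original=False):
    """Compress a plain-text report to <name>.gz and return the new path."""
    src = Path(report_path)
    dst = src.with_name(src.name + ".gz")
    # Stream in chunks so a multi-gigabyte report never has to fit in memory.
    with open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    if not keep_original:
        src.unlink()
    return dst


if __name__ == "__main__":
    # Hypothetical stand-in file, written to a temporary directory.
    with tempfile.TemporaryDirectory() as tmp:
        report = Path(tmp) / "example.CpG_report.txt"
        report.write_text("chr1\t10469\t+\t0\t0\tCG\tCGC\n" * 1000)
        compressed = gzip_report(report)
        # Downstream code can read the compressed file line by line in text mode.
        with gzip.open(compressed, "rt") as handle:
            print(handle.readline().rstrip("\n").split("\t"))
```

Copying in a stream rather than reading the whole file keeps memory use flat regardless of report size, and `keep_original=True` leaves the uncompressed file in place if a downstream tool still expects it.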