uec / Issue.Tracker

Automatically exported from code.google.com/p/usc-epigenome-center

vcf storage #580

Closed: GoogleCodeExporter closed this issue 8 years ago

GoogleCodeExporter commented 8 years ago
Looking at the flowcell analysis dirs, VCFs are taking up a huge amount of 
space.

For 4 lanes of NOMe-seq, all files totaled 1.8 TB;
of this, 1.5 TB (roughly 83%) was just the VCFs.

Is there any reason these should not be bz2'ed?

I propose that this be done as part of the pipeline as well as retroactively.
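
For the retroactive part, something along these lines might work. It is only a minimal sketch in Python, not the pipeline's actual code: the function name `compress_vcfs`, the `*.vcf` naming pattern, and the choice to delete the original after compressing are all assumptions for illustration.

```python
import bz2
import shutil
from pathlib import Path


def compress_vcfs(root, remove_original=True):
    """Compress every *.vcf under root to *.vcf.bz2 and return the paths written.

    Hypothetical helper: assumes uncompressed VCFs end in ".vcf".
    """
    written = []
    for vcf in sorted(Path(root).rglob("*.vcf")):
        if not vcf.is_file():
            continue
        target = vcf.with_name(vcf.name + ".bz2")
        if target.exists():
            continue  # already compressed on an earlier run
        # Write to a temporary name first so an interrupted run never
        # leaves a truncated .bz2 next to a deleted original.
        tmp = target.with_name(target.name + ".tmp")
        with open(vcf, "rb") as src, bz2.open(tmp, "wb") as dst:
            shutil.copyfileobj(src, dst, length=1024 * 1024)
        tmp.replace(target)
        if remove_original:
            vcf.unlink()
        written.append(target)
    return written
```

Running it a second time over the same directory should be a no-op, since files that already have a `.vcf.bz2` counterpart are skipped.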

Original issue reported on code.google.com by zack...@gmail.com on 6 Sep 2013 at 10:03

GoogleCodeExporter commented 8 years ago
Yes, it is my fault. NOMe-seq outputs 10X more cytosines than Bisulfite-seq, so we 
should zip those VCF files inside the pipeline.
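
One way to do this inside the pipeline would be to write the records through a compressing stream, so the uncompressed VCF never touches disk. The sketch below uses Python purely as an illustration; the function `write_vcf_bz2` and its arguments are hypothetical and do not reflect how the actual caller writes its output.

```python
import bz2


def write_vcf_bz2(path, header_lines, records):
    """Stream VCF header lines and tab-separated records into a bz2 file.

    Hypothetical helper: header_lines is an iterable of strings without
    trailing newlines, records is an iterable of field sequences.
    """
    with bz2.open(path, "wt", encoding="ascii", newline="\n") as out:
        for line in header_lines:
            out.write(line + "\n")
        for fields in records:
            out.write("\t".join(str(f) for f in fields) + "\n")
```

Downstream tools that cannot read bz2 directly could still consume the file through `bzcat`.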

Yaping

Original comment by lyping1...@gmail.com on 7 Sep 2013 at 1:15

GoogleCodeExporter commented 8 years ago
When we ran out of space a few months ago, I had to bzip2 the VCFs. This 
resulted in significant space savings.

Original comment by zack...@gmail.com on 23 Jul 2014 at 6:02