tgen / phoenix

Jetstream compatible workflow template supporting comprehensive analysis of human sequencing data against GRCh38
MIT License

Octopus Evidence BAM #233

Open PedalheadPHX opened 4 years ago

PedalheadPHX commented 4 years ago

https://github.com/tgen/phoenix/blob/9356db739d2a36565564e0b6953e8e52c0fe5303/modules/somatic/octopus.jst#L54

We write an evidence BAM on every loop iteration but never combine them or copy/mv them to archive space.

bryce-turner commented 4 years ago

Pushed commit 22670d3 with the Jinja code commented out, to avoid the need to rerun all of Octopus until after the release.

PedalheadPHX commented 4 years ago

Do you know how big these are? We need to know whether we should CRAM them. I'd like to get it in now, but we might need to decide, as you are right that it will have major effects on restarts.

bryce-turner commented 4 years ago

I haven't run a complete merge personally, but from looking at each of the evidence BAMs I would estimate that the merged BAM would be around 100 - 200 Mb at most. I'll run the merge to make sure.

bryce-turner commented 4 years ago

Can confirm from merging the evidence BAMs for the MMRF_2531 genome that the total size is only 90 Mb. Likely not worth the time spent converting BAM to CRAM. Also discovered a potential bug if the index directory doesn't exist, e.g. when no evidence BAM was produced for that batch. Will be pushing a fix in a bit.
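
For anyone following along, here is a rough Python sketch of the merge step and of the missing-directory guard described above. It is not the Jinja code from the phoenix template. The directory layout, the `*.bam` naming, and the function names are all assumptions made for illustration. It treats a missing directory as "no evidence BAMs" and skips the merge instead of failing, and it uses the plain `samtools merge <out> <in...>` form with `-f` to overwrite an existing output.

```python
from pathlib import Path
import subprocess
from typing import List, Optional


def find_evidence_bams(evidence_dir: Path) -> List[Path]:
    """Return the per-batch evidence BAMs in a stable order.

    A missing directory (hypothetical layout: one directory holding all
    per-batch evidence BAMs) is treated the same as an empty one.
    """
    if not evidence_dir.is_dir():
        return []
    return sorted(evidence_dir.glob("*.bam"))


def build_merge_command(bams: List[Path], merged_bam: Path) -> Optional[List[str]]:
    """Build the samtools merge command, or None when there is nothing to merge."""
    if not bams:
        return None
    # -f overwrites the output file if it already exists (e.g. on a restart).
    return ["samtools", "merge", "-f", str(merged_bam)] + [str(b) for b in bams]


def merge_evidence_bams(evidence_dir: Path, merged_bam: Path) -> bool:
    """Merge the evidence BAMs; return False if the merge was skipped."""
    command = build_merge_command(find_evidence_bams(evidence_dir), merged_bam)
    if command is None:
        return False
    subprocess.run(command, check=True)  # requires samtools on PATH
    return True
```

With this shape, a batch that produced no evidence BAM just makes `merge_evidence_bams` return `False`, so the copy/mv to archive space can be skipped for that sample.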

PedalheadPHX commented 4 years ago

The code exists but is not yet enabled.