mskcc / tempo

CCS research pipeline to process WES and WGS TN pairs
https://cmotempo.netlify.com/

Remove unnecessary files from SvABA published files #982

Open anoronh4 opened 1 year ago

anoronh4 commented 1 year ago

https://github.com/mskcc/tempo/blob/1d215b41ba21d38bb1d429bdceef76c1f1c1a1dd/modules/process/SV/SomaticRunSvABA.nf#L3

Currently, the publishDir directives in SomaticRunSvABA and GermlineRunSvABA each publish every vcf file found in the work directory. SvABA generates several vcf files, including indel calls, which we might consider not publishing (or not generating at all, if we can help it) because we don't use them and they aren't added to the maf. SomaticRunSvABA and GermlineRunSvABA also produce vcf files with a slightly modified header, which helps when merging the SV calls in subsequent steps. I think we should not publish these files either, since the calls are identical to those in the original vcf.

johnoooh commented 1 year ago

I'm using the SvABA indels for consensus calling, so I wouldn't change them for now.

anoronh4 commented 1 year ago

Then maybe we should make this configurable? I wonder if it will confuse end users if they see the SvABA indel files.
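
One possible way to make this configurable, as a rough sketch only: Nextflow's publishDir accepts a saveAs closure, and returning null from it skips publishing that file. Everything specific here is an assumption for illustration: the parameter name publishSvabaIndels is hypothetical (not an existing tempo option), the outDir default and the output subdirectory are placeholders, and the file names created by touch (including the ".reheader.vcf" suffix) stand in for whatever the real processes emit.

```nextflow
// Hypothetical switch; not an existing tempo parameter.
params.publishSvabaIndels = false
// Placeholder output root for this sketch.
params.outDir = 'results'

process SvabaPublishSketch {
    // saveAs receives each output file name; returning null means
    // "do not publish this file". The files still exist in the work
    // directory, so downstream processes can keep using them.
    publishDir "${params.outDir}/svaba", mode: 'copy',
        saveAs: { filename ->
            // Publish indel vcfs only when explicitly requested.
            if (filename.endsWith('.indel.vcf') && !params.publishSvabaIndels) return null
            // Never publish the header-modified copies (assumed suffix).
            if (filename.endsWith('.reheader.vcf')) return null
            return filename
        }

    output:
    path '*.vcf'

    script:
    """
    # Placeholder files standing in for real SvABA output.
    touch sample.svaba.somatic.sv.vcf
    touch sample.svaba.somatic.indel.vcf
    touch sample.svaba.somatic.sv.reheader.vcf
    """
}

workflow {
    SvabaPublishSketch()
}
```

With the default of false, only sample.svaba.somatic.sv.vcf should end up in the publish directory; running with --publishSvabaIndels true would also publish the indel vcf. Since this only affects publishing, the indel files would stay available to consensus calling either way.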