At the moment everything is published, including intermediate files. Some of these may need longer-term solutions, but a near-term fix is needed.
Suggest we publish:
Assembly:
[x] Assembled haplotypes 1 and 2 as fasta.gz + assembly stats.
[x] Assembled haplotypes aligned to reference genome as bam+bai.
[x] short-term this will have to include all dipcall files
[ ] #187
Read alignment:
[x] Only the per sample merged aligned reads as bam+bai
[x] short term this will have to include both a phased and an unphased set
[x] #59
Phasing:
[x] Only whatshap stats and haplotagged reads - no longer relevant since we don't use distrust genotypes
[x] Output HiPhase reads (whether long or short term, we should not have 3 parallel phasing programs)
Raw read QC:
[x] All FastQC files - or is the MultiQC report enough?
[x] Keep fqcrs for now (replace long-term, or make it an intermediate file)
Aligned QC:
[x] All mosdepth files
[x] long term - make these into IGV-loadable tracks (not sure what I meant by that)
[x] All cramino files
[x] short-term both phased and unphased
SNVs & SVs:
[x] One set of SNV calls per sample (as vcf.gz + g.vcf.gz), plus one additional set where all samples are merged
[x] long term - index these
[x] One set of SV calls per sample (as vcf), plus one additional set where all samples are merged (as vcf)
[x] long term - index and compress these
Repeat calling (TRGT):
[x] Sorted and indexed bcf.gz + spanning reads as bam+bai
[x] long-term make bcf.gz -> vcf.gz
CNV:
[x] Output all files (this includes depth tracks)
SNV-annotation:
[x] Output all files
[x] long-term replace by #23
Methylation:
[x] Methylation pileups
[x] short term both phased and unphased
General:
[x] No tool directories should sit in the base output directory, except maybe multiqc. At the moment bcftools and tabix, for example, do.
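The last point could be checked automatically after a test run. Below is a minimal sketch, assuming each tool publishes into a directory named after itself. The function name `tool_dirs_in_base` and the `TOOL_NAMES` set are hypothetical and only collect tool names mentioned in this issue, so adjust them to the actual pipeline.

```python
from pathlib import Path

# Hypothetical list of tool names, taken from the tools mentioned in this issue.
TOOL_NAMES = {
    "bcftools", "tabix", "multiqc", "mosdepth", "cramino", "fastqc",
    "whatshap", "hiphase", "dipcall", "trgt", "fqcrs",
}

# Tools that are allowed to keep a directory in the base output directory.
ALLOWED_AT_BASE = {"multiqc"}


def tool_dirs_in_base(outdir, tool_names=TOOL_NAMES, allowed=ALLOWED_AT_BASE):
    """Return sorted names of tool-named directories directly under outdir.

    Directories whose (lower-cased) name is in `allowed` are not reported.
    Only the top level is inspected; nested directories are ignored.
    """
    offenders = []
    for entry in Path(outdir).iterdir():
        name = entry.name.lower()
        if entry.is_dir() and name in tool_names and name not in allowed:
            offenders.append(entry.name)
    return sorted(offenders)
```

Run against a results directory, this should list whichever tool directories still sit at the top level, so an empty list would mean the point above is satisfied.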