Closed jasonwalker80 closed 3 years ago
A sweep for unneeded or outdated tools should be part of our 2.0 release.
Here are the tools and subworkflows that are not referenced in any workflows under pipelines:
1 vcf_eval_concordance.cwl
1 vcf_eval_cle_gold.cwl
1 sompy.cwl
1 somatic_concordance_graph.cwl
1 single_cell_rnaseq.cwl
1 sequence_align_and_tag.updatedpicard.cwl
1 samtools_mpileup.cwl
1 rename.cwl
1 pvacvector.cwl
1 pvacfuse.cwl
1 pvacbind.cwl
1 position_sort.cwl
1 pizzly.cwl
1 molecular_qc.cwl
1 merge_uncompressed_vcf.cwl
1 kmer_size_from_index.cwl
1 joint_genotype.cwl
1 grolar.cwl
1 gatk_genotypegvcfs.cwl
1 filter_vcf_exac.cwl
1 fastq_to_bqsr.cwl
1 fastq_align_and_tag.cwl
1 eval_vaf_report.cwl
1 eval_cle_gold.cwl
1 downsampled_alignment.cwl
1 deeptools_bamcoverage.cwl
1 cram_to_cnvkit.cwl
1 cram_to_bam_and_index.cwl
1 cram_to_bam.cwl
1 combine_variants_concordance.cwl
1 combine_gvcfs.cwl
1 cellranger_vdj.cwl
1 cellranger_mkfastq_and_count.cwl
1 cellranger_mkfastq.cwl
1 cellranger_feature_barcoding.cwl
1 cellranger_count.cwl
1 cellranger_atac_count.cwl
1 cellmatch_lineage.cwl
1 bedtools_intersect.cwl
1 bam_to_bqsr_no_dup_marking.cwl
1 bam_to_bqsr.cwl
Generated with this admittedly ugly one-liner (subsequently restricted to only those lines with count 1):
for single_iteration in 1; do echo ~/git/analysis-workflows/definitions/pipelines/*.cwl | xargs -n 1 /usr/local/bin/cwltool --pack | grep "id" | cut -f 4 -d '"' | cut -f 1 -d '/' | cut -f 2 -d '#' | sort | uniq | grep '.cwl' | grep -v '_2'; \ls ~/git/analysis-workflows/definitions/tools/ ~/git/analysis-workflows/definitions/subworkflows/ ~/git/analysis-workflows/definitions/pipelines/ ~/git/analysis-workflows/definitions/pipelines/ | grep '.cwl'; done | sort | uniq -c | sort -r -n
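The counting trick above (listing pipelines twice so that only unreferenced files end up with count 1) can also be written as an explicit set difference. The following is a minimal sketch, not the command that was actually run: the function name `unreferenced` and its two input files are hypothetical, and it assumes one file name per line in each input.

```shell
# Hypothetical helper: print every name that appears in the "defined" list
# but not in the "referenced" list.
#   $1: file listing defined CWL files, one name per line
#   $2: file listing CWL files referenced by the packed pipelines, one per line
unreferenced() {
    # Remember every line of the referenced list (read first), then print
    # only those lines of the defined list that were never seen.
    # FILENAME == ARGV[1] is used instead of NR == FNR so that an empty
    # referenced list is handled correctly.
    awk 'FILENAME == ARGV[1] { seen[$0] = 1; next } !($0 in seen)' "$2" "$1" | sort -u
}
```

With the output of the `cwltool --pack` pipeline saved as the referenced list and an `ls` of the tools and subworkflows directories saved as the defined list, this should produce the same candidates without the count-based filtering step.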
To remove:
1 samtools_mpileup.cwl
1 position_sort.cwl
1 filter_vcf_exac.cwl
1 fastq_to_bqsr.cwl
1 fastq_align_and_tag.cwl
1 deeptools_bamcoverage.cwl
1 bam_to_bqsr_no_dup_marking.cwl
1 bam_to_bqsr.cwl
Notes:
cram_to_cnvkit.cwl - could be removed after the cnvkit update
make a no-dedup workflow for alignment - make duplicate marking optional?
For information on pvacvector.cwl and pvacbind.cwl (listed above): when Mike and I manually review neo-epitopes of cancer patients in clinical trials, we run pvacbind and pvacvector analyses by hand to finalize the vector insert design. These CWL workflows (pvacvector.cwl, pvacbind.cwl, and pvacfuse.cwl) are potentially what we will need during our clinical study for cancer vaccine design in the future.
One example: varscan/samtools_mpileup.cwl is no longer used.