Open danpolanco opened 7 months ago
We will add this to the New version release milestone and tackle it after Silver Pancake.
As a group we decided on some changes we'd like to make.
original | new |
---|---|
sample_name | sample_name |
fastq_1 | fastq_1 |
fastq_2 | fastq_2 |
primer_bed | primer_bed |
adapters_and_contaminants | contam_fasta |
covid_genome | sc2_ref_fasta |
covid_gff | sc2_ref_gff |
scrub_reads | scrub_reads |
scrub_genome_index | scrub_genome_index |
project_name | project_name |
out_dir | out_dir |
seq_method | seq_platform |
s_gene_amplicons | sc2_s_gene_amplicons_bed |
calc_percent_coverage_py | calc_percent_coverage_py |
version_capture_py | version_capture_py |
original | new |
---|---|
hostile_task | scrub_reads_hostile |
seqyclean | filter_reads_seqyclean |
fastqc | assess_quality_fastqc |
align_reads | align_reads_bwa |
ivar_trim | trim_primers_ivar |
ivar_var | call_variants_ivar |
ivar_consensus | call_consensus_ivar |
bam_stats | calc_bam_stats_samtools |
rename_fasta | rename_fasta |
calc_percent_cvg | calc_percent_coverage |
version_capture | capture_versions |
transfer | transfer_outputs |
original | new |
---|---|
ListFastqFiles | list_fastqs |
Demultiplex | demultiplex_guppy |
concatenate_fastqs | concatenate_fastqs |
Read_Filtering | filter_reads_guppyplex |
Medaka | call_artic_minion_medaka |
exit_wdl | exit_wdl |
Bam_stats | calc_bam_stats_samtools |
Scaffold | scaffold_pyscaf |
rename_fasta | rename_fasta |
calc_percent_cvg | calc_percent_coverage |
get_primer_site_variants | get_primer_variants_bcftools |
transfer | transfer_outputs |
hostile_task | scrub_reads_hostile |
version_capture | capture_versions |
original | new |
---|---|
gcs_fastq_dir | fastq_dir |
sample_name | sample_name |
index_1_id | barcode_id |
primer_set | Remove and create max_read_length variable with default set to 700 |
barcode_kit | barcode_kit |
medaka_model | medaka_model |
scrub_reads | scrub_reads |
Scrub_genome_index | scrub_genome_index |
covid_genome | sc2_ref_fasta |
primer_bed | primer_bed |
s_gene_primer_bed | sc2_s_gene_amplicons_bed |
s_gene_amplicons | sc2_s_gene_amplicons_tsv |
project_name | project_name |
calc_percent_coverage_py | calc_percent_coverage_py |
version_capture_py | version_capture_py |
out_dir | out_dir |
original | new |
---|---|
concatentate | concatenate_fastas |
pangolin | assign_lineage_pangolin |
nextclade | assign_clade_nextclade |
version_capture | capture_workflow_versions |
parse_nextclade | parse_nextclade |
results_table | summarize_results |
create_version_capture_file | capture_task_versions |
transfer | transfer_outputs |
Staying the same as before.
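A rename of this size is easy to get partly wrong by hand. As a rough sketch only (nothing like this exists in the repo), the mapping from the task tables above could be applied to WDL source text with a whole-identifier substitution. The helper name `rename_identifiers` and the sample WDL line are hypothetical, and only a few rows of the tables are included.

```python
import re

# A few rows copied from the task tables above; extend as needed.
TASK_RENAMES = {
    "ivar_trim": "trim_primers_ivar",
    "ivar_var": "call_variants_ivar",
    "ivar_consensus": "call_consensus_ivar",
    "bam_stats": "calc_bam_stats_samtools",
    "transfer": "transfer_outputs",
}


def rename_identifiers(text, renames):
    """Replace whole identifiers only, in a single pass over the text."""
    if not renames:
        return text
    # Longest names first so the alternation prefers the longest candidate.
    names = sorted(renames, key=len, reverse=True)
    pattern = re.compile(
        r"(?<![A-Za-z0-9_])("
        + "|".join(re.escape(name) for name in names)
        + r")(?![A-Za-z0-9_])"
    )
    return pattern.sub(lambda match: renames[match.group(1)], text)


# Hypothetical WDL line, for illustration only.
example = "call ivar_trim { input: bam = align_reads.bam }"
print(rename_identifiers(example, TASK_RENAMES))
```

Because the match must be a whole identifier, running it a second time leaves `transfer_outputs` untouched. It is still a plain text substitution, so any unrelated variable that happens to share an old task name would also be renamed and needs a manual check.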
We should also consider how we organize tasks. For example, if we use `transfer_outputs` for both assembly and summary, we can't keep those tasks in the same file (e.g. `transfer_tasks.wdl`) because the names conflict.
I'm trialing the following in CDPHE-bioinformatics/CDPHE-RSV#4:
Additionally, using `call_` as a prefix might be confusing. At least with `miniwdl`, it uses `call-`:
But that seems fairly minor?
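To make the concern concrete, here is a minimal sketch. It assumes the `call-` prefix refers to the per-call subdirectories that miniwdl creates inside a run directory, named `call-<call name>`. The helper `miniwdl_call_dir` is hypothetical and only mimics that naming.

```python
def miniwdl_call_dir(call_name):
    # Assumption: each call's run subdirectory is named "call-<call name>".
    return f"call-{call_name}"


for name in ["align_reads_bwa", "call_variants_ivar", "call_consensus_ivar"]:
    print(miniwdl_call_dir(name))
```

Under that assumption, the last two tasks end up with a doubled prefix (`call-call_variants_ivar`, `call-call_consensus_ivar`), which is cosmetic but a little awkward to read.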
Feature Request
This issue is a solicitation for feedback on an idea.
I've been working on the RSV pipeline and reusing tasks from this repo (i.e. CDPHE-SARS-CoV-2). I haven't renamed any of the tasks as I don't want to break consistency. I do however believe we could improve our task names.
Solution
There are a lot of possible task names, so I put together a very rough diagram as a starting point for discussion:
Note I put this diagram together quickly and it doesn't reflect the current SARS-CoV-2 pipeline.
One point to consider, suggested by @arianna-smith, is to keep the tool name in the task name. For example, instead of just `clean_reads`, some possibilities are `clean_reads_with_seqyclean`, `clean_reads_via_seqyclean`, or `clean_reads_seqyclean`.
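If we settle on the plain suffix style (`clean_reads_seqyclean`), the convention could be checked mechanically. The sketch below is only an illustration: the function name `tool_suffix` and the `KNOWN_TOOLS` set are hypothetical, and the tool list is just the names that appear in this thread.

```python
import re

# Hypothetical list of tools; taken from task names mentioned in this thread.
KNOWN_TOOLS = {"seqyclean", "fastqc", "bwa", "ivar", "samtools", "hostile"}


def tool_suffix(task_name):
    """Return the trailing tool name, or None if the name does not follow
    the lowercase <action>_<tool> pattern."""
    # Require lowercase snake_case.
    if not re.fullmatch(r"[a-z0-9]+(_[a-z0-9]+)*", task_name):
        return None
    # The tool must be a suffix, not the whole name.
    if "_" not in task_name:
        return None
    last = task_name.rsplit("_", 1)[-1]
    return last if last in KNOWN_TOOLS else None


for name in ["clean_reads_seqyclean", "clean_reads", "Read_Filtering"]:
    print(name, "->", tool_suffix(name))
```

A check like this would flag `clean_reads` (no tool) and `Read_Filtering` (not lowercase snake_case) while accepting `clean_reads_seqyclean`. Tasks that intentionally have no tool suffix, such as `rename_fasta`, would need an allowlist.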
Some other considerations are how we are:
All the examples given above are to generate discussion rather than suggest a hard requirement.
Upstream effects
I don't believe changing the task names will have any upstream effects.
Downstream effects
I don't believe changing the task names will have any downstream effects.