nf-core / ampliseq

Amplicon sequencing analysis workflow using DADA2 and QIIME2
https://nf-co.re/ampliseq
MIT License

SUMMARY_REPORT failing for silva=132 database #780

Closed (sonalhenson closed this issue 1 month ago)

sonalhenson commented 1 month ago

Description of the bug

When I run the workflow with the "--dada_ref_taxonomy silva=132" option, it fails at the SUMMARY_REPORT generation stage. The error I get is shown below.

On closer inspection of .command.sh, it seems that the unescaped single quote/apostrophe in the "dada2_ref_tax_title='Silva Project's version 132 release'" variable is causing the error, because it ends the R string early. If I escape the single quote/apostrophe with a backslash in .command.sh and re-run .command.run, the job completes without an error.
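To illustrate the quoting problem described above, here is a minimal Python sketch (not the pipeline's actual code, which is Nextflow/Groovy) of escaping a value before it is placed inside a single-quoted R string literal. The helper name `r_single_quoted` is hypothetical; the title string is the one from the failing command.

```python
def r_single_quoted(value):
    """Return `value` as a single-quoted R string literal.

    Hypothetical helper for illustration only. Backslashes are escaped
    first so that the backslash added in front of each quote is not
    doubled afterwards.
    """
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


title = "Silva Project's version 132 release"

# Naive interpolation: the apostrophe closes the R string early,
# leaving three single quotes and an unterminated remainder.
naive = "dada2_ref_tax_title='" + title + "'"

# Escaped interpolation: the apostrophe is preceded by a backslash,
# which is the same manual fix applied to .command.sh.
safe = "dada2_ref_tax_title=" + r_single_quoted(title)

print(naive)
print(safe)
```

The second printed line contains `Project\'s`, which R parses as one string, whereas the first line is the form that produces the "unexpected symbol" error.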

This is not a problem when using silva=138. I haven't tried any other databases, but from eyeballing the 'title' field for the other databases in the ref_databases.config file (where I'm guessing the script pulls the information from), I don't see any other single quotes.
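Rather than eyeballing the 'title' fields, the check could be scripted. The sketch below is an assumption-heavy illustration: it assumes each title sits on its own line in the form `title = "..."`, which may not match the real layout of ref_databases.config, and the sample text is made up apart from the silva=132 title quoted above. The function name `titles_with_apostrophe` is hypothetical.

```python
import re

# Assumed line format: title = "some text"
# (an assumption about the config layout, not confirmed by the source)
TITLE_RE = re.compile(r'^\s*title\s*=\s*"(.*)"\s*$')


def titles_with_apostrophe(config_text):
    """Return every title value that contains a single quote."""
    hits = []
    for line in config_text.splitlines():
        match = TITLE_RE.match(line)
        if match and "'" in match.group(1):
            hits.append(match.group(1))
    return hits


# Made-up sample; only the silva=132 title comes from the report above.
sample = """
'example=1' {
    title = "Example database without apostrophe"
}
'silva=132' {
    title = "Silva Project's version 132 release"
}
"""

print(titles_with_apostrophe(sample))
```

To run it against the real file, read the file's contents into a string and pass that in place of `sample`, adjusting the regular expression if the actual format differs.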

For now I can generate the report by manually correcting .command.sh and executing .command.run in the work directory, but it would be nice to be able to run the workflow all the way through without this additional step.

Lastly, I really appreciate all the work that's gone into putting together this resource.

Command used and terminal output

nextflow run nf-core/ampliseq \
    -profile singularity \
    -w ${OUT_DIR}/work \
    -c my_config.cfg \
    -process.executor='slurm' \
    --input "$SAMPLESHEET" \
    --metadata "${PROJECT_DIR}/metadata.tsv" \
    --FW_primer "$F" \
    --RV_primer "$R" \
    --outdir "$OUT_DIR" \
    --save_intermediates \
    --vsearch_cluster \
    --dada_ref_taxonomy "silva=132" \
    --skip_barrnap --skip_qiime_downstream

-[nf-core/ampliseq] Pipeline completed with errors-
ERROR ~ Error executing process > 'NFCORE_AMPLISEQ:AMPLISEQ:SUMMARY_REPORT (1)'

Caused by:
Process NFCORE_AMPLISEQ:AMPLISEQ:SUMMARY_REPORT (1) terminated with an error exit status (1)

Command executed:

#!/usr/bin/env Rscript

library(rmarkdown)

# Work around https://github.com/rstudio/rmarkdown/issues/1508
# If the symbolic link is not replaced by a physical file
# output- and temporary files will be written to the original directory.

file.copy("./report_template.Rmd", "./template.Rmd", overwrite = TRUE)

rmarkdown::render("template.Rmd", output_file = "summary_report.html", params = list(css='nf-core_style.css',report_logo='nf-core-ampliseq_logo_light_long.png',workflow_manifest_version='2.11.0',workflow_scriptid='ce811bec9b',report_title='Summary of analysis results',metadata='metadata.tsv',input_samplesheet='samplesheet-Chiar.tsv',mqc_plot='multiqc_plots/svg/fastqc_per_sequence_quality_scores_plot.svg',cutadapt_summary='cutadapt_summary.tsv',trunc_qmin=25,trunc_rmin=0.75,trunclenf='null',trunclenr='null',max_ee=2,dada_qc_f_path='FW_qual_stats.svg',dada_qc_r_path='RV_qual_stats.svg',dada_pp_qc_f_path='FW_preprocessed_qual_stats.svg',dada_pp_qc_r_path='RV_preprocessed_qual_stats.svg',dada_filtntrim_args='filterAndTrim.args.txt',dada_sample_inference='independent',dada_err_path='1_1.err.svg,1_2.err.svg',dada_err_run='1',asv_table_path='ASV_table.tsv',path_asv_fa='ASV_post_clustering_filtered.fna',path_dada2_tab='DADA2_table.tsv',dada_stats_path='DADA2_stats.tsv',vsearch_cluster='ASV_post_clustering_filtered.table.tsv',vsearch_cluster_id='0.97',min_len_asv=0,max_len_asv=0,dada2_taxonomy='ASV_tax_species.silva_132.tsv',dada2_ref_tax_title='Silva Project's version 132 release',dada2_ref_tax_file='[https://zenodo.org/record/1172783/files/silva_nr_v132_train_set.fa.gz, https://zenodo.org/record/1172783/files/silva_species_assignment_v132.fa.gz]',dada2_ref_tax_citation='Quast C, Pruesse E, Yilmaz P, Gerken J, Schweer T, Yarza P, Peplies J, Glöckner FO. The SILVA ribosomal RNA gene database project: improved data processing and web-based tools. Nucleic Acids Res. 2013 Jan;41(Database issue):D590-6. doi: 10.1093/nar/gks1219. Epub 2012 Nov 28. PMID: 23193283; PMCID: PMC3531112.',phyloseq='phyloseq,dada2_phyloseq.rds'), envir = new.env())

writeLines(c("\"NFCORE_AMPLISEQ:AMPLISEQ:SUMMARY_REPORT\":", paste0(" R: ", paste0(R.Version()[c("major","minor")], collapse = ".")), paste0(" rmarkdown: ", packageVersion("rmarkdown")), paste0(" knitr: ", packageVersion("knitr")) ), "versions.yml")

Command exit status: 1

Command output: [1] TRUE

Command error:
INFO: Converting SIF file to temporary sandbox...
WARNING: While bind mounting '/path/to/sing_tmp:/path/to/sing_tmp': destination is already in the mount point list
WARNING: While bind mounting '/path/to/sing_tmp:/path/to/sing_cache': destination is already in the mount point list
Error: unexpected symbol in "ada2_tab='DADA2_table.tsv',dada_stats_path='DADA2_stats.tsv',vsearch_cluster='ASV_post_clustering_filtered.table.tsv',vsearch_cluster_id='0.97',min_len_asv=0,max_len_asv=0,dada2taxonomy='ASV"
Execution halted
INFO: Cleaning up image...

Relevant files

I'll attach the log files if you need them.

System information

d4straub commented 1 month ago

Thanks for the clear error report and the analysis of the problem! That will be corrected in the next release!

d4straub commented 1 month ago

This was fixed in dev and will be part of the next release.