Closed DrB-S closed 1 year ago
That's a strange error. It looks like it might not be able to pair your files. It might be worthwhile to create a sample sheet instead and read that in with the --sample_sheet option.
The sample sheet format is
sample,fastq_1,fastq_2
AN20230313,reads/AN20230313_1.fastq.gz,reads/AN20230313_2.fastq.gz
etc.
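For a directory with many samples, the sheet can be generated rather than typed by hand. Here is a minimal Python sketch, assuming every pair is named <sample>_1.fastq.gz and <sample>_2.fastq.gz like the files in this thread. The function name build_sample_sheet and the output file name are made up for illustration and are not part of Cecret.

```python
import csv
from pathlib import Path


def build_sample_sheet(reads_dir, out_path,
                       r1_suffix="_1.fastq.gz", r2_suffix="_2.fastq.gz"):
    """Write a sample,fastq_1,fastq_2 sheet for every R1 file with a matching R2.

    Hypothetical helper: the suffixes are assumptions based on the file
    names shown in this thread; adjust them to your own naming scheme.
    """
    reads_dir = Path(reads_dir)
    rows = []
    for r1 in sorted(reads_dir.glob(f"*{r1_suffix}")):
        # The sample name is the file name with the R1 suffix removed.
        sample = r1.name[: -len(r1_suffix)]
        r2 = reads_dir / f"{sample}{r2_suffix}"
        if r2.exists():  # skip reads that have no mate file
            rows.append((sample, str(r1), str(r2)))

    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle, lineterminator="\n")
        writer.writerow(["sample", "fastq_1", "fastq_2"])
        writer.writerows(rows)
    return rows
```

Calling build_sample_sheet("reads", "sample_sheet.csv") from the analysis directory would write one row per complete pair, and the resulting file can then be passed with --sample_sheet.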
Unfortunately, the sample sheet did not help. When I use --fastas and --sample_sheet on the command line, without referring to a config file, it runs the sarscov2 analysis:
Using the subworkflow for SARS-CoV-2
The files and directory for results is /data/Sequence_analysis/Cecret/Analyses/Covid_wastewater/wastewater_25Apr2023/cecret
Sample sheet found : /data/Sequence_analysis/Cecret/Analyses/Covid_wastewater/wastewater_25Apr2023/sample_sheet.txt
Amplicon BedFile : /app/becksts/.nextflow/assets/UPHL-BioNGS/Cecret/schema/artic_V4_SARS-CoV-2.insert.bed
Reference Genome : /app/becksts/.nextflow/assets/UPHL-BioNGS/Cecret/genomes/MN908947.3.fasta
GFF file for Reference Genome : /app/becksts/.nextflow/assets/UPHL-BioNGS/Cecret/genomes/MN908947.3.gff
Primer BedFile : /app/becksts/.nextflow/assets/UPHL-BioNGS/Cecret/schema/artic_V4_SARS-CoV-2.primer.bed
Paired-end Fastq files found : null
Paired-end Fastq files found : null
Paired-end Fastq files found : null
Paired-end Fastq files found : null
Paired-end Fastq files found : null
Paired-end Fastq files found : null
Paired-end Fastq files found : null
Paired-end Fastq files found : null
The fastq files are actually found, even though the message indicates otherwise, and the program is comparing those files against sarscov2. This does not find anything obvious, and it prevents analysis of other viral genomes. If I specify a config file, it fails right away.
Is there a way to use Cecret to determine which viral genomes are in the wastewater and compare against those, instead of comparing specifically for sarscov2?
Nextflow can't parse the names of your fastq files (see the warning under https://github.com/UPHL-BioNGS/Cecret#getting-files-from-directories)
To side-step this issue:
Is there a way to use Cecret to determine which viral genomes are in the wastewater and compare against those, instead of comparing specifically for sarscov2?
Cecret is reference-based at its core and expects the user to know what genome they are looking for. I think you are hoping for a more metagenomic analysis. Cecret can use Kraken2 to classify reads, but it does not then attempt to bin them or align them to multiple references.
Have you tried MAG (https://nf-co.re/mag)?
Thanks! I’ll try MAG.
Stephen M. Beckstrom-Sternberg, PhD Bioinformatics Contractor
Arizona State Public Health Lab Arizona Department of Health Services Cell: (602) 653-5011 Email: @.***
Best of luck to you! If you run into issues with MAG, just ask in their Slack channel. They are a friendly, helpful bunch of people in my experience.
Thanks
I am getting an error running Cecret v.3.6.20230425 on a set of gzipped paired-end reads from wastewater. Here is the command line:
nextflow -bg run UPHL-BioNGS/Cecret -profile singularity --reads reads
Error message:
Missing `fromPath` parameter
FATAL : No input files were found!
No paired-end fastq files were found at /data/Sequence_analysis/Cecret/Analyses/Covid_wastewater/wastewater_25Apr2023/reads. Set 'params.reads' to directory with paired-end reads
Here is a subset of the reads files:
-rwxrwxr-x 2 becksts becksts 836330937 Apr 25 16:42 AN20230313_1.fastq.gz
-rwxrwxr-x 2 becksts becksts 858250873 Apr 25 16:42 AN20230313_2.fastq.gz
-rwxrwxr-x 2 becksts becksts 1105000849 Apr 25 16:42 BR20230307_1.fastq.gz
-rwxrwxr-x 2 becksts becksts 1140275026 Apr 25 16:42 BR20230307_2.fastq.gz
I have tried changing the path to an absolute path, but I cannot get past this error.