Open DrB-S opened 1 year ago
@DrB-S It looks like an issue with the extension of your fastq file name. The default is set to 'R1_001.fastq.gz'. Could you share an example name of the fastq files that you are trying to run?
Sure:
AB0313_S35_L001_R1_001.fastq.gz
AB0313_S35_L001_R2_001.fastq.gz
Stephen M. Beckstrom-Sternberg, PhD Bioinformatics Contractor
Arizona State Public Health Lab Arizona Department of Health Services Cell: (602) 653-5011 Email: @.***
-- CONFIDENTIALITY NOTICE: This e-mail is the property of the Arizona Department of Health Services and contains information that may be PRIVILEGED, CONFIDENTIAL, or otherwise exempt from disclosure by applicable law. It is intended only for the person(s) to whom it is addressed. If you have received this communication in error, please do not retain or distribute it. Please notify the sender immediately by e-mail at the address shown above and delete the original message. Thank you.
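For reference, the extension matching discussed above can be sketched in a few lines of Python. This is only an illustration, assuming the samplesheet script selects files with a `*<extension>` glob pattern (as the `glob.glob` call in its traceback suggests); the function name `matching_reads` is hypothetical and not part of the repository.

```python
from fnmatch import fnmatch

# Default extension quoted in the thread; everything else here is illustrative.
DEFAULT_EXTENSION = "R1_001.fastq.gz"


def matching_reads(filenames, extension=DEFAULT_EXTENSION):
    """Return the names that a '*<extension>' glob pattern would select."""
    pattern = "*" + extension
    return [name for name in filenames if fnmatch(name, pattern)]


names = [
    "AB0313_S35_L001_R1_001.fastq.gz",
    "AB0313_S35_L001_R2_001.fastq.gz",
]
# Only the forward-read file matches the default pattern.
print(matching_reads(names))
```

With the two file names from this thread, the forward read matches the default extension, which is consistent with the names not being the cause of the failure.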
The file names look correct so it should ideally work. As the scripts rely on a specific directory structure we use here at UPHL, I would also make sure of that. Are you running just the viralrecon script? Could you please share the full log of the command you ran and its output?
No. I am running the script that calls all three scripts.
Here is my command-line:
sh ~/.nextflow/assets/UPHL-BioNGS/Wastewater-genomic-analysis/run_wwtp_sequencing_analysis.sh Wastewater_17Jul2023
And below is the output (not sure why singularity is not a configuration profile):
Purpose: Bash script to automate wastewater sequencing analysis. Consists of three individual scripts
1) WWP_seq_initialize_analysis.sh - Set up folder structure for running for sequencing data analysis and cleans up fastq filenames for NCBI submission.
2) run_viralrecon.sh - Run viralrecon bioinformatic pipeline with wastewater sequencing data.
3) run_freyja_vrn_noBoot.sh - Run Freyja with BAM files from viralrecon and generate final output files for Microreact visualization
Usage: run_wwtp_sequencing_analysis.sh
Last updated on June 05,2023
Thu Jul 20 15:26:45 MST 2023 : Step 1/3. Set up wastewater sequencing analysis
Purpose:
1) This is the first step in script run_wwtp_sequencing_analysis_v2 which sets up directory structure for initiating Wastewater sequencing run analysis and any downstream analysis.
3) Generate ncbi submission folder that can be directly used for uploading files to NCBI and create a csv file used for uploading into Data-flo to extract biosample and SRA metadata tables.
Usage:
sh WWP_seq_new_run_auto.sh
Thu Jul 20 15:26:45 MST 2023 : Fastq generation step is not yet completed for run Wastewater_17Jul2023. Exiting...
Thu Jul 20 15:26:45 MST 2023 : Step 2/3. Run viralrecon
Purpose: Bash script to run viralrecon bioinformatic pipeline with wastewater sequencing data.
Usage: run_viralrecon.sh
Last updated on May 16,2023
Thu Jul 20 15:26:45 MST 2023 : Run Wastewater sample data with viralrecon for run Wastewater_17Jul2023
Thu Jul 20 15:26:45 MST 2023 : First create input samplesheet for viralrecon pipeline
Thu Jul 20 15:26:45 MST 2023 : /data/Sequence_analysis/Wastewater-genomic-analysis//Wastewater_17Jul2023/analysis/viralrecon/Wastewater_17Jul2023_samplesheet.csv already exists, starting viralrecon
Thu Jul 20 15:26:45 MST 2023 : Running viralrecon
N E X T F L O W ~ version 23.04.2
Unknown configuration profile: 'singularity'
Thu Jul 20 15:26:47 MST 2023 : Checking if the viralrecon pipeline completed successfully
Thu Jul 20 15:26:47 MST 2023 : Oops .. something went wrong and pipeline stopped
===
I also noticed that UPHL_viralrecon.config seems problematic when I view it in Visual Studio Code. It says, “Content is not allowed in prolog”.
Thanks for any suggestions,
Stephen M. Beckstrom-Sternberg, PhD Bioinformatics Contractor
I am running the wastewater pipeline. I have created the sample sheet, and it is in the run directory. However, the script cannot find the sample sheet, and when it tries to create a new one, I get the following error at line 96 of fastq_dir_to_samplesheet.py:
Tue Jul 18 13:24:18 MST 2023 : Run Wastewater sample data with viralrecon for run Wastewater_17Jul2023
Tue Jul 18 13:24:18 MST 2023 : First create input samplesheet for viralrecon pipeline
Tue Jul 18 13:24:18 MST 2023 : Wastewater_17Jul2023_samplesheet.csv does not exist. Creating samplesheet required to run viralrecon
  File "/data/home/becksts/.nextflow/assets/UPHL-BioNGS/Wastewater-genomic-analysis/conf-files/fastq_dir_to_samplesheet.py", line 96
    glob.glob(os.path.join(fastq_dir, f"*{extension}"), recursive=False)
                                                    ^
SyntaxError: invalid syntax
Tue Jul 18 13:24:18 MST 2023 : Checking if the viralrecon pipeline completed successfully
Tue Jul 18 13:24:18 MST 2023 : Oops .. something went wrong and pipeline stopped
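The SyntaxError above points at an f-string (`f"*{extension}"`). F-strings were added in Python 3.6, so one possible cause, not confirmed in this thread, is that the script was run by an older interpreter (for example a system `python` that resolves to Python 2). A minimal sketch for checking this follows; it deliberately avoids f-strings so that it also runs on old interpreters, and the helper name `supports_fstrings` is hypothetical.

```python
import sys

# f-strings (PEP 498) were introduced in Python 3.6.
MIN_VERSION = (3, 6)


def supports_fstrings(version_info=None):
    """Return True if the given (or current) interpreter can parse f-strings."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= MIN_VERSION


found = "%d.%d" % tuple(sys.version_info[:2])
if supports_fstrings():
    print("Python %s can parse f-strings" % found)
else:
    print("Python %s is too old for f-strings; 3.6 or newer is needed" % found)
```

Running this with the same interpreter that the wrapper script uses to call fastq_dir_to_samplesheet.py would show whether the interpreter version is the likely culprit.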