MaestSi / MetONTIIME

A Meta-barcoding pipeline for analysing ONT data in QIIME2 framework
GNU General Public License v3.0

Error executing process > 'downsampleFastq' #84

Closed erazzolini closed 10 months ago

erazzolini commented 10 months ago

Hello there,

I'm trying to run MetONTIIME on my data set (a few ITS reads), but I'm having a problem with downsampleFastq (I think).

When I run:

nextflow -c metontiime2.conf run metontiime2.nf --resultsDir=/Users/emanuel/teste_microbioma_nfn/its/completo/results

I received the error:

    N E X T F L O W ~ version 23.10.0
    Launching metontiime2.nf [determined_hugle] DSL2 - revision: 24f9bc59a5
    executor > local (4)
    [16/b07488] process > importDb (1) [ 0%] 0 of 1
    [88/5e64f8] process > concatenateFastq [100%] 1 of 1 ✔
    [56/11796d] process > filterFastq [100%] 1 of 1 ✔
    [50/f9dcff] process > downsampleFastq [ 0%] 0 of 1
    [- ] process > importFastq -
    [- ] process > derepSeq -
    [- ] process > assignTaxonomy -
    [- ] process > filterTaxa -
    [- ] process > taxonomyVisualization -
    [- ] process > collapseTables -
    [- ] process > dataQC -
    executor > local (4)
    [- ] process > importDb (1) -
    [88/5e64f8] process > concatenateFastq [100%] 1 of 1 ✔
    [56/11796d] process > filterFastq [100%] 1 of 1 ✔
    [50/f9dcff] process > downsampleFastq [100%] 1 of 1, failed: 1 ✘
    [- ] process > importFastq -
    [- ] process > derepSeq -
    [- ] process > assignTaxonomy -
    [- ] process > filterTaxa -
    [- ] process > taxonomyVisualization -
    [- ] process > collapseTables -
    [- ] process > dataQC -
    [- ] process > diversityAnalyses -

ERROR ~ Error executing process > 'downsampleFastq'

Caused by: Process downsampleFastq terminated with an error exit status (1)

Command executed:

    mkdir -p /Users/emanuel/teste_microbioma_nfn/its/completo/results/downsampleFastq
    fq=$(find /Users/emanuel/teste_microbioma_nfn/its/completo/results/filterFastq/ | grep ".fastq.gz$");
    for f in $fq; do
        sn=$(basename $f);
        seqtk sample $f 1000 | gzip > /Users/emanuel/teste_microbioma_nfn/its/completo/results/downsampleFastq/$sn
    done

Command exit status: 1

Command output: (empty)

Work dir: /Users/emanuel/teste_microbioma_nfn/its/completo/MetONTIIME/work/50/f9dcff9acfe9dcd5392ff383b3ef84

Tip: you can replicate the issue by changing to the process work dir and entering the command bash .command.run

-- Check '.nextflow.log' file for details

I tried a few things, such as setting it to false, changing maxNumReads to 10, 100, 1000, 10000 and 100000, changing clusteringIdentity, and putting the files in the right folders, but nothing works.

My .conf file currently looks like this:

    //Path to working directory including fastq.gz files
    workDir="/Users/emanuel/teste_microbioma_nfn/its/completo/barcode53/"
    //Path to sample metadata tsv file; if it doesn't exist yet, it is created at runtime
    sampleMetadata="/Users/emanuel/teste_microbioma_nfn/its/completo/sample-metadata.tsv"
    //Path to database file with sequences in fasta format
    dbSequencesFasta="/Users/emanuel/teste_microbioma_nfn/its/completo/unite_db/sh_refs_qiime_ver$
    //Path to database file with sequence id-to-taxonomy correspondence in tsv format
    dbTaxonomyTsv="/Users/emanuel/teste_microbioma_nfn/its/completo/unite_db/sh_taxonomy_qiime_ve$
    //Name of database file with sequences as QIIME2 artifact (qza); if it is already available, $
    dbSequencesQza="/Users/emanuel/teste_microbioma_nfn/its/completo/unite_db/unite_ver9_99_class$
    //Name of database file with sequence id-to-taxonomy correspondence as QIIME2 artifact (qza);$
    dbTaxonomyQza="/Users/emanuel/teste_microbioma_nfn/its/completo/unite_db/sh_taxonomy_qiime_ve$
    //Taxonomy classifier, available: VSEARCH, Blast
    classifier="Blast"
    //maxNumReads is the maximum number of reads per sample; if one sample has more than maxNumRe$
    maxNumReads=1000
    //minReadLength is the minimum length (bp) for a read to be retained
    minReadLength=200
    //maxReadLength is the maximum length (bp) for a read to be retained
    maxReadLength=5000
    //minQual is the minimum average PHRED score for a read to be retained
    minQual=10
    //Number of bases to be trimmed at both ends
    extraEndsTrim=0
    //Identity for de novo clustering [0-1]
    clusteringIdentity=0.9
    //Maximum number of candidate hits for each read, to be used for consensus taxonomy assignment
    maxAccepts=3
    //Minimum fraction of assignments must match top hit to be accepted as consensus assignment [$
    minConsensus=0.7
    //Minimum query coverage for an alignment to be considered a candidate hit [0-1]
    minQueryCoverage=0.8
    //Minimum alignment identity for an alignment to be considered a candidate hit [0-1]
    minIdentity=0.9
    //Taxonomy level at which you want to perform non-phylogeny-based diversity analyses
    taxaLevelDiversity=6  
    //Max num. reads for diversity analyses
    numReadsDiversity=500
    //Taxa of interest that you want to retain and to focus the analysis on
    taxaOfInterest=""
    //Minimum number of reads assigned to Taxa of interest to retain a sample
    minNumReadsTaxaOfInterest=1
    //Path to directory containing results
    resultsDir="/path/to/resultsDir"

    help=false

    // Flags to select which process to run
    concatenateFastq = true
    filterFastq = true    
    downsampleFastq = true
    importFastq = true   
    dataQC = true
    importDb = true
    derepSeq = true
    assignTaxonomy = true
    taxonomyVisualization = true
    collapseTables = true
    filterTaxa = false
    diversityAnalyses = true

I also set up my /Users folder to be shared with Docker.

Is there anything I can do to fix it?

MaestSi commented 10 months ago

Hi, first I noticed that the dbSequencesQza and dbTaxonomyQza variables include the full path rather than the file name only; please leave only the base name (the full path is needed for dbSequencesFasta and dbTaxonomyTsv instead). Can you please show the content of the /Users/emanuel/teste_microbioma_nfn/its/completo/results/filterFastq/ folder? Does it contain any fastq.gz files? Thanks, SM
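A quick way to answer the question about filterFastq is to count the fastq.gz files that the downsampleFastq step would see. The sketch below reuses the find-then-grep pattern from the failing command (with the dots escaped and restricted to regular files). The function names list_fastq_gz and count_fastq_gz are illustrative and not part of MetONTIIME.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of MetONTIIME): list and count the fastq.gz
# files under a directory, using a find | grep pattern like the failing step.
list_fastq_gz() {
    local dir=$1
    # "|| true" keeps the function from failing when grep matches nothing.
    find "$dir" -type f | grep '\.fastq\.gz$' || true
}

count_fastq_gz() {
    # grep -c . counts non-empty lines; it exits 1 on a zero count, hence "|| true".
    list_fastq_gz "$1" | grep -c . || true
}

# Example (path taken from the error message in this thread):
# count_fastq_gz /Users/emanuel/teste_microbioma_nfn/its/completo/results/filterFastq/
```

A count of 0 would mean the filtering step wrote nothing, so the problem is upstream of downsampleFastq.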

erazzolini commented 10 months ago

Hello MaestSi, thank you for your time.

I changed the full path in dbSequencesQza and dbTaxonomyQza.

The filterFastq folder is still empty. Even if I copy the fastq.gz files from the concatenatedFastq folder into it and run MetONTIIME again, all the files in the filterFastq folder are removed.

I'm trying to run it on my MacBook and on a Linux computer, and the error is still the same.

ERROR ~ Error executing process > 'downsampleFastq'

Caused by: Process downsampleFastq terminated with an error exit status (1)

Command executed:

    mkdir -p /home/emanuelr/teste_microbioma_nfn/its/completo/results/downsampleFastq
    fq=$(find /home/emanuelr/teste_microbioma_nfn/its/completo/results/filterFastq/ | grep ".fastq.gz$");
    for f in $fq; do
        sn=$(basename $f);
        seqtk sample $f 1000 | gzip > /home/emanuelr/teste_microbioma_nfn/its/completo/results/downsampleFastq/$sn
    done

Command exit status: 1

Command output: (empty)

Work dir: /home/emanuelr/teste_microbioma_nfn/its/completo/MetONTIIME/work/d8/fb8dd90e6853aebd2a7674659c5981

Tip: you can replicate the issue by changing to the process work dir and entering the command bash .command.run

-- Check '.nextflow.log' file for details

MaestSi commented 10 months ago

Does workDir="/Users/emanuel/teste_microbioma_nfn/its/completo/barcode53/" contain fastq.gz files still to be merged? If yes, it is ok to keep concatenateFastq = true; otherwise it should be set to false. Moreover, I think workDir should end one level above the barcode53 folder. In other words, you should try with: workDir="/Users/emanuel/teste_microbioma_nfn/its/completo/". Before trying this, please remove the manifest.txt file and also the sample-metadata.tsv file, if it was automatically generated by the pipeline. As a last point, did you remember to mount the /Users directory, so that Docker can access it? Ciao, SM
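For context on why an empty filterFastq folder could surface as exit status 1 with no output: assuming Nextflow runs task scripts with its default bash -ue shell, an assignment whose command substitution ends in a grep that matches nothing returns 1, and -e stops the script before seqtk ever runs. The sketch below reproduces that in isolation. find_grep_status is a hypothetical name, and this is one plausible reading of the log, not a confirmed diagnosis.

```shell
#!/usr/bin/env bash
# Hypothetical reproduction (not part of MetONTIIME): run the find | grep
# assignment from the failing command in a child shell started with -ue,
# then print that child shell's exit status.
find_grep_status() {
    local dir=$1 status=0
    bash -uec 'fq=$(find "$1" | grep ".fastq.gz$"); echo "matched: $fq"' _ "$dir" \
        >/dev/null || status=$?
    echo "$status"
}

# Example:
# find_grep_status /path/to/results/filterFastq/
```

If this prints a non-zero status for the filterFastq folder, the step to debug is filterFastq (or the input it reads), not downsampleFastq itself.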

erazzolini commented 10 months ago

Hello, it's me again.

I've changed workDir to one level above the barcode53 folder, but it didn't work. I set concatenateFastq to false and concatenated the files myself, but I still get the same error. I also ran a test with the Zyro control, but that didn't work either.

I'm attaching my .conf file. I've run this file on three different computers (one Mac, one Linux, and one Windows machine with WSL and Docker running), but the downsample error is still the same.

With Docker, I've mounted my /home and /Users folders.

MaestSi commented 10 months ago

Hi, if this is the command line you used: nextflow -c metontiime2.conf run metontiime2.nf --resultsDir=/Users/emanuel/teste_microbioma_nfn/its/completo/results you are missing -profile docker. SM

erazzolini commented 10 months ago

On one system (the one from the conf file) I don't have permission to run with -profile docker, but on the other ones I used the full file, with -profile docker, and the error is the same.

MaestSi commented 10 months ago

If on one system you don't have either docker or singularity, you can't run the pipeline on that system. If you want, please provide the config file you used on the system with Docker and the command line you used. SM
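A small pre-flight check can make this explicit before launching the pipeline. The sketch below only tests whether a docker or singularity executable is on PATH; it does not verify that the daemon is running or that you have permission to use it. The function name find_container_runtime is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check (not part of MetONTIIME): print the first
# container runtime found on PATH, or fail if neither is installed.
find_container_runtime() {
    local tool
    for tool in docker singularity; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            return 0
        fi
    done
    echo "neither docker nor singularity found on PATH" >&2
    return 1
}

# Example: only launch with the docker profile when docker is available.
# if [ "$(find_container_runtime)" = "docker" ]; then
#     nextflow -c metontiime2.conf run metontiime2.nf -profile docker
# fi
```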

erazzolini commented 10 months ago

I've made a few changes (such as removing the trailing / from the last folder, copying the files, and setting a few steps to false), but now I get another error, in importFastq, which I think is a QIIME error?

ERROR ~ Error executing process > 'importFastq'

Caused by: Process importFastq terminated with an error exit status (132)

Command executed:

    mkdir -p /Users/emanuel/teste_microbioma_nfn/its/completo/importFastq

    fq=$(realpath $(find /Users/emanuel/teste_microbioma_nfn/its/completo/downsampleFastq/ -maxdepth 1 | grep ".fastq.gz"))
    manifestFile=/Users/emanuel/teste_microbioma_nfn/its/completo/importFastq/manifest.txt

    if [ ! -f "$manifestFile" ]; then
        echo -e sample-id" "absolute-filepath > $manifestFile;
        for f in $fq; do
            s=$(echo $(basename $f) | sed 's/.fastq.gz//g');
            echo -e $s" "$f >> $manifestFile;
        done
    fi

    if [ ! -f /Users/emanuel/teste_microbioma_nfn/its/completo/sample-metadata.tsv ]; then
        echo -e sample-id" "sample-name > /Users/emanuel/teste_microbioma_nfn/its/completo/sample-metadata.tsv;
        for f in $fq; do
            s=$(echo $(basename $f) | sed 's/.fastq.gz//g');
            echo -e $s" "$s >> /Users/emanuel/teste_microbioma_nfn/its/completo/sample-metadata.tsv;
        done
    fi

    qiime tools import --type 'SampleData[SequencesWithQuality]' --input-path $manifestFile --input-format 'SingleEndFastqManifestPhred33V2' --output-path /Users/emanuel/teste_microbioma_nfn/its/completo/importFastq/sequences.qza

    ln -s /Users/emanuel/teste_microbioma_nfn/its/completo/importFastq/sequences.qza ./sequences.qza

Command exit status: 132

Command output: (empty)

Command error:

    WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
    .command.sh: line 23: 13 Illegal instruction qiime tools import --type 'SampleData[SequencesWithQuality]' --input-path $manifestFile --input-format 'SingleEndFastqManifestPhred33V2' --output-path /Users/emanuel/teste_microbioma_nfn/its/completo/importFastq/sequences.qza

Work dir: /Users/emanuel/teste_microbioma_nfn/its/completo/MetONTIIME/work/4d/9c06d0a952dbadf1ac3d3cc40a543f

Tip: when you have fixed the problem you can continue the execution adding the option -resume to the run command line

-- Check '.nextflow.log' file for details

MaestSi commented 10 months ago

Hi, I don't think it's a QIIME error. It seems Docker is complaining that the MetONTIIME image was built for Linux/amd64, while you are running on a Linux/arm64/v8 platform. Do you have access to any other server with Linux/amd64 architecture with Docker or Singularity available? SM
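The exit status fits this reading: 132 is 128 + 4, and signal 4 is SIGILL, which matches the "Illegal instruction" message. To check a machine before trying again, the sketch below maps the output of uname -m to the platform strings Docker prints in its warning. The helper names are hypothetical, and the mapping covers only the two architectures mentioned in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of MetONTIIME): map `uname -m` output to the
# platform string Docker uses in its warning message.
docker_platform_for() {
    case "$1" in
        x86_64|amd64)  echo "linux/amd64" ;;
        aarch64|arm64) echo "linux/arm64" ;;
        *)             echo "unknown" ;;
    esac
}

# Succeeds only when the host can run a linux/amd64 image natively.
host_is_amd64() {
    [ "$(docker_platform_for "$(uname -m)")" = "linux/amd64" ]
}

# Example:
# host_is_amd64 || echo "host is $(uname -m): a linux/amd64 image would need emulation" >&2
```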

MaestSi commented 10 months ago

Closing due to inactivity. SM

erazzolini commented 10 months ago

Dear MaestSi, sorry for my delayed reply; I spent some time explaining to the service admin that I need permission to use Docker. Now that all the permissions on folders and files are OK, I have a new error:

Caused by: Process importFastq terminated with an error exit status (2)

Command executed:

    mkdir -p /home/emanuelr/teste_microbioma_nfn/its/completo/importFastq

    fq=$(realpath $(find /home/emanuelr/teste_microbioma_nfn/its/completo/downsampleFastq/ -maxdepth 1 | grep ".fastq.gz"))
    manifestFile=/home/emanuelr/teste_microbioma_nfn/its/completo/importFastq/manifest.txt

    if [ ! -f "$manifestFile" ]; then
        echo -e sample-id" "absolute-filepath > $manifestFile;
        for f in $fq; do
            s=$(echo $(basename $f) | sed 's/.fastq.gz//g');
            echo -e $s" "$f >> $manifestFile;
        done
    fi

    if [ ! -f ]; then
        echo -e sample-id" "sample-name > ;
        for f in $fq; do
            s=$(echo $(basename $f) | sed 's/.fastq.gz//g');
            echo -e $s" "$s >> ;
        done
    fi

    qiime tools import --type 'SampleData[SequencesWithQuality]' --input-path $manifestFile --input-format 'SingleEndFastqManifestPhred33V2' --output-path /home/emanuelr/teste_microbioma_nfn/its/completo/importFastq/sequences.qza

    ln -s /home/emanuelr/teste_microbioma_nfn/its/completo/importFastq/sequences.qza ./sequences.qza

Command exit status: 2

Command output: (empty)

Command error: .command.sh: line 16: syntax error near unexpected token `;'

MaestSi commented 10 months ago

Hi, I need the command you ran and the config file you used, together with the content of the directory containing fastq.gz files, thanks. SM
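One thing already visible in the command above: the second if block has nothing after -f and nothing after the > redirections, and a redirection with no target is what bash reports as a syntax error near `;'. That would be consistent with the sample metadata path being empty when the script was rendered, though that is an assumption until the config file is shown. The sketch below renders a simplified version of that block with a given path and asks bash -n to parse it without running it. Both function names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical illustration (not part of MetONTIIME): render a simplified
# version of the sample-metadata block with the given path substituted in.
render_metadata_block() {
    printf 'if [ ! -f %s ]; then\n    echo -e sample-id" "sample-name > %s;\nfi\n' "$1" "$1"
}

# Parse the rendered block with `bash -n` (syntax check only, nothing is run).
# Returns 0 if the block parses, non-zero otherwise.
syntax_ok() {
    local script rc=0
    script=$(mktemp)
    render_metadata_block "$1" > "$script"
    bash -n "$script" 2>/dev/null || rc=$?
    rm -f "$script"
    return "$rc"
}

# Example:
# syntax_ok "/path/to/sample-metadata.tsv" && echo "parses"
# syntax_ok "" || echo "empty path: syntax error"
```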

erazzolini commented 9 months ago

Hello,

I ran: nextflow -c metontiime2.conf run metontiime2.nf -profile docker

In my original folder I have one file called barcode53.fastq (only one file).

In the other folders that Nextflow creates I have another file with the same name, and in the importFastq folder I have only one manifest.txt file.

The log of my run:

nextflow.log

MaestSi commented 9 months ago

Hi, the fastq file should be gzip compressed, you should only have fastq.gz files. SM
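To check whether a file is really gzip data rather than just named .gz, gzip's own integrity test can be used. The sketch below reads the file on standard input so the file name suffix does not matter. is_gzipped is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of MetONTIIME): succeed only if the file
# content passes gzip's integrity test, regardless of its extension.
is_gzipped() {
    gzip -t < "$1" 2>/dev/null
}

# Example with the file name mentioned in this thread:
# is_gzipped barcode53.fastq || gzip barcode53.fastq   # writes barcode53.fastq.gz
```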

erazzolini commented 9 months ago

Sorry, the file in the first folder, where the fastq file for analysis comes from, is in fastq.gz format. In fact, I tried both fastq and fastq.gz, but the error remains the same.

Although the command creates the folders, I believe that Docker creates them with root permissions, so I do not have access and cannot make changes to them. Could this be a problem?

MaestSi commented 9 months ago

It looks like Docker is not configured to run without root privileges, could this be the case? Edit: Docker usually creates files and folders with root privileges, but you must be sure you can run docker run hello-world without sudo. SM
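On a Linux server, running Docker without sudo usually means the user belongs to the docker group, so that is a reasonable first thing to check. The sketch below tests a group list such as the one printed by id -nG. The helper name in_docker_group is hypothetical, and whether your server grants access this way is an assumption to confirm with the admin.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of MetONTIIME): check whether a space-separated
# group list, as printed by `id -nG`, contains the "docker" group.
in_docker_group() {
    case " $1 " in
        *" docker "*) return 0 ;;
        *)            return 1 ;;
    esac
}

# Example on a Linux host:
# in_docker_group "$(id -nG)" || echo "user is not in the docker group" >&2
# docker run hello-world    # should work without sudo once access is set up
```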