nf-core / ampliseq

Amplicon sequencing analysis workflow using DADA2 and QIIME2
https://nf-co.re/ampliseq
MIT License

Pipeline error #55

Closed · yanxianl closed this issue 5 years ago

yanxianl commented 5 years ago

Hi,

I tried to run the nextflow pipeline for a small dataset but encountered an error.

The commands I used:

nextflow run nf-core/rrna-ampliseq \
  -profile standard,docker \
  -name "test1" \
  -r 1.0.0 \
  --reads '/home/nutrition_group/desktop/data/Yanxian/misc/beta-conglycinin/16s/casava-18-paired-end-demultiplexed' \
  --untilQ2import  \
  --Q2imported  \
  --FW_primer GTGCCAGCMGCCGCGGTAA \
  --RV_primer GGACTACHVGGGTWTCTAAT \
  --trunclenf 239 \
  --trunclenr 230 \
  --retain_untrimmed \
  --metadata "/home/nutrition_group/desktop/data/Yanxian/misc/beta-conglycinin/16s/metadata.tsv"\
  --metadata_category "Diet,Compartment" \
  --exclude_taxa "mitochondria,chloroplast" \
  --outdir "/home/nutrition_group/desktop/data/Yanxian/misc/beta-conglycinin/16s/nextflow/" \
  --email "yanxianl@nmbu.no" \
  --max_memory '16.GB' \
  --max_cpus 12 

The error message:

ERROR ~ No signature of method: static nextflow.Channel.fromFile() is applicable for argument types: (org.codehaus.groovy.runtime.GStringImpl) values: [true]
Possible solutions: from([Ljava.lang.Object;), from(java.util.Collection), fromPath(java.lang.Object)

 -- Check script 'main.nf' at line: 129 or see '.nextflow.log' file for more details

My java version:

java version "1.8.0_191"
Java(TM) SE Runtime Environment (build 1.8.0_191-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.191-b12, mixed mode)

Regards, Yanxian

d4straub commented 5 years ago

Issues that I see with your command: (1) the pipeline was renamed to nf-core/ampliseq, sorry for the confusion; (2) --untilQ2import and --Q2imported are mutually exclusive (we should probably document that); (3) --Q2imported expects a path, not a bare flag.
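For illustration, once a demux.qza exists, resuming from it would look something like this (a sketch; the artifact path is a placeholder):

nextflow run -r 1.0.0 nf-core/ampliseq \
  -profile standard,docker \
  --Q2imported "/path/to/demux.qza" \
  --FW_primer GTGCCAGCMGCCGCGGTAA \
  --RV_primer GGACTACHVGGGTWTCTAAT \
  --trunclenf 239 \
  --trunclenr 230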

Could you please first try the following minimal test command: nextflow run -r 1.0.0 nf-core/ampliseq -profile test,docker. It shouldn't take longer than 20 min, and then we'll know whether the pipeline runs at all. If it works, try:

nextflow run -r 1.0.0 nf-core/ampliseq \
  -profile standard,docker \
  --reads '/home/nutrition_group/desktop/data/Yanxian/misc/beta-conglycinin/16s/casava-18-paired-end-demultiplexed' \
  --untilQ2import  \
  --FW_primer GTGCCAGCMGCCGCGGTAA \
  --RV_primer GGACTACHVGGGTWTCTAAT \
  --trunclenf 239 \
  --trunclenr 230 \
  --retain_untrimmed \
  --metadata "/home/nutrition_group/desktop/data/Yanxian/misc/beta-conglycinin/16s/metadata.tsv" \
  --metadata_category "Diet,Compartment"

edit: easier to read

yanxianl commented 5 years ago

Hi, I ran the minimal test command but still got errors:

N E X T F L O W  ~  version 18.10.1
Pulling nf-core/ampliseq ...
 downloaded from https://github.com/nf-core/ampliseq.git
Launching `nf-core/ampliseq` [agitated_goldstine] - revision: f0357d61cf [1.0.0]
=======================================================
                                          ,--./,-.
          ___     __   __   __   ___     /,-._.--~'
    |\ | |__  __ /  ` /  \ |__) |__         }  {
    | \| |       \__, \__/ |  \ |___     \`-._,-`-,
                                          `._,._,'

nf-core/ampliseq v1.0.0
=======================================================
Pipeline Name  : nf-core/ampliseq
Pipeline Version: 1.0.0
Run Name       : agitated_goldstine
Reads          : data/*_L001_R{1,2}_001.fastq.gz
Max Memory     : 6 GB
Max CPUs       : 2
Max Time       : 2d
Output dir     : ./results
Working dir    : /home/nutrition_group/work
Container Engine: docker
Container      : nfcore/ampliseq:1.0.0
Current home   : /home/nutrition_group
Current user   : nutrition_group
Current path   : /home/nutrition_group
Script dir     : /home/nutrition_group/.nextflow/assets/nf-core/ampliseq
Config Profile : test,docker
=========================================
[warm up] executor > local
[2a/2d1ead] Submitted process > get_software_versions
[20/08699e] Submitted process > output_documentation (1)
ERROR ~ Error executing process > 'get_software_versions'

Caused by:
  Process `get_software_versions` terminated with an error exit status (126)

Command executed:

  echo 1.0.0 > v_pipeline.txt
  echo 18.10.1 > v_nextflow.txt
  fastqc --version > v_fastqc.txt
  multiqc --version > v_multiqc.txt
  scrape_software_versions.py > software_versions_mqc.yaml

Command exit status:
  126

Command output:
  (empty)

Command error:
  docker: Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Post http://%2Fvar%2Frun%2Fdocker.sock/v1.38/containers/create?name=nxf-zFr97dCpHAADbqIMAWRIisfi: dial unix /var/run/docker.sock: connect: permission denied.
  See 'docker run --help'.

Work dir:
  /home/nutrition_group/work/2a/2d1ead49e43fdef5dc287420fdf69e

Tip: when you have fixed the problem you can continue the execution appending to the nextflow command line the option `-resume`

 -- Check '.nextflow.log' file for details
Execution cancelled -- Finishing pending tasks before exit
[nf-core/ampliseq] Pipeline Complete

Do I need to install the tools used in the pipeline under my home directory, where nextflow is installed? As far as I understand, that shouldn't be necessary.

d4straub commented 5 years ago

This looks like a Docker-related issue to me. To run nextflow with containers you need either Singularity or Docker installed, and you specified Docker to execute the container. Is Docker installed?

apeltzer commented 5 years ago

Yes, this is a Docker installation / configuration issue. Please follow the Docker documentation first on how to start/test your local Docker installation for example here: https://docs.docker.com/install/linux/docker-ce/ubuntu/

yanxianl commented 5 years ago

Hi, thanks! Docker was installed and tested as suggested. Here's what I got when I ran the first test command:

sudo docker run hello-world
[sudo] password for nutrition_group: 

Hello from Docker!
This message shows that your installation appears to be working correctly.

To generate this message, Docker took the following steps:
 1. The Docker client contacted the Docker daemon.
 2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
    (amd64)
 3. The Docker daemon created a new container from that image which runs the
    executable that produces the output you are currently reading.
 4. The Docker daemon streamed that output to the Docker client, which sent it
    to your terminal.

To try something more ambitious, you can run an Ubuntu container with:
 $ docker run -it ubuntu bash

Share images, automate workflows, and more with a free Docker ID:
 https://hub.docker.com/

For more examples and ideas, visit:
 https://docs.docker.com/get-started/

But when I ran the second test command, I got a permission-denied error. Could this be related to the pipeline error?

docker run -it ubuntu bash
docker: Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Post http://%2Fvar%2Frun%2Fdocker.sock/v1.38/containers/create: dial unix /var/run/docker.sock: connect: permission denied.
See 'docker run --help'.
d4straub commented 5 years ago

Follow these steps as well to solve the permission problem: https://docs.docker.com/install/linux/linux-postinstall/#manage-docker-as-a-non-root-user
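For reference, the key steps from that page are roughly:

# create the docker group and add your user to it
sudo groupadd docker
sudo usermod -aG docker $USER

# activate the new group membership (or log out and back in)
newgrp docker

# verify that docker now works without sudo
docker run hello-world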

d4straub commented 5 years ago

Does it work for you now?

yanxianl commented 5 years ago

Hi, sorry for the late reply. After properly configuring Docker, I ran the test command without any problems: nextflow run -r 1.0.0 nf-core/ampliseq -profile test,docker

Next, I ran the command block you suggested.

nextflow run -r 1.0.0 nf-core/ampliseq \
  -profile standard,docker \
  --reads '/home/nutrition_group/desktop/data/Yanxian/misc/beta-conglycinin/16s/casava-18-paired-end-demultiplexed' \
  --untilQ2import  \
  --FW_primer GTGCCAGCMGCCGCGGTAA \
  --RV_primer GGACTACHVGGGTWTCTAAT \
  --trunclenf 239 \
  --trunclenr 230 \
  --retain_untrimmed \
  --metadata "/home/nutrition_group/desktop/data/Yanxian/misc/beta-conglycinin/16s/metadata.tsv" \
  --metadata_category "Diet,Compartment"

The pipeline ran up to qiime_demux_visualize and then stopped without further action. I could find the HTML-based interactive sequence-quality plot under the results folder and determine the truncation lengths for my reads, but I was not able to find the imported .qza file to supply to the --Q2imported argument. How can I proceed?

Many thanks!

d4straub commented 5 years ago

Hi,

you've stumbled on an inconvenience there; I have opened an issue with an easy fix for the next release. For now, there are three ways for you to continue:
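One option that follows from the parameters discussed above is to import the data manually with QIIME2 and hand the resulting artifact to --Q2imported. A sketch (note that QIIME2-2018.6, which the pipeline uses, calls the format flag --source-format, while newer releases call it --input-format):

qiime tools import \
  --type 'SampleData[PairedEndSequencesWithQuality]' \
  --input-path casava-18-paired-end-demultiplexed \
  --source-format CasavaOneEightSingleLanePerSampleDirFmt \
  --output-path demux.qza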

Hope that helps

yanxianl commented 5 years ago

Hi, I started over, but I ran into a problem while building the SILVA classifier.

nextflow run -r 1.0.0 nf-core/ampliseq \
  -profile standard,docker \
  --reads '/home/nutrition_group/desktop/data/Yanxian/misc/beta-conglycinin/16s/qiime2/casava-18-paired-end-demultiplexed' \
  --FW_primer GTGCCAGCMGCCGCGGTAA \
  --RV_primer GGACTACHVGGGTWTCTAAT \
  --trunclenf 239 \
  --trunclenr 230 \
  --retain_untrimmed \
  --metadata "/home/nutrition_group/desktop/data/Yanxian/misc/beta-conglycinin/16s/qiime2/metadata.tsv"\
  --metadata_category "Diet" \
  --exclude_taxa "mitochondria,chloroplast" \
  --outdir "/home/nutrition_group/desktop/data/Yanxian/misc/beta-conglycinin/16s/nextflow/" \
  --email "yanxianl@nmbu.no" \
  --max_memory '16.GB' \
  --max_cpus 12 

Here's the error message:

Error executing process > 'make_SILVA_132_16S_classifier (1)'
Caused by:
  /home/nutrition_group/work/b1/cc7cc488152871d1eb50e240369425/SILVA_132_QIIME_release

I was asked to check the log file but I couldn't make sense of it. Here's the log file: nextflow.log

d4straub commented 5 years ago

Hi,

this looks to me like a user-rights problem, indicated by this line in the log: java.nio.file.AccessDeniedException: /home/nutrition_group/work/b1/cc7cc488152871d1eb50e240369425/SILVA_132_QIIME_release. I don't think it's a problem specific to this pipeline.

@apeltzer: any idea how to tackle this issue?

apeltzer commented 5 years ago

Sounds a bit weird to me too. Can you please use the Singularity container and try again?

-profile standard,singularity

If that doesn't work, we can at least easily check why the permissions don't fit.
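For reference, one quick way to inspect the mismatch, using the path from the log above:

# which user and groups do you run as?
id

# who owns the files the failing process produced?
ls -ld /home/nutrition_group/work/b1/cc7cc488152871d1eb50e240369425/SILVA_132_QIIME_release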

yanxianl commented 5 years ago

Thanks! I'll install Singularity, run the pipeline again, and see if that works. I'll update as soon as I've tested it.

yanxianl commented 5 years ago

Hi, I may have figured out the problem. The SILVA classifier requires much more memory than the Greengenes one does: in my case it needed 35 GB, which exceeded the available system memory (31 GB). Other QIIME2 users have reported similar issues.

The following is the error message from a run where I didn't restrict the memory and CPU usage:

ERROR ~ Error executing process > 'make_SILVA_132_16S_classifier (1)'

Caused by:
  Process requirement exceed available memory -- req: 35 GB; avail: 31.3 GB

Command executed:

  unzip -qq Silva_132_release.zip

  fasta="SILVA_132_QIIME_release/rep_set/rep_set_16S_only/99/silva_132_99_16S.fna"
  taxonomy="SILVA_132_QIIME_release/taxonomy/16S_only/99/consensus_taxonomy_7_levels.txt"

  if [ "false" = "true" ]; then
    sed 's/#//g' $taxonomy >taxonomy-99_removeHash.txt
    taxonomy="taxonomy-99_removeHash.txt"
    echo "######## WARNING! The taxonomy file was altered by removing all hash signs!"
  fi

  ### Import
  qiime tools import --type 'FeatureData[Sequence]' \
    --input-path $fasta --output-path ref-seq-99.qza
  qiime tools import --type 'FeatureData[Taxonomy]' \
    --source-format HeaderlessTSVTaxonomyFormat \
    --input-path $taxonomy --output-path ref-taxonomy-99.qza

  # Extract sequences based on primers
  qiime feature-classifier extract-reads \
    --i-sequences ref-seq-99.qza \
    --p-f-primer GTGCCAGCMGCCGCGGTAA \
    --p-r-primer GGACTACHVGGGTWTCTAAT \
    --o-reads GTGCCAGCMGCCGCGGTAA-GGACTACHVGGGTWTCTAAT-99-ref-seq.qza \
    --quiet

  # Train classifier
  qiime feature-classifier fit-classifier-naive-bayes \
    --i-reference-reads GTGCCAGCMGCCGCGGTAA-GGACTACHVGGGTWTCTAAT-99-ref-seq.qza \
    --i-reference-taxonomy ref-taxonomy-99.qza \
    --o-classifier GTGCCAGCMGCCGCGGTAA-GGACTACHVGGGTWTCTAAT-99-classifier.qza \
    --quiet

Command exit status:
  -

Command output:
  (empty)

Work dir:
  /home/nutrition_group/work/5a/dbc35fb6d02f6ba473c43b9b5a3d74

Tip: when you have fixed the problem you can continue the execution appending to the nextflow command line the option `-resume`

 -- Check '.nextflow.log' file for details
Execution cancelled -- Finishing pending tasks before exit

I resubmitted the workflow, skipping the taxonomic classification with --skip_taxonomy, but I got the same error.

d4straub commented 5 years ago

Skipping taxonomic classification should prevent this from happening, because with --skip_taxonomy this process shouldn't be executed at all. I'll look into that soon.

Since you seem to use the 515f/806r primer pair, you can download the QIIME2 classifier named "Silva 132 99% OTUs from 515F/806R region of sequences" here: https://docs.qiime2.org/2018.6/data-resources/. Please verify yourself that this is the right primer pair. Download the classifier, pass it to the pipeline with --classifier [path], and append -resume to your command so that you don't need to re-calculate all the earlier steps. Hope that helps.
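For example (a sketch; copy the exact download link from the data-resources page above, this URL is only illustrative):

# download the pre-trained 515F/806R SILVA 132 classifier
wget https://data.qiime2.org/2018.6/common/silva-132-99-515-806-nb-classifier.qza

# re-run with --classifier and -resume, keeping your other parameters as before
nextflow run -r 1.0.0 -resume nf-core/ampliseq \
  -profile standard,docker \
  --classifier "silva-132-99-515-806-nb-classifier.qza"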

I am a little surprised that nextflow enforces the 35 GB specified in conf/base.config; I'll look into that as well, since in some cases this process actually takes significantly less memory.
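If the reservation itself turns out to be too generous, it can also be overridden per process with a small custom config passed via -c custom.config (a sketch; the process name is taken from the error message above, and whether a lower value actually suffices depends on your data):

// custom.config -- lower the memory requested for classifier training
process {
    withName: make_SILVA_132_16S_classifier {
        memory = '30.GB'
    }
}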

d4straub commented 5 years ago

I tested the pipeline on my laptop (4 CPUs, 16 GB RAM) with the following parameters: -profile standard,singularity --reads "data" --FW_primer GTGYCAGCMGCCGCGGTAA --RV_primer GGACTACNVGGGTWTCTAAT --metadata "Metadata.tsv" --classifier "GTGYCAGCMGCCGCGGTAA-GGACTACNVGGGTWTCTAAT-99-classifier.qza", which resulted in the error you describe, but for another process (dada_single).

I resumed the pipeline with the parameters -profile standard,singularity --reads "data" --FW_primer GTGYCAGCMGCCGCGGTAA --RV_primer GGACTACNVGGGTWTCTAAT --metadata "Metadata.tsv" --classifier "GTGYCAGCMGCCGCGGTAA-GGACTACNVGGGTWTCTAAT-99-classifier.qza" --max_cpus 4 --max_memory '15.GB' -resume, and that run finished successfully.

I can't find any bug or unintended difficulty here.

Training the SILVA 132 classifier requires ~33 GB of memory; there is no way around that. The alternatives are to use a pre-trained classifier (it has to be trained with the same primers as used in the PCR), as explained above, or to train a smaller one (using the undocumented and not recommended hidden parameter --dereplication 97, or --dereplication 94 for an even smaller one).

edit: with 30 GB of memory you will be limited in the number of samples and reads in your analysis. The maximum memory I have encountered so far was 63 GB, for classification of ~100 highly diverse samples. Just to clarify that.

yanxianl commented 5 years ago

Hi, I used the pre-trained SILVA132 classifier from the QIIME2 website and proceeded with the taxonomic classification:

nextflow run -r 1.0.0 -resume nf-core/ampliseq \
  -profile standard,docker \
  --reads '/home/nutrition_group/desktop/data/Yanxian/misc/beta-conglycinin/16s/qiime2/casava-18-paired-end-demultiplexed' \
  --FW_primer GTGCCAGCMGCCGCGGTAA \
  --RV_primer GGACTACHVGGGTWTCTAAT \
  --trunclenf 239 \
  --trunclenr 230 \
  --classifier "/home/nutrition_group/desktop/data/Yanxian/misc/beta-conglycinin/16s/nextflow/silva-132-99-515-806-nb-classifier.qza" \
  --retain_untrimmed \
  --metadata "/home/nutrition_group/desktop/data/Yanxian/misc/beta-conglycinin/16s/qiime2/metadata.tsv"\
  --metadata_category "Diet" \
  --exclude_taxa "mitochondria,chloroplast" \
  --outdir "/home/nutrition_group/desktop/data/Yanxian/misc/beta-conglycinin/16s/nextflow/"

However, the taxonomic classification task was killed as it exceeded the maximum run time.

ERROR ~ Error executing process > 'classifier (1)'

Caused by:
  Process exceeded running time limit (2h)

Command executed:

  qiime feature-classifier classify-sklearn \
    --i-classifier silva-132-99-515-806-nb-classifier.qza \
    --p-n-jobs 8 \
    --i-reads rep-seqs.qza \
    --o-classification taxonomy.qza \
    --verbose

  qiime metadata tabulate \
    --m-input-file taxonomy.qza \
    --o-visualization taxonomy.qzv \
    --verbose

  # produce "taxonomy/taxonomy.tsv"
  qiime tools export taxonomy.qza --output-dir taxonomy
  qiime tools export taxonomy.qzv --output-dir taxonomy

Command exit status:
  -

Command output:
  (empty)

Command error:
  WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.

Work dir:
  /home/nutrition_group/work/ad/bae28755d33bbf2e878cc7ab37d0df

A similar issue regarding the memory usage problem with the SILVA classifier was also reported on the QIIME2 forum. To find out whether the system memory is sufficient for the task, I trained the SILVA classifier and used it for taxonomic classification within QIIME2-2018.11 on the same workstation where I ran nextflow. All the commands below worked without problems: training the classifier took 1 h 41 min and the taxonomic classification took 17 min. The rep-seqs-dada2.qza contains 1399 unique representative sequences. As you've pointed out, it shouldn't require massive memory or take more than 2 hours to finish. I can't figure out why it's a problem only when I use the nextflow pipeline.

# Download silva132 database
wget https://www.arb-silva.de/fileadmin/silva_databases/qiime/Silva_132_release.zip
unzip Silva_132_release.zip -d silva_132

# Import reference sequence and taxonomy to train the feature-classifier
qiime tools import \
--type 'FeatureData[Sequence]' \
--input-path silva_132/SILVA_132_QIIME_release/rep_set/rep_set_16S_only/99/silva_132_99_16S.fna \
--output-path 99-otus-silva.qza

qiime tools import \
--type 'FeatureData[Taxonomy]' \
--input-format HeaderlessTSVTaxonomyFormat \
--input-path silva_132/SILVA_132_QIIME_release/taxonomy/16S_only/99/consensus_taxonomy_7_levels.txt \
--output-path ref-taxonomy-silva.qza

# Extract V4 reference reads
qiime feature-classifier extract-reads \
--i-sequences 99-otus-silva.qza \
--p-f-primer GTGCCAGCMGCCGCGGTAA \
--p-r-primer GGACTACHVGGGTWTCTAAT \
--o-reads ref-seqs-silva.qza

# Train the classifier
qiime feature-classifier fit-classifier-naive-bayes \
--i-reference-reads ref-seqs-silva.qza \
--i-reference-taxonomy ref-taxonomy-silva.qza \
--o-classifier silva132-99otus-515-806-classifier.qza
Saved TaxonomicClassifier to: silva132-99otus-515-806-classifier.qza

# Assign taxonomy using the trained feature-classifier
qiime feature-classifier classify-sklearn \
--i-classifier silva132-99otus-515-806-classifier.qza \
--i-reads rep-seqs-dada2.qza \
--o-classification taxonomy-silva.qza
Saved FeatureData[Taxonomy] to: taxonomy-silva.qza

d4straub commented 5 years ago

Could you try the former nextflow command with singularity instead of docker? I hope that gives better results.

Training the classifier should always need ~33 GB of RAM; maybe your latter commands succeeded because it's an edge case and swapping helped.

yanxianl commented 5 years ago

Hi, I tried the former command again using the Docker image, and it worked all the way until beta-diversity, which failed with a similar error message:

ERROR ~ Error executing process > 'beta_diversity (weighted_unifrac_distance_matrix)'

Caused by:
  Missing output file(s) `beta-diversity/*` expected by process `beta_diversity (weighted_unifrac_distance_matrix)`

Command executed:

  IFS=',' read -r -a metacategory <<< ""

  for j in "${metacategory[@]}"
  do
    qiime diversity beta-group-significance \
      --i-distance-matrix weighted_unifrac_distance_matrix.qza \
      --m-metadata-file metadata.tsv \
      --m-metadata-column "$j" \
      --o-visualization weighted_unifrac_distance_matrix-$j.qzv \
      --p-pairwise

    qiime tools export weighted_unifrac_distance_matrix-$j.qzv \
      --output-dir beta-diversity/weighted_unifrac_distance_matrix-$j
  done

Command exit status:
  0

Command output:
  (empty)

Command error:
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.

Work dir:
  /home/nutrition_group/work/f8/851397a1b7645fd84567dd3b2fa6ab

I also ran the former command block using the singularity image but I got a different error:

WARN: 
Singularity cache directory has not been defined 
-- Remote image will be stored in the path: /home/nutrition_group/work/singularity
Pulling Singularity image docker://nfcore/ampliseq:1.0.0 [cache /home/nutrition_group/work/singularity/nfcore-ampliseq-1.0.0.img]
[09/6a10c1] Submitted process > metadata_category_pairwise (1)
[86/80c650] Submitted process > fastqc (BCON1_S1)
[58/9cd210] Submitted process > output_documentation (1)
[7e/1e45e4] Submitted process > fastqc (BCON2_S2)
ERROR ~ Error executing process > 'metadata_category_pairwise (1)'

Caused by:
  Process `metadata_category_pairwise (1)` terminated with an error exit status (1)

Command executed:

  metadataCategoryPairwise.r metadata.tsv

Command exit status:
  1

Command output:
  (empty)

Command error:
  Error in file(file, "rt") : cannot open the connection
  Calls: read.delim -> read.table -> file
  In addition: Warning message:
  In file(file, "rt") :
    cannot open file 'metadata.tsv': No such file or directory
  Execution halted

Work dir:
  /home/nutrition_group/work/09/6a10c1dc7f8e4877da55ae7645a06b

Tip: when you have fixed the problem you can continue the execution appending to the nextflow command line the option `-resume`

 -- Check '.nextflow.log' file for details
Execution cancelled -- Finishing pending tasks before exit
[86/80c650] NOTE: Missing output file(s) `*_fastqc.{zip,html}` expected by process `fastqc (BCON1_S1)` -- Error is ignored
[7e/1e45e4] NOTE: Missing output file(s) `*_fastqc.{zip,html}` expected by process `fastqc (BCON2_S2)` -- Error is ignored
d4straub commented 5 years ago

The Docker container indeed seems to have problems on your machine. @apeltzer, are we going to attempt to fix this, or should we highlight somewhere that only Singularity is supported?

With Singularity, the metadata file is simply not found. Could you verify the path? I have already implemented an early check for file existence for a future release.
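A quick sanity check from the shell, using the path passed via --metadata in the command above:

# does the file exist and is it readable?
ls -l "/home/nutrition_group/desktop/data/Yanxian/misc/beta-conglycinin/16s/qiime2/metadata.tsv"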

yanxianl commented 5 years ago

Hi, the metadata file path was correct. Anyway, I changed the file path and ran the workflow using the Singularity image:

nextflow run -r 1.0.0 nf-core/ampliseq \
  -resume \
  -profile standard,singularity \
  --reads '/home/nutrition_group/desktop/data/Yanxian/misc/beta-conglycinin/16s/qiime2/casava-18-paired-end-demultiplexed' \
  --FW_primer GTGCCAGCMGCCGCGGTAA \
  --RV_primer GGACTACHVGGGTWTCTAAT \
  --trunclenf 239 \
  --trunclenr 230 \
  --classifier "/home/nutrition_group/desktop/data/Yanxian/misc/beta-conglycinin/16s/nextflow/silva-132-99-515-806-nb-classifier.qza" \
  --retain_untrimmed \
  --metadata "/home/nutrition_group/desktop/data/Yanxian/misc/beta-conglycinin/16s/nextflow-singularity/metadata.tsv"\
  --metadata_category "Diet" \
  --exclude_taxa "mitochondria,chloroplast" \
  --outdir "/home/nutrition_group/desktop/data/Yanxian/misc/beta-conglycinin/16s/nextflow-singularity/"

Here's another error:

[b5/641ce4] NOTE: Missing output file(s) `*_fastqc.{zip,html}` expected by process `fastqc (BCON1_S1)` -- Error is ignored
[6c/a0af28] Submitted process > trimming (BCON3_S3)
ERROR ~ Error executing process > 'trimming (CONT3_S6)'

Caused by:
  Process `trimming (CONT3_S6)` terminated with an error exit status (2)

Command executed:

  mkdir -p trimmed
  cutadapt -g GTGCCAGCMGCCGCGGTAA -G GGACTACHVGGGTWTCTAAT \
    -o trimmed/CONT3_S6_L001_R1_001.fastq.gz -p trimmed/CONT3_S6_L001_R2_001.fastq.gz \
    CONT3_S6_L001_R1_001.fastq.gz CONT3_S6_L001_R2_001.fastq.gz \
    2> cutadapt_log_CONT3_S6_L001_R1_001.fastq.txt

Command exit status:
  2

Command output:
  (empty)

Work dir:
  /home/nutrition_group/work/91/33bc3e5e3b0dc2039bbd72011da292

Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`

 -- Check '.nextflow.log' file for details
Execution cancelled -- Finishing pending tasks before exit
[3e/4e63ee] NOTE: Missing output file(s) `*_fastqc.{zip,html}` expected by process `fastqc (CONT2_S5)` -- Error is ignored
d4straub commented 5 years ago

I see two possible problems with your command:

(1) Note the trailing "/": change
--outdir "/home/nutrition_group/desktop/data/Yanxian/misc/beta-conglycinin/16s/nextflow-singularity/"
to
--outdir "/home/nutrition_group/desktop/data/Yanxian/misc/beta-conglycinin/16s/nextflow-singularity/results"

(2) Note the missing space before the trailing "\": change
--metadata "/home/nutrition_group/desktop/data/Yanxian/misc/beta-conglycinin/16s/nextflow-singularity/metadata.tsv"\
to
--metadata "/home/nutrition_group/desktop/data/Yanxian/misc/beta-conglycinin/16s/nextflow-singularity/metadata.tsv" \

Please also make sure your spelling is correct.

yanxianl commented 5 years ago

Hi, the error in computing the beta-diversity with the Docker image was due to a wrong metadata format: my metadata was made for QIIME2-2018.11. After reformatting the metadata, the pipeline finished without any problems.

I also followed the instructions in the Docker documentation to fix the warning message: WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
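For reference, the fix documented there for Ubuntu/GRUB systems is roughly (a sketch; requires a reboot):

# in /etc/default/grub, set:
#   GRUB_CMDLINE_LINUX="cgroup_enable=memory swapaccount=1"

# then regenerate the GRUB configuration and reboot
sudo update-grub
sudo reboot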

I tried to repeat the workflow using the Singularity image, but without success. I'll open a new issue.

ps: The results produced by ampliseq are just fabulous! Thank you so much for your time and patience!

d4straub commented 5 years ago

Good to hear that it worked out so far. Feel free to open a new issue when needed.

gabyrech commented 5 years ago

Hi @yanxianl! Could you please tell me what your error in the metadata file was? I am getting the same error with the beta-diversity calculation. Thanks!

yanxianl commented 5 years ago

Hi, my metadata was made for QIIME2-2018.11, which is not compatible with QIIME2-2018.6, the version used by the current nextflow pipeline. I reformatted the metadata according to the QIIME2-2018.6 documentation and it worked. Yanxian
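For anyone hitting the same problem, a minimal tab-separated metadata file in the classic #SampleID style, which QIIME2-2018.6 accepts, looks roughly like this (a sketch; the sample IDs are taken from the logs above, the category values are invented for illustration):

#SampleID	Diet	Compartment
BCON1	beta-conglycinin	gut
CONT3	control	gut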