hawaiidatascience / metaflowmics

C-MAIKI ITS and 16S pipelines
Apache License 2.0

problem in silva database download #6

Open Vikash84 opened 3 years ago

Vikash84 commented 3 years ago

Error executing process > 'pipeline_16S:DOWNLOAD_SILVA_FOR_MOTHUR (nr)'

Caused by: Process pipeline_16S:DOWNLOAD_SILVA_FOR_MOTHUR (nr) terminated with an error exit status (1)

Command executed:

wget https://mothur.s3.us-east-2.amazonaws.com/wiki/silva.nr_v138.tgz | tar xz

Command exit status: 1

Command output: (empty)

Command error: /bin/bash: line 1: cd: /home/vsingh/vdl/Goyal_Project_212/metaflowmics/metaflowmics/Pipeline-16S/work/97/81ca512e4c4dfef423187ded18c08f: No such file or directory /bin/bash: .command.sh: No such file or directory

Work dir: /home/vsingh/vdl/Goyal_Project_212/metaflowmics/metaflowmics/Pipeline-16S/work/97/81ca512e4c4dfef423187ded18c08f

Tip: you can replicate the issue by changing to the process work dir and entering the command bash .command.run
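For what it's worth, the command in this log pipes wget straight into tar without `-O-`; with GNU wget that saves the archive to a file and sends nothing to stdout, so tar reads empty input (the later logs in this thread use `wget -qO-`). The streaming pattern itself can be sanity-checked locally with a scratch archive instead of the real Silva download (all file names below are made up for the demo):

```shell
#!/bin/sh
set -e
# Build a scratch archive standing in for silva.nr_v138.tgz
mkdir -p src out
printf '>seq1\nACGT\n' > src/demo.align
tar czf demo.tgz -C src demo.align

# The pattern the download step relies on: archive streamed to stdout,
# extraction from stdin. With wget this is `wget -qO- URL | tar xz`;
# here cat stands in for the download so no network is needed.
cat demo.tgz | tar xz -C out
cat out/demo.align
```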

Puumanamana commented 3 years ago

I'm not sure what's happening. Did you delete the work directory while the pipeline was running? How did you run the pipeline exactly?

gadepallivs commented 10 months ago

I had a similar error when executing the following command:

nextflow run metaflowmics/pipeline-16S -profile local --reads fastq_data_metabolomics/*_R{1,2}*.fastq* --referenceAln databases/silva/v138/silva.nr_v138_1.align --referenceTax databases/silva/v138/silva.nr_v138_1.tax

It began well:

N E X T F L O W  ~  version 23.10.0
Launching `metaflowmics/pipeline-16S/main.nf` [disturbed_varahamihira] DSL2 - revision: f6a89bb7da

However, it later threw a similar error to the one above, except that it was looking for the .tax file. I have checked the databases folder, and I have both the silva.nr_v138_1.align and silva.nr_v138_1.tax files:

ERROR ~ Error executing process > 'pipeline_16S:MOTHUR:DOWNLOAD_SILVA_FOR_MOTHUR (silva.nr_v138_1)'

Caused by:
  Missing output file(s) `*.tax` expected by process `pipeline_16S:MOTHUR:DOWNLOAD_SILVA_FOR_MOTHUR (silva.nr_v138_1)`

Command executed:

  wget -qO- https://mothur.s3.us-east-2.amazonaws.com/wiki/silva.nr_v138_1.tgz | tar xz

Command exit status:
  0

Command output:
  (empty)

Command error:
  .command.sh: line 2: wget: command not found

Work dir:
  /Users/***/Library/CloudStorage/OneDrive-***/Box Data/project_folder/metabolomics/metaflowmics_pipeline/work/c2/dd04e7bd47c1d91e58e0360e0c0f28

Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named `.command.sh`

 -- Check '.nextflow.log' file for details
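The `wget: command not found` error is consistent with a stock macOS install (the Work dir above is under /Users), since macOS ships curl but not wget. One fix is `brew install wget`; another is a small wrapper that falls back to curl. A sketch of the latter (my suggestion, not something the pipeline ships):

```shell
#!/bin/sh
# Hypothetical helper: prefer wget, fall back to curl for streamed downloads.
cat > fetch.sh <<'EOF'
#!/bin/sh
url="$1"
if command -v wget >/dev/null 2>&1; then
    exec wget -qO- "$url"
else
    exec curl -sL "$url"
fi
EOF
chmod +x fetch.sh
# usage (the Silva URL from the log above):
#   ./fetch.sh https://mothur.s3.us-east-2.amazonaws.com/wiki/silva.nr_v138_1.tgz | tar xz
```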
gadepallivs commented 10 months ago

Also, the documentation notes: "For species assignments, the database is available on Zenodo (silva_species_assignment_v138.fa.gz)". How do I pass this file? When I downloaded and opened it, it's a .fa file; I'm not sure how to use it as part of the pipeline. I'd appreciate it if you could provide some insight. Thank you

dcarothomas commented 7 months ago

@gadepallivs Did you find any solution for that error? I have the same one

gadepallivs commented 7 months ago

> @gadepallivs Did you find any solution for that error? I have the same one

No, @dcarothomas, I could not find a solution.

Puumanamana commented 7 months ago

Hi @dcarothomas, @gadepallivs. First, I'm really sorry: I moved away from this project a few years ago and didn't see the issue you raised, @gadepallivs. My first comment would be that the documentation may not be fully up to date; I'll try to dedicate some time to updating it.

From what I see, dada2 species assignment is not included in the 16S pipeline -- only the mothur-assigned taxonomies, down to genus level. I might have some time to add that in the future, but for now you would have to do it manually using the pipeline outputs. It's also no longer necessary to download any reference files to run the pipeline (at least for the 16S pipeline); they should be downloaded automatically.

Finally, regarding your specific errors: they are probably due to running the pipeline locally without the docker profile and without the necessary packages installed. Do you have Docker set up on your local machine? If so, add the docker profile to your Nextflow command, for example to run the test data in the GitHub repo:

nextflow run metaflowmics/pipeline-16S -profile local,docker --reads "tests/16S/*_R{1,2}*.fastq*"

I just ran it on my machine and it worked.
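Before re-running with the docker profile, it may help to confirm Docker is installed and the daemon is reachable (standard Docker CLI checks, nothing pipeline-specific):

```shell
#!/bin/sh
# Write the result to a file so it is easy to check afterwards.
if command -v docker >/dev/null 2>&1 && docker info >/dev/null 2>&1; then
    echo "docker ready" > docker_check.txt
else
    echo "docker unavailable" > docker_check.txt
fi
cat docker_check.txt
```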

EDIT: if you get an error because the CPU requirement is not met on your machine, you can change that here and make sure all cpus statements are lower than or equal to the number of CPUs on your machine.
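The CPU cap mentioned in the EDIT above can also be applied without editing the repo, via Nextflow's standard `-c` custom-config mechanism (the file name and the value 2 below are my own choices; config settings override directives set in the pipeline scripts):

```shell
#!/bin/sh
# Cap every process at 2 CPUs via a custom config file.
cat > lowcpu.config <<'EOF'
process {
    cpus = 2
}
EOF
# Then pass it alongside the usual options:
#   nextflow run metaflowmics/pipeline-16S -profile local,docker -c lowcpu.config ...
```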