alesssia / YAMP

YAMP: Yet Another Metagenomic Pipeline
GNU General Public License v3.0

Error on running test #21

Closed ParsaGhadermazi closed 3 years ago

ParsaGhadermazi commented 3 years ago

Hi, I tried to run the following on a cluster, but I keep getting errors when it comes to alpha diversity. I'm using singularity:

./nextflow run YAMP.nf -profile test,singularity

Error executing process > 'alpha_diversity (test)'

Caused by:
  Process `alpha_diversity (test)` terminated with an error exit status (1)

Command executed:

  #It checks if the profiling was successful, that is if identifies at least three species
  n=$(grep -o s__ test.biom | wc -l  | cut -d" " -f 1)
  if (( n <= 3 )); then
        #The file should be created in order to be returned
        touch test_alpha_diversity.tsv
  else
        echo test > test_alpha_diversity.tsv
        qiime tools import --input-path test.biom --type 'FeatureTable[Frequency]' --input-format BIOMV100Format --output-path test_abundance_table.qza
        for alpha in ace berger_parker_d brillouin_d chao1 chao1_ci dominance doubles enspie esty_ci fisher_alpha gini_index goods_coverage heip_e kempton_taylor_q lladser_pe margalef mcintosh_d mcintosh_e menhinick michaelis_menten_fit osd pielou_e robbins shannon simpson simpson_e singles strong
        do
                qiime diversity alpha --i-table test_abundance_table.qza --p-metric $alpha --output-dir $alpha &> /dev/null
                qiime tools export --input-path $alpha/alpha_diversity.qza --output-path ${alpha} &> /dev/null
                value=$(sed -n '2p' ${alpha}/alpha-diversity.tsv | cut -f 2)
            echo -e  $alpha'    '$value
        done >> test_alpha_diversity.tsv
  fi

  # MultiQC doesn't have a module for qiime yet. As a consequence, I
  # had to create a YAML file with all the info I need via a bash script
  bash generate_alpha_diversity_log.sh ${n} > alpha_diversity_mqc.yaml

Command exit status:
  1

Command output:
  Imported test.biom as BIOMV100Format to test_abundance_table.qza

Command wrapper:
  Imported test.biom as BIOMV100Format to test_abundance_table.qza

Work dir:
  /projects/$USER/YAMP/work/9c/1550a194e28e073cc00ae70983fd2b

Tip: you can replicate the issue by changing to the process work dir and entering the command bash .command.run

I noticed that there is a closed issue about this, but none of the solutions discussed there worked for me.

If I run bash .command.run outside of any container I'll get the following:

.command.sh: line 9: qiime: command not found
sed: can't read ace/alpha-diversity.tsv: No such file or directory
sed: can't read berger_parker_d/alpha-diversity.tsv: No such file or directory
sed: can't read brillouin_d/alpha-diversity.tsv: No such file or directory
sed: can't read chao1/alpha-diversity.tsv: No such file or directory
sed: can't read chao1_ci/alpha-diversity.tsv: No such file or directory
sed: can't read dominance/alpha-diversity.tsv: No such file or directory
sed: can't read doubles/alpha-diversity.tsv: No such file or directory
sed: can't read enspie/alpha-diversity.tsv: No such file or directory
sed: can't read esty_ci/alpha-diversity.tsv: No such file or directory
sed: can't read fisher_alpha/alpha-diversity.tsv: No such file or directory
sed: can't read gini_index/alpha-diversity.tsv: No such file or directory
sed: can't read goods_coverage/alpha-diversity.tsv: No such file or directory
sed: can't read heip_e/alpha-diversity.tsv: No such file or directory
sed: can't read kempton_taylor_q/alpha-diversity.tsv: No such file or directory
sed: can't read lladser_pe/alpha-diversity.tsv: No such file or directory
sed: can't read margalef/alpha-diversity.tsv: No such file or directory
sed: can't read mcintosh_d/alpha-diversity.tsv: No such file or directory
sed: can't read mcintosh_e/alpha-diversity.tsv: No such file or directory
sed: can't read menhinick/alpha-diversity.tsv: No such file or directory
sed: can't read michaelis_menten_fit/alpha-diversity.tsv: No such file or directory
sed: can't read osd/alpha-diversity.tsv: No such file or directory
sed: can't read pielou_e/alpha-diversity.tsv: No such file or directory
sed: can't read robbins/alpha-diversity.tsv: No such file or directory
sed: can't read shannon/alpha-diversity.tsv: No such file or directory
sed: can't read simpson/alpha-diversity.tsv: No such file or directory
sed: can't read simpson_e/alpha-diversity.tsv: No such file or directory
sed: can't read singles/alpha-diversity.tsv: No such file or directory
sed: can't read strong/alpha-diversity.tsv: No such file or directory
bash: generate_alpha_diversity_log.sh: No such file or directory

If I run this in qiime's singularity image I get:

Imported test.biom as BIOMV100Format to test_abundance_table.qza
sed: can't read ace/alpha-diversity.tsv: No such file or directory
sed: can't read berger_parker_d/alpha-diversity.tsv: No such file or directory
sed: can't read brillouin_d/alpha-diversity.tsv: No such file or directory
sed: can't read chao1/alpha-diversity.tsv: No such file or directory
sed: can't read chao1_ci/alpha-diversity.tsv: No such file or directory
sed: can't read dominance/alpha-diversity.tsv: No such file or directory
sed: can't read doubles/alpha-diversity.tsv: No such file or directory
sed: can't read enspie/alpha-diversity.tsv: No such file or directory
sed: can't read esty_ci/alpha-diversity.tsv: No such file or directory
sed: can't read fisher_alpha/alpha-diversity.tsv: No such file or directory
sed: can't read gini_index/alpha-diversity.tsv: No such file or directory
sed: can't read goods_coverage/alpha-diversity.tsv: No such file or directory
sed: can't read heip_e/alpha-diversity.tsv: No such file or directory
sed: can't read kempton_taylor_q/alpha-diversity.tsv: No such file or directory
sed: can't read lladser_pe/alpha-diversity.tsv: No such file or directory
sed: can't read margalef/alpha-diversity.tsv: No such file or directory
sed: can't read mcintosh_d/alpha-diversity.tsv: No such file or directory
sed: can't read mcintosh_e/alpha-diversity.tsv: No such file or directory
sed: can't read menhinick/alpha-diversity.tsv: No such file or directory
sed: can't read michaelis_menten_fit/alpha-diversity.tsv: No such file or directory
sed: can't read osd/alpha-diversity.tsv: No such file or directory
sed: can't read pielou_e/alpha-diversity.tsv: No such file or directory
sed: can't read robbins/alpha-diversity.tsv: No such file or directory
sed: can't read shannon/alpha-diversity.tsv: No such file or directory
sed: can't read simpson/alpha-diversity.tsv: No such file or directory
sed: can't read simpson_e/alpha-diversity.tsv: No such file or directory
sed: can't read singles/alpha-diversity.tsv: No such file or directory
sed: can't read strong/alpha-diversity.tsv: No such file or directory
/curc/sw/lmod/lmod/init/bash: line 82: /bin/tr: No such file or directory
bash: generate_alpha_diversity_log.sh: No such file or directory

Could you please help me solve this? Thanks, Parsa

alesssia commented 3 years ago

Hi @ParsaGhadermazi ,

What I think is happening (I have never seen this issue before) is that qiime2 is not calculating and/or exporting the alpha diversity measures.

Could you please edit the following lines in YAMP.nf (alpha_diversity process, lines 856-7):

qiime diversity alpha --i-table ${name}_abundance_table.qza --p-metric \$alpha --output-dir \$alpha &> /dev/null
qiime tools export --input-path \$alpha/alpha_diversity.qza --output-path \${alpha} &> /dev/null

by deleting the output redirection (&> /dev/null). This will allow the qiime2 error message to be printed on the shell.

To avoid having the same message repeated once for each measure, you could even edit lines 854-860 as:

for alpha in ace 
do
  qiime diversity alpha --i-table ${name}_abundance_table.qza --p-metric \$alpha --output-dir \$alpha 
  qiime tools export --input-path \$alpha/alpha_diversity.qza --output-path \${alpha} 
  value=\$(sed -n '2p' \${alpha}/alpha-diversity.tsv | cut -f 2)
  echo -e  \$alpha'\t'\$value 
done >> ${name}_alpha_diversity.tsv

ParsaGhadermazi commented 3 years ago

I made the changes, and the following was added to the previous error:

Command output:
  Imported test.biom as BIOMV100Format to test_abundance_table.qza

Command error:
  Traceback (most recent call last):
    File "/opt/conda/envs/qiime2-2020.8/bin/qiime", line 11, in <module>
      sys.exit(qiime())
    File "/opt/conda/envs/qiime2-2020.8/lib/python3.6/site-packages/click/core.py", line 829, in __call__
      return self.main(*args, **kwargs)
    File "/opt/conda/envs/qiime2-2020.8/lib/python3.6/site-packages/click/core.py", line 782, in main
      rv = self.invoke(ctx)
    File "/opt/conda/envs/qiime2-2020.8/lib/python3.6/site-packages/click/core.py", line 1254, in invoke
      cmd_name, cmd, args = self.resolve_command(ctx, args)
    File "/opt/conda/envs/qiime2-2020.8/lib/python3.6/site-packages/click/core.py", line 1297, in resolve_command
      cmd = self.get_command(ctx, cmd_name)
    File "/opt/conda/envs/qiime2-2020.8/lib/python3.6/site-packages/q2cli/commands.py", line 100, in get_command
      plugin = self._plugin_lookup[name]
    File "/opt/conda/envs/qiime2-2020.8/lib/python3.6/site-packages/q2cli/commands.py", line 76, in _plugin_lookup
      import q2cli.core.cache
    File "/opt/conda/envs/qiime2-2020.8/lib/python3.6/site-packages/q2cli/core/cache.py", line 406, in <module>
      CACHE = DeploymentCache()
    File "/opt/conda/envs/qiime2-2020.8/lib/python3.6/site-packages/q2cli/core/cache.py", line 58, in __init__
      self._cache_dir = self._get_cache_dir()
    File "/opt/conda/envs/qiime2-2020.8/lib/python3.6/site-packages/q2cli/core/cache.py", line 85, in _get_cache_dir
      os.makedirs(cache_dir, exist_ok=True)
    File "/opt/conda/envs/qiime2-2020.8/lib/python3.6/os.py", line 210, in makedirs
      makedirs(head, mode, exist_ok)
    File "/opt/conda/envs/qiime2-2020.8/lib/python3.6/os.py", line 210, in makedirs
      makedirs(head, mode, exist_ok)
    File "/opt/conda/envs/qiime2-2020.8/lib/python3.6/os.py", line 220, in makedirs
      mkdir(name, mode)
  PermissionError: [Errno 13] Permission denied: '/home/qiime2'

alesssia commented 3 years ago

It seems you are having a permission issue -- which I agree is weird, since only qiime2 seems to trigger it, and only when the alpha diversity is evaluated (qiime tools import --input-path test.biom --type 'FeatureTable[Frequency]' --input-format BIOMV100Format --output-path test_abundance_table.qza works fine).

What happens if you run, from the YAMP qiime working dir (/projects/$USER/YAMP/work/??/??) and in qiime's singularity image, the following:

alpha=ace
qiime diversity alpha --i-table ${name}_abundance_table.qza --p-metric $alpha --output-dir $alpha

and, if this works, what happens if you run:

qiime tools export --input-path $alpha/alpha_diversity.qza --output-path ${alpha}

ParsaGhadermazi commented 3 years ago

In the first case, I see exactly the same permission-denied error:

Singularity> qiime diversity alpha --i-table ${name}_abundance_table.qza --p-metric $alpha --output-dir $alpha
Traceback (most recent call last):
  File "/opt/conda/envs/qiime2-2020.8/bin/qiime", line 11, in <module>
    sys.exit(qiime())
  File "/opt/conda/envs/qiime2-2020.8/lib/python3.6/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/opt/conda/envs/qiime2-2020.8/lib/python3.6/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/opt/conda/envs/qiime2-2020.8/lib/python3.6/site-packages/click/core.py", line 1254, in invoke
    cmd_name, cmd, args = self.resolve_command(ctx, args)
  File "/opt/conda/envs/qiime2-2020.8/lib/python3.6/site-packages/click/core.py", line 1297, in resolve_command
    cmd = self.get_command(ctx, cmd_name)
  File "/opt/conda/envs/qiime2-2020.8/lib/python3.6/site-packages/q2cli/commands.py", line 100, in get_command
    plugin = self._plugin_lookup[name]
  File "/opt/conda/envs/qiime2-2020.8/lib/python3.6/site-packages/q2cli/commands.py", line 76, in _plugin_lookup
    import q2cli.core.cache
  File "/opt/conda/envs/qiime2-2020.8/lib/python3.6/site-packages/q2cli/core/cache.py", line 406, in <module>
    CACHE = DeploymentCache()
  File "/opt/conda/envs/qiime2-2020.8/lib/python3.6/site-packages/q2cli/core/cache.py", line 58, in __init__
    self._cache_dir = self._get_cache_dir()
  File "/opt/conda/envs/qiime2-2020.8/lib/python3.6/site-packages/q2cli/core/cache.py", line 85, in _get_cache_dir
    os.makedirs(cache_dir, exist_ok=True)
  File "/opt/conda/envs/qiime2-2020.8/lib/python3.6/os.py", line 210, in makedirs
    makedirs(head, mode, exist_ok)
  File "/opt/conda/envs/qiime2-2020.8/lib/python3.6/os.py", line 210, in makedirs
    makedirs(head, mode, exist_ok)
  File "/opt/conda/envs/qiime2-2020.8/lib/python3.6/os.py", line 220, in makedirs
    mkdir(name, mode)
PermissionError: [Errno 13] Permission denied: '/home/qiime2'

if I enter:

qiime tools export --input-path $alpha/alpha_diversity.qza --output-path ${alpha}

I'd get:

Singularity> qiime tools export --input-path $alpha/alpha_diversity.qza --output-path ${alpha}
Usage: qiime tools export [OPTIONS]

  Exporting extracts (and optionally transforms) data stored inside an
  Artifact or Visualization. Note that Visualizations cannot be transformed
  with --output-format

Options:
  --input-path ARTIFACT/VISUALIZATION
                        Path to file that should be exported        [required]
  --output-path PATH    Path to file or directory where data should be
                        exported to                                 [required]
  --output-format TEXT  Format which the data should be exported as. This
                        option cannot be used with Visualizations
  --help                Show this message and exit.

                    There was a problem with the command:
 (1/1) Invalid value for '--input-path': File 'ace/alpha_diversity.qza' does
  not exist.

alesssia commented 3 years ago

So the error is generated when qiime tries to write to disk. Yet the first qiime command, which converts the MetaPhlAn biom file, writes just fine. This happens both inside and outside YAMP, so I would say it is not due to YAMP but a problem with qiime, singularity and/or your system. I would try posting on the qiime forum for some guidance. May I ask you to post the solution here once you find it? Many thanks!
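One way to narrow this down is to check, from inside the container, whether the directory qiime wants to create could be created at all. The traceback above ends in mkdir failing on /home/qiime2, so the question is whether the nearest existing parent directory is writable. A minimal sketch follows; the function names and the example path are hypothetical, not part of YAMP or qiime2:

```shell
# Walk up from a path until an existing directory is found.
nearest_existing_dir() {
  dir=$1
  while [ ! -d "$dir" ]; do
    dir=$(dirname "$dir")
  done
  printf '%s\n' "$dir"
}

# Report whether a (possibly missing) directory could be created,
# judged by write permission on its nearest existing ancestor.
check_cache_parent() {
  target=$1
  base=$(nearest_existing_dir "$target")
  if [ -w "$base" ]; then
    echo "ok: $target can be created under $base"
  else
    echo "blocked: $base is not writable, so $target cannot be created"
  fi
}

# Example path only; inside the container one would pass the path from the traceback.
check_cache_parent "${HOME:-/}/.cache-check-example"
```

Running check_cache_parent /home/qiime2 inside the singularity shell should show whether the failure is a plain file-system permission problem rather than anything specific to the alpha-diversity step.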

alesssia commented 3 years ago

[Answering whether this process can be temporarily skipped]

I would change line 848 to

if (( n > -1 )); then

This if statement already skips the alpha-diversity calculation when too few species are found (the n <= 3 check), so with the edit above it will be skipped regardless of the number of species. It is very rough, but I tested it and it worked. Let me know!

ParsaGhadermazi commented 3 years ago

Hi @alesssia. Thank you very much for the suggestion. This worked for the test dataset. When I moved to my own data, I received the following in the decontamination step:

Caused by:
  Process `decontaminate (test_Satya)` terminated with an error exit status (1)

Command executed:

  #Sets the maximum memory to the value requested in the config file
  maxmem=$(echo null | sed 's/ //g' | sed 's/B//g')

  bbwrap.sh -Xmx"$maxmem"  mapper=bbmap append=t in1="test_Satya_trimmed_R1.fq.gz","test_Satya_trimmed_singletons.fq.gz" in2="test_Satya_trimmed_R2.fq.gz",null outu="test_Satya_QCd.fq.gz" outm="test_Satya_contamination.fq.gz" minid=0.95 maxindel=3 bwr=0.16 bw=12 minhits=2 qtrim=rl trimq=10 path="./" qin=33 threads=1 untrim quickmatch fast ow &> decontamination_mqc.txt

  # MultiQC doesn't have a module for bbwrap yet. As a consequence, I
  # had to create a YAML file with all the info I need via a bash script
  bash scrape_decontamination_log.sh > decontamination_mqc.yaml

Command exit status:
  1

Command output:
  (empty)

Command error:
  WARNING: Skipping mount /curc/sw/singularity/3.6.4/var/singularity/mnt/session/etc/resolv.conf [files]: /etc/resolv.conf doesn't exist in container

I first tried different caps for memory usage in decontamination, and all of them produced this error, so I decided to remove the limit entirely. That still didn't work. It doesn't look like an OOM problem at this point!

alesssia commented 3 years ago

Yes, this seems to be a container error -- yet the BBmap container was already used for the other QC steps without problems. What happens if you move into the work dir and run the instructions in .command.sh line by line (remembering to run bbwrap within the container)?

ParsaGhadermazi commented 3 years ago

I'm new to singularity, so I'm not 100% sure if this is exactly what you have in mind.

singularity shell ../../../depot.galaxyproject.org-singularity-bbmap-38.87--h1296035_0.img

The image starts with this warning, which I don't see with qiime's image at least. I don't know how relevant it is:

WARNING: Skipping mount /curc/sw/singularity/3.6.4/var/singularity/mnt/session/etc/resolv.conf [files]: /etc/resolv.conf doesn't exist in container

Then I run the commands in .command.sh one by one:

maxmem=$(echo 1.5 TB | sed 's/ //g' | sed 's/B//g')

Then

bbwrap.sh -Xmx"$maxmem"  mapper=bbmap append=t in1="test_Satya_trimmed_R1.fq.gz","test_Satya_trimmed_singletons.fq.gz" in2="test_Satya_trimmed_R2.fq.gz",null outu="test_Satya_QCd.fq.gz" outm="test_Satya_contamination.fq.gz" minid=0.95 maxindel=3 bwr=0.16 bw=12 minhits=2 qtrim=rl trimq=10 path="./" qin=33 threads=48 untrim quickmatch fast ow &> decontamination_mqc.txt

These first two steps go without any errors.

I then enter

bash scrape_decontamination_log.sh > decontamination_mqc.yaml

I get this:

bash: scrape_decontamination_log.sh: No such file or directory

alesssia commented 3 years ago

This is normal. scrape_decontamination_log.sh is stored in ./YAMP/bin. While Nextflow knows where to look for it, when you execute .command.sh line by line you need to provide the absolute/relative path.
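The two ways of making such a script reachable by hand can be sketched as follows. This is an illustration under assumptions: a temporary directory and a stub script stand in for the real YAMP/bin and scrape_decontamination_log.sh, and it relies on bash searching PATH for a script that is not in the current directory.

```shell
# Stand-in for ./YAMP/bin: a temporary directory holding a stub script.
# (Both are hypothetical; substitute the real path to YAMP/bin.)
yamp_bin=$(mktemp -d)
printf 'echo "stub scrape script ran"\n' > "$yamp_bin/scrape_stub.sh"

# Option 1: give bash the full path to the script.
bash "$yamp_bin/scrape_stub.sh"

# Option 2: put the directory on PATH for this one command, so the bare
# script name resolves the same way it does when Nextflow runs the task.
PATH="$yamp_bin:$PATH" bash scrape_stub.sh
```

Option 2 mirrors how the call is written in .command.sh, so the line can be pasted unchanged once PATH is set.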

Regarding the warning, have you tried contacting your IT team? They may be able to help you.

Just out of curiosity, why are you asking for 1.5 TB of memory? In my experience, I have never needed more than 128 GB, and that was for the functional characterisation step, which is the most demanding, and only on very rare occasions (>120M reads). All the other steps have always required 32 GB at most.

ParsaGhadermazi commented 3 years ago

Got it. So I did this:

singularity shell --bind <YAMP Directory> depot.galaxyproject.org-singularity-bbmap-38.87--h1296035_0.img

and then within the container the two first steps go well:

maxmem=$(echo 1.5 TB | sed 's/ //g' | sed 's/B//g')

bbwrap.sh -Xmx"$maxmem"  mapper=bbmap append=t in1="test_Satya_trimmed_R1.fq.gz","test_Satya_trimmed_singletons.fq.gz" in2="test_Satya_trimmed_R2.fq.gz",null outu="test_Satya_QCd.fq.gz" outm="test_Satya_contamination.fq.gz" minid=0.95 maxindel=3 bwr=0.16 bw=12 minhits=2 qtrim=rl trimq=10 path="./" qin=33 threads=48 untrim quickmatch fast ow &> decontamination_mqc.txt

The third step:

../../../bin/scrape_decontamination_log.sh > decontamination_mqc.yaml

This throws the following error :

/curc/sw/lmod/lmod/init/bash: line 82: /bin/tr: No such file or directory
gunzip: *_QCd.fq.gz: No such file or directory
gunzip: *_contamination.fq.gz: No such file or directory
awk: cmd. line:1: Division by zero

Regarding the memory limit: I have two big paired-end files, and the first time, with the base config, I received an OOM error. I have 2 TB available, so I increased the limits to a very large number so that I wouldn't have to worry about it. I think I played it too safe :)

That itself seems to be causing this issue! I opened the decontamination_mqc.txt file and it had the following in it:

Invalid maximum heap size: -Xmx1.5T
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.

I decreased it to 20GB, and it ran okay! Thanks a lot for your help!
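The rejected value -Xmx1.5T is what remains of "1.5 TB" after the two sed calls strip the space and the B. A minimal sketch of converting such a string to a whole number of megabytes instead, assuming the JVM wants an integer before the unit suffix (the log above shows the fractional value being rejected). The to_xmx helper is hypothetical and not part of YAMP:

```shell
# Convert a memory string such as "1.5 TB" or "20 GB" into a whole number
# of megabytes with an "m" suffix, suitable for passing after -Xmx.
# Exits non-zero when the unit is missing or not recognised (e.g. "null").
to_xmx() {
  echo "$1" | awk '{
    value = $1
    unit = toupper($2)
    if (unit == "TB") { mb = value * 1024 * 1024 }
    else if (unit == "GB") { mb = value * 1024 }
    else if (unit == "MB") { mb = value }
    else { exit 1 }
    printf "%dm\n", mb
  }'
}

to_xmx "1.5 TB"   # prints 1572864m
to_xmx "20 GB"    # prints 20480m
```

With a helper like this, a fractional setting in the config would still reach Java as an integer, although choosing a whole-number limit in the config, as done above, is the simpler fix.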

alesssia commented 3 years ago

Great! Can I close the issue?

ParsaGhadermazi commented 3 years ago

Yes, Thanks!

bmichanderson commented 2 years ago

> The error is then generated when qiime is trying to write to disk. Yet, the first qiime command, that converts the Metaphlan biom file, writes just fine. This is happening both inside and outside YAMP, so I would say this is not due to YAMP, but a problem of qiime, singularity and/or your system. I would try posting on the qiime forum for some guidance. May I ask you to post here the solution once you find it? Many thanks!

Hi, I know this is a closed issue, but I thought I'd add the solution that seemed to work for me. I think there can be issues with writing to user home directories on HPC systems with Singularity. I bound home to the current working directory and the errors went away. To do this, add this line to the singularity profile of your nextflow.config file:

singularity.runOptions = "-H $PWD"