Closed · bioinfogaby closed this issue 1 year ago
Thanks for the report! As it happens, I just tested this myself because I also encountered that error. I found the following:
```bash
# works:
NXF_VER=21.10.3 nextflow run nf-core/ampliseq -r 2.3.0 -profile test,singularity --outdir results_test_2-3-0 -resume
NXF_VER=23.04.0 nextflow run nf-core/ampliseq -r 2.7.0 -profile test,singularity --outdir results_test_2-7-0 -resume
NXF_VER=23.04.4 nextflow run nf-core/ampliseq -r 2.7.0 -profile test,singularity --outdir results_test_2-7-0 -resume

# fails:
NXF_VER=23.10.0 nextflow run nf-core/ampliseq -r 2.7.0 -profile test,singularity --outdir results_test_2-7-0 -resume
```
Conclusion: Nextflow version 23.10.0 is somehow incompatible. Could you use one of the versions listed above and report back whether that solves your issue? E.g. prepend `NXF_VER=23.04.4` to your command when starting the pipeline.
Edit: The error I encountered was related to process `NFCORE_AMPLISEQ:AMPLISEQ:QIIME2_PREPTAX:QIIME2_EXTRACT (GTGYCAGCMGCCGCGGTAA-GGACTACNVGGGTWTCTAAT)` (but also `RuntimeError: cannot cache function 'rdist': no locator available for file '/opt/conda/envs/qiime2-2023.7/lib/python3.8/site-packages/umap/layouts.py'`), so I conclude the problem is QIIME2 again [it just makes trouble from time to time...].
Yeap, that worked. Thanks!
Nextflow 23.10 adds the `--no-home` option when using Singularity. Maybe this tool wanted to cache data under the home directory?
Hm quite possible. How to test that?
Correct?

> Nextflow 23.10 adds the `--no-home` option when using Singularity. Maybe this tool wanted to cache data under the home directory?
That is exactly what the error message states, no?

> Matplotlib created a temporary config/cache directory at /tmp/matplotlib-38xyr0wz because the default path (/home/qiime2/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.

In previous versions, `/home` was mounted inside the container and writable, so `/home/qiime2/matplotlib` could be used for caching.
I just wanted to confirm the problem on my laptop (where I have both Singularity and Docker installed, in order to later run with Docker) with

```bash
NXF_VER=23.10.0 nextflow run nf-core/ampliseq -r 2.7.0 -profile test,singularity --outdir results_test_2-7-0
```

and it succeeded. So it seems to differ between systems.
So I tried all 3 systems that I have available at the moment:

- Failing system (workstation)
- Succeeding system (laptop)
- Succeeding system (HPC)
> That is exactly what the error message states, no?
>
> > Matplotlib created a temporary config/cache directory at /tmp/matplotlib-38xyr0wz because the default path (/home/qiime2/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
>
> In previous versions, `/home` was mounted inside the container and writable, so `/home/qiime2/matplotlib` could be used for caching.
Great. In that case, I would imagine two workarounds:

- `export MPLCONFIGDIR=$PWD` to keep the cache local to the work directory, which is already mounted read+write
- make `MPLCONFIGDIR` configurable (and default to `$PWD`)

Thanks @muffato & @MatthiasZepper for your interest and suggestions!
I added `export MPLCONFIGDIR="${PWD}/HOME"` to all processes that use QIIME2, and indeed the part with the problematic MPLCONFIGDIR is solved; however, the process still fails. Matplotlib wasn't the root of the problem, it seems to me.
Here is the complete error message for `NXF_VER=23.10.0 nextflow run d4straub/ampliseq -r fix-NXF_VER=23.10.0 -profile test,singularity --outdir results`:
```
ERROR ~ Error executing process > 'NFCORE_AMPLISEQ:AMPLISEQ:QIIME2_PREPTAX:QIIME2_EXTRACT (GTGYCAGCMGCCGCGGTAA-GGACTACNVGGGTWTCTAAT)'

Caused by:
  Process `NFCORE_AMPLISEQ:AMPLISEQ:QIIME2_PREPTAX:QIIME2_EXTRACT (GTGYCAGCMGCCGCGGTAA-GGACTACNVGGGTWTCTAAT)` terminated with an error exit status (1)

Command executed:

  export XDG_CONFIG_HOME="${PWD}/HOME"
  export MPLCONFIGDIR="${PWD}/HOME"

  ### Import
  qiime tools import \
      --type 'FeatureData[Sequence]' \
      --input-path greengenes85.fna \
      --output-path ref-seq.qza
  qiime tools import \
      --type 'FeatureData[Taxonomy]' \
      --input-format HeaderlessTSVTaxonomyFormat \
      --input-path greengenes85.tax \
      --output-path ref-taxonomy.qza

  #Extract sequences based on primers
  qiime feature-classifier extract-reads \
      --i-sequences ref-seq.qza \
      --p-f-primer GTGYCAGCMGCCGCGGTAA \
      --p-r-primer GGACTACNVGGGTWTCTAAT \
      --o-reads GTGYCAGCMGCCGCGGTAA-GGACTACNVGGGTWTCTAAT-ref-seq.qza \
      --quiet

  cat <<-END_VERSIONS > versions.yml
  "NFCORE_AMPLISEQ:AMPLISEQ:QIIME2_PREPTAX:QIIME2_EXTRACT":
      qiime2: $( qiime --version | sed '1!d;s/.* //' )
  END_VERSIONS

Command exit status:
  1

Command output:
  (empty)

Command error:
  Traceback (most recent call last):
    File "/opt/conda/envs/qiime2-2023.7/lib/python3.8/site-packages/q2cli/builtin/tools.py", line 266, in import_data
      artifact = qiime2.sdk.Artifact.import_data(type, input_path,
    File "/opt/conda/envs/qiime2-2023.7/lib/python3.8/site-packages/qiime2/sdk/result.py", line 299, in import_data
      pm = qiime2.sdk.PluginManager()
    File "/opt/conda/envs/qiime2-2023.7/lib/python3.8/site-packages/qiime2/sdk/plugin_manager.py", line 67, in __new__
      self._init(add_plugins=add_plugins)
    File "/opt/conda/envs/qiime2-2023.7/lib/python3.8/site-packages/qiime2/sdk/plugin_manager.py", line 105, in _init
      plugin = entry_point.load()
    File "/opt/conda/envs/qiime2-2023.7/lib/python3.8/site-packages/pkg_resources/__init__.py", line 2518, in load
      return self.resolve()
    File "/opt/conda/envs/qiime2-2023.7/lib/python3.8/site-packages/pkg_resources/__init__.py", line 2524, in resolve
      module = __import__(self.module_name, fromlist=['__name__'], level=0)
    File "/opt/conda/envs/qiime2-2023.7/lib/python3.8/site-packages/q2_diversity/__init__.py", line 11, in <module>
      from ._beta import (beta, beta_phylogenetic, bioenv,
    File "/opt/conda/envs/qiime2-2023.7/lib/python3.8/site-packages/q2_diversity/_beta/__init__.py", line 13, in <module>
      from ._beta_rarefaction import beta_rarefaction
    File "/opt/conda/envs/qiime2-2023.7/lib/python3.8/site-packages/q2_diversity/_beta/_beta_rarefaction.py", line 23, in <module>
      from .._ordination import pcoa
    File "/opt/conda/envs/qiime2-2023.7/lib/python3.8/site-packages/q2_diversity/_ordination.py", line 20, in <module>
      import umap as up
    File "/opt/conda/envs/qiime2-2023.7/lib/python3.8/site-packages/umap/__init__.py", line 2, in <module>
      from .umap_ import UMAP
    File "/opt/conda/envs/qiime2-2023.7/lib/python3.8/site-packages/umap/umap_.py", line 41, in <module>
      from umap.layouts import (
    File "/opt/conda/envs/qiime2-2023.7/lib/python3.8/site-packages/umap/layouts.py", line 40, in <module>
      def rdist(x, y):
    File "/opt/conda/envs/qiime2-2023.7/lib/python3.8/site-packages/numba/core/decorators.py", line 234, in wrapper
      disp.enable_caching()
    File "/opt/conda/envs/qiime2-2023.7/lib/python3.8/site-packages/numba/core/dispatcher.py", line 863, in enable_caching
      self._cache = FunctionCache(self.py_func)
    File "/opt/conda/envs/qiime2-2023.7/lib/python3.8/site-packages/numba/core/caching.py", line 601, in __init__
      self._impl = self._impl_class(py_func)
    File "/opt/conda/envs/qiime2-2023.7/lib/python3.8/site-packages/numba/core/caching.py", line 337, in __init__
      raise RuntimeError("cannot cache function %r: no locator available "
  RuntimeError: cannot cache function 'rdist': no locator available for file '/opt/conda/envs/qiime2-2023.7/lib/python3.8/site-packages/umap/layouts.py'

  An unexpected error has occurred:

  cannot cache function 'rdist': no locator available for file '/opt/conda/envs/qiime2-2023.7/lib/python3.8/site-packages/umap/layouts.py'

  See above for debug info.

Work dir:
  /home/bcgsd01/test_ampliseq/work/08/8a446d3afbec175518f6be0768a7d6

Tip: when you have fixed the problem you can continue the execution adding the option `-resume` to the run command line

 -- Check '.nextflow.log' file for details
```
I think `/tmp` is the better choice nonetheless. What happens with this setting?

```bash
export MPLCONFIGDIR="/tmp/mplconfigdir"
export NUMBA_CACHE_DIR="/tmp/numbacache"
```
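In a module script, the suggested fix might look like the following sketch (the `mkdir -p` calls are my addition, not part of the suggestion; Matplotlib and Numba only need the variables to point at a writable location):

```shell
# Sketch of the suggested fix inside a module script (assumed form, not
# the actual ampliseq code): point both caches at writable directories
# and make sure they exist before the tool starts.
export MPLCONFIGDIR="/tmp/mplconfigdir"
export NUMBA_CACHE_DIR="/tmp/numbacache"
mkdir -p "$MPLCONFIGDIR" "$NUMBA_CACHE_DIR"
```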
Great, thanks, that did work indeed!
Why would `/tmp` be better? Is that common practice in nf-core? (I am not aware of an example; do you know one?) I just worry a bit about clouds (I cannot test that, except the AWS test).
I checked for examples in pipelines and nf-core/modules and found only very few:

- nf-core/modules:
- nf-core/marsseq:
- nf-core/rnafusion:

So nf-core modules seem to put tmp and home into the work dir, while the two local modules in pipelines use `/tmp`. This pipeline uses `export XDG_CONFIG_HOME="\${PWD}/HOME"`, e.g. here.
Not sure what to conclude here, so many different solutions :)
Another example for your list, @d4straub: `./tmp` is this module. The reason for choosing `./tmp` instead of letting the tool use `/tmp` is that when a job is killed on an HPC, it leaves leftover files on `/tmp` which take up space and may get in the way of other users and processes. By keeping the files local to `./tmp`, everything stays within the user's own work directory, which they can easily clean up.
In your case (and I know nothing about the tool, so I'm just guessing), if it's a "cache", then presumably the tool will not clean it up at the end, since the purpose of a cache is to keep files around for the next run. Using `/tmp` and running the pipeline on an HPC would mean that every run likely hits a different compute node, and the cluster may accumulate caches on all compute nodes over time! That is not the purpose of `/tmp` :)
So either consider it a purely temporary necessity and use `./tmp`, or make it a proper pipeline parameter that the user can set to wherever they want and reuse between runs. (By the way, would it even make sense to share the cache between pipeline runs?)
Thanks for that great explanation! I'll test `./tmp` to make sure it'll work fine.
I researched where the older export code in ampliseq comes from and found it was added in https://github.com/nf-core/ampliseq/pull/163. In that PR, `export HOME=./HOME` was changed to `export HOME="\${PWD}/HOME"` here, because @skrakau suggested "Better not use relative paths". Since there are now some examples with relative paths and I found none with absolute paths, I prefer the relative paths. Or are you aware of any problems regarding relative paths, Sabrina?
I think the nf-core #modules channel would be the appropriate place to get some more input on the issue.
I am certainly not an expert in this matter, but @muffato's explanation strikes me as strange. `/tmp` is the default path in the Linux file system hierarchy for temporary files; its sole purpose is to offer applications a dedicated place to store them. These files are generally deleted whenever the system is restarted and may be deleted at any time by utilities such as tmpwatch. On HPC systems, `/tmp` is usually configured to mount a specific scratch file system that is better suited to quick caching and writes than the ordinary distributed file systems, and is cleaned up when needed.
The reason why you do not find many modules or pipelines with explicit configuration is that many tools just use `/tmp` for caching files, and the respective container technologies mount the host's `/tmp` there. All the relevant config happens at the profile level, and Nextflow has corresponding config options for the different executors (e.g. for AWS) and container technologies.
In summary, I advocate using `/tmp`, since it is the path that is specifically meant for this purpose and is also configured accordingly on the different executors.
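For illustration, the kind of profile-level configuration meant here might look like this in a Nextflow config (`singularity.runOptions` is a real config option; the host path is made up and site-specific):

```nextflow
// Sketch: bind a node-local scratch area to /tmp inside the container,
// so tools that cache under /tmp transparently use the scratch space.
singularity {
    enabled    = true
    runOptions = '--bind /scratch/$USER/tmp:/tmp'
}
```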
Thanks a lot for your input!
I did get the feedback, however, that `/tmp` is not a good choice on our HPC, for example, because it uses a scratch system that is not necessarily connected to `/tmp`. Therefore I was advised to use `./tmp` or similar (which will then use the scratch system as intended). I checked the size of the folders I want to redirect, and it's just a couple of MB at most. So I'm going to use `./<folder>` for now.
Fair comment, @MatthiasZepper. You're absolutely right about best practice on an HPC, and what you're describing is exactly how things work on ours. One of the problems we're seeing is that tmpwatch does not run soon enough, and we often run out of space on `/tmp` (which causes us other problems, like nodes having less RAM, etc.). But I think it's reasonable to say that's our issue, and I probably over-reached in my previous comment.
I was recently wondering if I could force something like `export TMPDIR=$PWD/tmp` at the beginning of each Nextflow job, rather than having to change every module, but I didn't know how. Thanks to your links, I've found `process.beforeScript`, which seems to do exactly what I was looking for.
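Something along these lines in nextflow.config, presumably (an untested sketch on my part):

```nextflow
// Sketch: run a snippet before every job's script, pointing TMPDIR at a
// directory inside the task work dir so killed jobs leave nothing on /tmp.
process.beforeScript = 'mkdir -p "$PWD/tmp" && export TMPDIR="$PWD/tmp"'
```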
Coming back to the original error: we're looking for a place to let the tool record some cache files that it won't delete at the end. My view is still that either it is treated as a proper cache and made a proper Nextflow parameter that the user can select, which will be staged and can be reused between processes and runs; or it is considered temporary files that can be trashed after the run. In the latter case, I wouldn't support using `/tmp` (or `$TMPDIR`) without deleting the files directly afterwards. I think the module should clean up after itself. And actually, it should probably do that even if you end up using a local directory under the job directory.
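A module that cleans up after itself could use a shell trap, roughly like this (an illustrative sketch of the idea, not current ampliseq code; the trap fires on both success and failure):

```shell
# Run the tool with a job-local cache that is removed when the script
# exits. The function body uses ( ... ), so it runs in a subshell and
# the EXIT trap fires as soon as the function returns.
run_with_local_cache() (
    export NUMBA_CACHE_DIR="$PWD/tmp/numbacache"
    mkdir -p "$NUMBA_CACHE_DIR"
    trap 'rm -rf "$PWD/tmp"' EXIT
    # ... the actual tool invocation would go here ...
    touch "$NUMBA_CACHE_DIR/placeholder"   # stand-in for cache files
)
run_with_local_cache
```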
Thanks to all of you here! The fix is in the dev branch now and will be in the next release. I'll close this issue, but feel free to open another one if you encounter any other problems.
**Description of the bug**

FYI, ASV_seqs.fasta is present in the work directory (b12ddbfb05b35bef6eb415cb9f1ef0). I've no clue about the origin of the error.

**Command used and terminal output**

**Relevant files**

No response

**System information**

Nextflow version: 23.10.0 build 5889
Hardware: Desktop
Executor: local
Container engine: Singularity
OS: Linux
Version of nf-core/ampliseq: 2.7.0