Closed suegrimes closed 11 months ago
Hi @suegrimes
This error appears to be caused by some third party code (numba) being unable to access a cache directory. We are looking into the problem.
In the meantime, would you be able to run the workflow again in case this was caused by a temporary network problem?
Hi Neil,
Thanks for following up. I tried running again and get the same error messages,
Sue.
From: Neil Horner @.> Sent: Wednesday, September 20, 2023 2:44 AM To: epi2me-labs/wf-single-cell @.> Cc: Sue M Grimes @.>; Mention @.> Subject: Re: [epi2me-labs/wf-single-cell] ERROR: (numba) unable to cache 'rdist' (Issue #47)
Hi Sue,
Thanks for confirming that. I will be looking into this shortly.
Neil
Hi both
I seem to have the same problem with 'rdist':
Caused by:
Process `pipeline:process_bams:stringtie (2)` terminated with an error exit status (1)
Command executed:
# Data from 3prime and multiome kits must be flipped to the transcript strand before building transcriptome.
workflow-glue process_bam_for_stringtie align.bam 10 | tee >(stringtie -L -c 2 -p 4 -G chr.gtf -l stringtie -o stringtie.gff - ) | samtools fastq > reads.fastq
# Get transcriptome sequence
gffread -g ref_genome.fa -w transcriptome.fa stringtie.gff
Command exit status:
1
Command output:
(empty)
Command error:
INFO: Converting SIF file to temporary sandbox...
WARNING: underlay of /etc/localtime required more than 50 (78) bind mounts
/home/epi2melabs/conda/lib/python3.8/site-packages/umap/distances.py:1063: NumbaDeprecationWarning: The 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.
@numba.jit()
/home/epi2melabs/conda/lib/python3.8/site-packages/umap/distances.py:1071: NumbaDeprecationWarning: The 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.
@numba.jit()
/home/epi2melabs/conda/lib/python3.8/site-packages/umap/distances.py:1086: NumbaDeprecationWarning: The 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.
@numba.jit()
Traceback (most recent call last):
File "/home/projects/cu_10027/data/projects/gbm/data/data_processed/rigshospitalet/rna/ont/wf-single-cell/bin/workflow-glue", line 7, in <module>
cli()
File "/home/projects/cu_10027/data/projects/gbm/data/data_processed/rigshospitalet/rna/ont/wf-single-cell/bin/workflow_glue/__init__.py", line 61, in cli
f'{_package_name}.{comp}' for comp in get_components()]
File "/home/projects/cu_10027/data/projects/gbm/data/data_processed/rigshospitalet/rna/ont/wf-single-cell/bin/workflow_glue/__init__.py", line 26, in get_components
mod = importlib.import_module(f"{_package_name}.{name}")
File "/home/epi2melabs/conda/lib/python3.8/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 843, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/home/projects/cu_10027/data/projects/gbm/data/data_processed/rigshospitalet/rna/ont/wf-single-cell/bin/workflow_glue/umap_reduce.py", line 7, in <module>
import umap
File "/home/epi2melabs/conda/lib/python3.8/site-packages/umap/__init__.py", line 2, in <module>
from .umap_ import UMAP
File "/home/epi2melabs/conda/lib/python3.8/site-packages/umap/umap_.py", line 41, in <module>
from umap.layouts import (
File "/home/epi2melabs/conda/lib/python3.8/site-packages/umap/layouts.py", line 40, in <module>
def rdist(x, y):
File "/home/epi2melabs/conda/lib/python3.8/site-packages/numba/core/decorators.py", line 234, in wrapper
disp.enable_caching()
File "/home/epi2melabs/conda/lib/python3.8/site-packages/numba/core/dispatcher.py", line 863, in enable_caching
self._cache = FunctionCache(self.py_func)
File "/home/epi2melabs/conda/lib/python3.8/site-packages/numba/core/caching.py", line 601, in __init__
self._impl = self._impl_class(py_func)
File "/home/epi2melabs/conda/lib/python3.8/site-packages/numba/core/caching.py", line 337, in __init__
raise RuntimeError("cannot cache function %r: no locator available "
RuntimeError: cannot cache function 'rdist': no locator available for file '/home/epi2melabs/conda/lib/python3.8/site-packages/umap/layouts.py'
WARNING: no reference transcripts were found for the genomic sequences where reads were mapped!
Please make sure the -G annotation file uses the same naming convention for the genome sequences.
Failed to read header for "-"
INFO: Cleaning up image...
Josephine
Hi @Josephinedh
Thanks for additionally reporting this error. I will look into this tomorrow and get back to you.
@Josephinedh
I think the UMAP/numba stuff in the logs might be a red herring. This bit at the bottom of the log is probably what we're interested in:
WARNING: no reference transcripts were found for the genomic sequences where reads were mapped!
Please make sure the -G annotation file uses the same naming convention for the genome sequences.
Failed to read header for "-"
Could you check that the annotation file and the genome file use the same naming convention for the genome sequences, please? If they do and you are still hitting problems, please raise another ticket.
Thanks
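For anyone wanting to run this check quickly, here is a minimal Python sketch. It assumes the genome is a plain or gzipped FASTA and the annotation is a tab-separated GTF/GFF whose first column holds the sequence name. The function names (`fasta_names`, `gtf_names`, `compare_names`) are made up for illustration and are not part of the workflow.

```python
import gzip
from typing import Dict, Set


def _open_text(path: str):
    """Open a plain or gzip-compressed text file for reading."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "rt")


def fasta_names(path: str) -> Set[str]:
    """Collect sequence names: the first whitespace-delimited token of each '>' header."""
    names: Set[str] = set()
    with _open_text(path) as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                if fields:
                    names.add(fields[0])
    return names


def gtf_names(path: str) -> Set[str]:
    """Collect the values of the first (sequence name) column, skipping comments."""
    names: Set[str] = set()
    with _open_text(path) as handle:
        for line in handle:
            if line.startswith("#") or not line.strip():
                continue
            names.add(line.split("\t", 1)[0])
    return names


def compare_names(fasta_path: str, gtf_path: str) -> Dict[str, Set[str]]:
    """Report which sequence names are shared and which appear in only one file."""
    genome = fasta_names(fasta_path)
    annotation = gtf_names(gtf_path)
    return {
        "shared": genome & annotation,
        "only_in_genome": genome - annotation,
        "only_in_annotation": annotation - genome,
    }
```

Calling `compare_names("ref_genome.fa", "chr.gtf")` (the file names from the command in the log above) and finding an empty `shared` set would point to a naming mismatch between the two files.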
@suegrimes
In the case of @Josephinedh, I think the numba errors in the log were probably not the cause of the crash, as some error messages relating to the process in question were also there.
In your case, I don't see any error message from the adapter_scan process where the workflow exited.
Would you be able to post the output of ls -lhL /mnt/ix2/Sandbox/sgrimes/20230911_SG_wf_scell/A01_demo/output/workspace/ac/8f427baf6b89bfaa494e3b08cfcbe2
please?
Hi Neil,
Here’s the output you requested, which doesn’t seem to say much.
ls -lhL /mnt/ix2/Sandbox/sgrimes/20230911_SG_wf_scell/A01_demo/output/workspace/ac/8f427baf6b89bfaa494e3b08cfcbe2
total 126M
-rw-r--r-- 1 sgrimes root 126M Sep 18 21:25 chunk.fq.gz
Sue.
Hi @suegrimes
I'm not sure what's happening here. Do you have the ability to run with Docker (-profile docker)?
Neil
Hi Neil,
We actually can’t use Docker (Stanford doesn’t allow it due to potential security issues). It is possible that there was some issue with the Singularity install, I suppose. Our IT support person installed it on the server specifically for this use, so it is my first time using it. I noticed there is a -profile local, and I did try that initially, but I don’t remember what issue I had with it. Is it possible to use -profile local? What would be required for that to work? I presume ensuring some specific dependencies are installed?
Sue.
Hi Neil,
As I mentioned, I actually don’t have the ability to run using Docker, and I'm not really sure how to troubleshoot if this is potentially a Singularity issue. FYI, the Singularity version that is installed is 2.6.1, in case this is a version issue. I'm wondering if it is possible to just set up a conda environment and run there? I did briefly try, and installed fastcat, seqkit, a couple of Python packages and gffread. But it looks like there are more things to install, and it is a bit tedious to do them one by one as error messages come up. Do you have a list of software that needs to be available/installed for this pipeline to run successfully?
Sue.
I got the same rdist error when running the pipeline using singularity (3.7.1) on our school's HPC, for the same reasons we are not allowed to use Docker. I also turned to use the local profile for now... would be nice to have a file like environment.yml to help set up a conda environment, thanks!
Hi @chilampoon
Could you post your .nextflow.log please? Using the local profile, you do not encounter this problem?
hi @nrhorner the message is like
[21:47:48 - matplotlib] Matplotlib created a temporary cache directory at /tmp/matplotlib-mx2dhpr8 because the default path (/home/poonc2/.config/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
Fontconfig error: No writable cache directories
[21:47:49 - matplotlib.font_manager] generated new fontManager
/home/epi2melabs/conda/lib/python3.8/site-packages/umap/distances.py:1063: NumbaDeprecationWarning: The 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.
@numba.jit()
/home/epi2melabs/conda/lib/python3.8/site-packages/umap/distances.py:1071: NumbaDeprecationWarning: The 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.
@numba.jit()
/home/epi2melabs/conda/lib/python3.8/site-packages/umap/distances.py:1086: NumbaDeprecationWarning: The 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.
@numba.jit()
Traceback (most recent call last):
File "/home/poonc2/.nextflow/assets/epi2me-labs/wf-single-cell/bin/workflow-glue", line 7, in <module>
cli()
File "/home/poonc2/.nextflow/assets/epi2me-labs/wf-single-cell/bin/workflow_glue/__init__.py", line 61, in cli
f'{_package_name}.{comp}' for comp in get_components()]
File "/home/poonc2/.nextflow/assets/epi2me-labs/wf-single-cell/bin/workflow_glue/__init__.py", line 26, in get_components
mod = importlib.import_module(f"{_package_name}.{name}")
File "/home/epi2melabs/conda/lib/python3.8/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 843, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/home/poonc2/.nextflow/assets/epi2me-labs/wf-single-cell/bin/workflow_glue/umap_reduce.py", line 7, in <module>
import umap
File "/home/epi2melabs/conda/lib/python3.8/site-packages/umap/__init__.py", line 2, in <module>
from .umap_ import UMAP
File "/home/epi2melabs/conda/lib/python3.8/site-packages/umap/umap_.py", line 41, in <module>
from umap.layouts import (
File "/home/epi2melabs/conda/lib/python3.8/site-packages/umap/layouts.py", line 40, in <module>
def rdist(x, y):
File "/home/epi2melabs/conda/lib/python3.8/site-packages/numba/core/decorators.py", line 234, in wrapper
disp.enable_caching()
File "/home/epi2melabs/conda/lib/python3.8/site-packages/numba/core/dispatcher.py", line 863, in enable_caching
self._cache = FunctionCache(self.py_func)
File "/home/epi2melabs/conda/lib/python3.8/site-packages/numba/core/caching.py", line 601, in __init__
self._impl = self._impl_class(py_func)
File "/home/epi2melabs/conda/lib/python3.8/site-packages/numba/core/caching.py", line 337, in __init__
raise RuntimeError("cannot cache function %r: no locator available "
RuntimeError: cannot cache function 'rdist': no locator available for file '/home/epi2melabs/conda/lib/python3.8/site-packages/umap/layouts.py'
It changed a little bit after I changed something, but the rdist or cache issue is always there. I changed the cache dir $PATH for numba, matplotlib, etc., but it didn't solve the problem (maybe it needs to be specified in the container environment, but I am not sure).
The local environment works and the pipeline is still running. One thing I noticed is that for Python libraries like matplotlib, biopython and such, I also need to install them via conda, not pip. Maybe it's because I also need to specify LIB PATH for nextflow to access them.
@chilampoon, @Josephinedh, @suegrimes
Sorry for taking so long to have another look at this. This does seem to be a problem with singularity and numba (which is used by the umap library). It seems that numba cannot access the default directories it uses for caching. See https://numba.pydata.org/numba-doc/dev/reference/envvars.html#envvar-NUMBA_CACHE_DIR for the locations of these.
Could you try setting the numba cache directory to a known writable location (perhaps /tmp)?
You will need to create a config file, let's call it nextflow_env.config, with the following entry:
env {
NUMBA_CACHE_DIR='/tmp'
}
Or add NUMBA_CACHE_DIR='/tmp' to an existing env scope block in your current config.
Point the workflow to the config with -c nextflow_env.config
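If it is unclear which location is actually writable, the choice can be scripted. The sketch below is only an illustration: `first_writable_dir` and `render_env_config` are hypothetical helper names, and the check runs on the host, so it assumes the chosen path is also visible and writable inside the container.

```python
import os
import tempfile
from pathlib import Path
from typing import Iterable, Optional


def first_writable_dir(candidates: Iterable[str]) -> Optional[str]:
    """Return the first candidate directory in which a file can actually be created."""
    for candidate in candidates:
        path = Path(candidate)
        if not path.is_dir():
            continue
        try:
            # Creating a real file is a stronger check than os.access(), which
            # can be misleading on read-only or overlay file systems.
            with tempfile.NamedTemporaryFile(dir=path):
                pass
        except OSError:
            continue
        return str(path)
    return None


def render_env_config(cache_dir: str) -> str:
    """Build the text of a Nextflow config whose env scope sets NUMBA_CACHE_DIR."""
    return "env {\n    NUMBA_CACHE_DIR='%s'\n}\n" % cache_dir


if __name__ == "__main__":
    # Candidate list is an assumption; adjust it for your system.
    chosen = first_writable_dir(["/tmp", tempfile.gettempdir(), os.getcwd()])
    if chosen is None:
        print("no writable directory found among the candidates")
    else:
        # Save this text as nextflow_env.config and pass it with -c.
        print(render_env_config(chosen))
```

The printed text can be saved as nextflow_env.config and passed to the workflow with -c as described above.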
thanks @nrhorner, sure I could try. Also the running time is very long (I am using a local environment): my smallest fastq file (~60GB) is in the cluster umi step, but it's been running for around 3 or 4 days. I wonder if using a container would help speed it up.
@chilampoon The use of the container vs local profile should not affect execution speed. What resources are you giving the workflow?
I requested 8 CPUs on HPC, and I forgot to set the number of CPUs in each subworkflow in the beginning, I then changed all time-consuming steps to use 8 CPUs. I set the max num of threads to 14 as well as for minimap2. However the whole pipeline is still very slow.
It seems like fastscan, alignment, and cluster_umi and maybe some others take a very long time. The requested time of my job is 100 hours then after that the workflow got killed, most samples were in cluster_umis or one or two steps above; and then if I resume, they all started from fastscan again... The pipeline is driving me crazy... is it possible to start from aligned bam file but not fastq? I'll give a try to the old sockeye snakemake workflow as I remembered when I used it last year it was not that long
Hi @chilampoon, is the original issue in this ticket now fixed, i.e. are you no longer getting the numba cache error?
If so could you open a new ticket detailing your issues relating to the speed of the workflow. Please include your log and your config file used.
It's currently not possible to start with an aligned BAM input, but we could consider adding that feature.
Closing due to lack of response regarding the original issue, which I assume the workaround has fixed.
Just for reference. I ran into the same issue with the workflow (using singularity on our hpc) and the workaround (setting the NUMBA_CACHE_DIR) solved it.
That's good to know, thanks @AshKernow
I just ran into this same issue.
@nrhorner a simple, permanent fix is to update the nextflow.config file to:
env {
PYTHONNOUSERSITE = 1
NUMBA_CACHE_DIR = "${baseDir}/numba_cache"
}
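For context on what the NUMBA_CACHE_DIR setting does, here is a rough in-process equivalent in Python. It rests on the assumption that numba reads NUMBA_CACHE_DIR from the environment when it is first imported, so the variable has to be set before `import umap` or `import numba`. `configure_numba_cache` is a hypothetical helper, not something shipped with the workflow.

```python
import os
import sys
import warnings
from pathlib import Path


def configure_numba_cache(cache_dir: str) -> str:
    """Create cache_dir if needed and export it as NUMBA_CACHE_DIR."""
    if "numba" in sys.modules:
        # Assumption: numba picks up the variable at import time, so a change
        # made afterwards may not take effect in this process.
        warnings.warn("numba is already imported; NUMBA_CACHE_DIR may be ignored")
    path = Path(cache_dir).expanduser().resolve()
    path.mkdir(parents=True, exist_ok=True)
    os.environ["NUMBA_CACHE_DIR"] = str(path)
    return str(path)
```

A script would call `configure_numba_cache("numba_cache")` as its first step and only then import umap. Setting the variable in the env scope of nextflow.config, as above, achieves the same for every process without touching any Python code.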
Operating System
Other Linux (please specify below)
Other Linux
Ubuntu 20.04
Workflow Version
v0.2.7-g9272e2c
Workflow Execution
Command line
EPI2ME Version
No response
CLI command run
nextflow run epi2me-labs/wf-single-cell -w output/workspace -profile singularity \
    --fastq wf-single-cell-demo/chr17.fq.gz \
    --kit_name 3prime \
    --kit_version v3 \
    --expected_cells 100 \
    --ref_genome_dir wf-single-cell-demo/ \
    --out_dir output \
    --plot_umaps
Workflow Execution - CLI Execution Profile
singularity
What happened?
Pipeline failed with error, no output
Relevant log output
Application activity log entry
No response