epi2me-labs / wf-single-cell


ERROR ~ Error executing process > 'fastcat (1)' Caused by: Process `fastcat (1)` terminated with an error exit status (2) #58

Closed lscdanson closed 6 months ago

lscdanson commented 1 year ago

Ask away!

I was running the pipeline as a job (with Slurm) on my institution's clusters with the following input:

#!/bin/sh
#SBATCH --time=2-00:00:00
#SBATCH --ntasks=32
#SBATCH --mem=400G
#SBATCH --partition=long

nextflow run epi2me-labs/wf-single-cell \
    -profile singularity \
    -c /path/to/my_config.cfg \
    --max_threads 32 \
    --fastq /path/to/sample.fastq.gz \
    --kit_name 3prime \
    --kit_version v3 \
    --ref_genome_dir /path/to/refdata-gex-GRCh38-2020-A \
    --out_dir /path/to/output/dir/ \
    --plot_umaps \
    --merge_bam

And the pipeline got terminated with the following error:

executor > local (6)
[ef/e3e72e] process > fastcat (1) [100%] 1 of 1, failed: 1 ✘
[ac/4dc902] process > pipeline:getVersions [100%] 1 of 1 ✔
[84/47ef1e] process > pipeline:getParams [100%] 1 of 1 ✔
[- ] process > pipeline:summariseCatChunkR... -
[- ] process > pipeline:stranding:callada... -
[- ] process > pipeline:stranding:combine... -
[- ] process > pipeline:stranding:summariz... -
[90/bd602e] process > pipeline:align:call_paftools [100%] 1 of 1 ✔
[c6/52178e] process > pipeline:align:get_chrom_sizes [100%] 1 of 1 ✔
[- ] process > pipeline:align:align_to_ref -
[37/65e989] process > pipeline:process_bams:split... [100%] 1 of 1 ✔
[- ] process > pipeline:process_bams:get_c... -
[- ] process > pipeline:process_bams:extra... -
[- ] process > pipeline:process_bams:combi... -
[- ] process > pipeline:process_bams:gener... -
[- ] process > pipeline:process_bams:assig... -
[- ] process > pipeline:process_bams:strin... -
[- ] process > pipeline:process_bams:align... -
[- ] process > pipeline:process_bams:assig... -
[- ] process > pipeline:process_bams:clust... -
[- ] process > pipeline:process_bams:tag_bams -
[- ] process > pipeline:process_bams:combi... -
[- ] process > pipeline:process_bams:combi... -
[- ] process > pipeline:process_bams:umi_g... -
[- ] process > pipeline:process_bams:const... -
[- ] process > pipeline:process_bams:proce... -
[- ] process > pipeline:process_bams:umap... -
[- ] process > pipeline:process_bams:combi... -
[- ] process > pipeline:process_bams:pack... -
[- ] process > pipeline:prepare_report_data -
[- ] process > pipeline:makeReport -
[- ] process > output -
[- ] process > output_report -
ERROR ~ Error executing process > 'fastcat (1)'

Caused by: Process fastcat (1) terminated with an error exit status (2)

Command executed:

mkdir fastcat_stats
fastcat -s sample2_unstim_02July_BioRep2 -r >(bgzip -c > fastcat_stats/per-read-stats.tsv.gz) -f fastcat_stats/per-file-stats.tsv input | bgzip > seqs.fastq.gz

# extract the run IDs from the per-read stats
csvtk cut -tf runid fastcat_stats/per-read-stats.tsv.gz | csvtk del-header | sort | uniq > fastcat_stats/run_ids

Command exit status: 2

Command output: (empty)

Command error:

INFO: Converting SIF file to temporary sandbox...
WARNING: underlay of /etc/localtime required more than 50 (78) bind mounts
sort: cannot create temporary file in '/var/scratch/dloi/543641/': No such file or directory
INFO: Cleaning up image...

I've tried creating a new environment with only Nextflow and Singularity installed, both updated to their latest versions, but it still didn't work. Could you advise?

lscdanson commented 1 year ago

Please also find the log file here: nextflow.log

nrhorner commented 1 year ago

Hi @lscdanson This seems to be related to this issue: https://github.com/nf-core/chipseq/issues/123.

Could you try exporting the temp directory path before running Nextflow? Here I've used /tmp/, but it can be any writable folder.

export TMPDIR=/tmp/
nextflow run ...
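In a Slurm submission script, the export just needs to appear before the nextflow call. A minimal sketch based on the script posted above — the scratch path here is a placeholder for any directory the compute nodes can write to:

#!/bin/sh
#SBATCH --time=2-00:00:00
#SBATCH --ntasks=32
#SBATCH --mem=400G
#SBATCH --partition=long

# placeholder: point this at any directory the compute node can write to
export TMPDIR=/path/to/writable/scratch
mkdir -p "$TMPDIR"

nextflow run epi2me-labs/wf-single-cell \
    -profile singularity \
    -c /path/to/my_config.cfg \
    --max_threads 32 \
    --fastq /path/to/sample.fastq.gz \
    --kit_name 3prime \
    --kit_version v3 \
    --ref_genome_dir /path/to/refdata-gex-GRCh38-2020-A \
    --out_dir /path/to/output/dir/ \
    --plot_umaps \
    --merge_bam
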
lscdanson commented 1 year ago

Hi @nrhorner, thanks a lot for your advice. I've incorporated it and got past the fastcat step, but I've hit another error while running the stranding:call_adapter_scan step:

executor > local (24)
[61/2f4466] process > fastcat (1) [100%] 1 of 1 ✔
[45/fb03dd] process > pipeline:getVersions [100%] 1 of 1 ✔
[7c/ecf13a] process > pipeline:getParams [100%] 1 of 1 ✔
[bc/5be4f3] process > pipeline:summariseCatChunkR... [100%] 1 of 1 ✔
[e2/74a8c6] process > pipeline:stranding:callada... [ 3%] 1 of 26, failed: 1
[- ] process > pipeline:stranding:combine... -
[- ] process > pipeline:stranding:summariz... -
[38/37959f] process > pipeline:align:call_paftools [100%] 1 of 1 ✔
[e9/aceb45] process > pipeline:align:get_chrom_sizes [100%] 1 of 1 ✔
[- ] process > pipeline:align:align_to_ref -
[a0/6de6de] process > pipeline:process_bams:split... [100%] 1 of 1 ✔
[- ] process > pipeline:process_bams:get_c... -
[- ] process > pipeline:process_bams:extra... -
[- ] process > pipeline:process_bams:combi... -
[- ] process > pipeline:process_bams:gener... -
[- ] process > pipeline:process_bams:assig... -
[- ] process > pipeline:process_bams:strin... -
[- ] process > pipeline:process_bams:align... -
[- ] process > pipeline:process_bams:assig... -
[- ] process > pipeline:process_bams:clust... -
[- ] process > pipeline:process_bams:tag_bams -
[- ] process > pipeline:process_bams:combi... -
[- ] process > pipeline:process_bams:combi... -
[- ] process > pipeline:process_bams:umi_g... -
[- ] process > pipeline:process_bams:const... -
[- ] process > pipeline:process_bams:proce... -
[- ] process > pipeline:process_bams:umap... -
[- ] process > pipeline:process_bams:combi... -
[- ] process > pipeline:process_bams:pack... -
[- ] process > pipeline:prepare_report_data -
[- ] process > pipeline:makeReport -
[- ] process > output -
[- ] process > output_report -
ERROR ~ Error executing process > 'pipeline:stranding:call_adapter_scan (10)'

Caused by: Process pipeline:stranding:call_adapter_scan (10) terminated with an error exit status (1)

Command executed:

export POLARS_MAX_THREADS=2

workflow-glue adapter_scan_vsearch chunk.fq.gz --kit 3prime --output_fastq "sample2_unstim_02July_BioRep2_adapt_scan.fastq.gz" --output_tsv "sample2_unstim_02July_BioRep2_adapt_scan.tsv"

Command exit status: 1

Command output: (empty)

Command error:

INFO: Converting SIF file to temporary sandbox...
WARNING: underlay of /etc/localtime required more than 50 (78) bind mounts
[16:45:56 - matplotlib] Matplotlib created a temporary cache directory at /tmp/matplotlib-fzxns3zl because the default path (/home/d/dloi/.config/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
Fontconfig error: No writable cache directories
[16:45:56 - matplotlib.font_manager] generated new fontManager
/home/epi2melabs/conda/lib/python3.8/site-packages/umap/distances.py:1063: NumbaDeprecationWarning: The 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.
  @numba.jit()
/home/epi2melabs/conda/lib/python3.8/site-packages/umap/distances.py:1071: NumbaDeprecationWarning: The 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.
  @numba.jit()
/home/epi2melabs/conda/lib/python3.8/site-packages/umap/distances.py:1086: NumbaDeprecationWarning: The 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.
  @numba.jit()
Traceback (most recent call last):
  File "/home/d/dloi/.nextflow/assets/epi2me-labs/wf-single-cell/bin/workflow-glue", line 7, in 
    cli()
  File "/home/d/dloi/.nextflow/assets/epi2me-labs/wf-single-cell/bin/workflow_glue/__init__.py", line 61, in cli
    f'{_package_name}.{comp}' for comp in get_components()]
  File "/home/d/dloi/.nextflow/assets/epi2me-labs/wf-single-cell/bin/workflow_glue/__init__.py", line 26, in get_components
    mod = importlib.import_module(f"{_package_name}.{name}")
  File "/home/epi2melabs/conda/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "", line 1014, in _gcd_import
  File "", line 991, in _find_and_load
  File "", line 975, in _find_and_load_unlocked
  File "", line 671, in _load_unlocked
  File "", line 843, in exec_module
  File "", line 219, in _call_with_frames_removed
  File "/home/d/dloi/.nextflow/assets/epi2me-labs/wf-single-cell/bin/workflow_glue/umapreduce.py", line 7, in 
    import umap
  File "/home/epi2melabs/conda/lib/python3.8/site-packages/umap/__init__.py", line 2, in 
    from .umap import UMAP
  File "/home/epi2melabs/conda/lib/python3.8/site-packages/umap/umap_.py", line 41, in 
    from umap.layouts import (
  File "/home/epi2melabs/conda/lib/python3.8/site-packages/umap/layouts.py", line 40, in 
    def rdist(x, y):
  File "/home/epi2melabs/conda/lib/python3.8/site-packages/numba/core/decorators.py", line 234, in wrapper
    disp.enable_caching()
  File "/home/epi2melabs/conda/lib/python3.8/site-packages/numba/core/dispatcher.py", line 863, in enable_caching
    self._cache = FunctionCache(self.py_func)
  File "/home/epi2melabs/conda/lib/python3.8/site-packages/numba/core/caching.py", line 601, in __init__
    self._impl = self._impl_class(py_func)
  File "/home/epi2melabs/conda/lib/python3.8/site-packages/numba/core/caching.py", line 337, in __init__
    raise RuntimeError("cannot cache function %r: no locator available "
RuntimeError: cannot cache function 'rdist': no locator available for file '/home/epi2melabs/conda/lib/python3.8/site-packages/umap/layouts.py'
INFO: Cleaning up image...

Would this be due to any package version issue?

lscdanson commented 1 year ago

nextflow_(2).log

Please find the log file here. Thanks!

nrhorner commented 1 year ago

Hi @lscdanson This is another issue related to singularity, which keeps cropping up. I will add it to the docs. For now please see https://github.com/epi2me-labs/wf-single-cell/issues/47#issuecomment-1775405339 for a solution

lscdanson commented 1 year ago

Hi @nrhorner, I followed your instructions and added the NUMBA_CACHE_DIR environment variable to a config file, but it still returned the same error. Any idea why it isn't picking up the changed directory?

Please find below the content of my config file in case I made any mistake:

executor {
    $local {
        cpus = 64
        memory = "400 GB"
    }
}

env {
    NUMBA_CACHE_DIR='absolute/path/to/tmp_numba'
}

lscdanson commented 1 year ago

nextflow_3.log

Please also find the log file here. Thanks.

nrhorner commented 1 year ago

Hi @lscdanson

Did you replace absolute/path/to/tmp_numba with an actual path that exists?

nrhorner commented 1 year ago

I'm guessing you did. But, it looks like the NUMBA_CACHE_DIR environment variable is not being set.

 RuntimeError: cannot cache function 'rdist': no locator available for file '/home/epi2melabs/conda/lib/python3.8/site-packages/umap/layouts.py'

Numba is still trying to create the cache in the same directory as the umap code. Can you confirm that /ceph/project/cribbslab/shared/proj005/analyses/10X_ehrlam/my_config.cfg contains the env block, and could you try adding the path you used for --out_dir (/ceph/project/cribbslab/shared/proj005/analyses/10X_ONT_PBMC_stimVSunstim/epi2me/2un2/) as the NUMBA_CACHE_DIR, please?
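One way to confirm whether the env scope is reaching the task is to grep the wrapper scripts Nextflow writes into the failed task's work directory — a minimal sketch, using the task hash (e2/74a8c6) reported in the run output above; adjust the path to wherever your work directory lives:

# from the launch directory; the hash prefix comes from the executor output above
cd work/e2/74a8c6*
grep -rn NUMBA_CACHE_DIR .command.* \
    || echo "NUMBA_CACHE_DIR is not being exported to this task"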

lscdanson commented 1 year ago

Hi @nrhorner, I tried adding my output directory as the NUMBA_CACHE_DIR and resumed the pipeline with the -resume flag, but it quickly aborted with the same error.

My config file:

executor {
    $local {
        cpus = 64
        memory = "400 GB"
    }
}

env {
    NUMBA_CACHE_DIR='/ceph/project/cribbslab/shared/proj005/analyses/10X_ONT_PBMC_stimVSunstim/epi2me/2un2'
}

Please find my log file here:

nextflow_4.log

nrhorner commented 1 year ago

Hi @lscdanson. Could you try the following before running the workflow

export NUMBA_CACHE_DIR='/ceph/project/cribbslab/shared/proj005/analyses/10X_ONT_PBMC_stimVSunstim/epi2me/2un2'
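If a plain export still doesn't reach the container, Singularity also injects any host variable carrying the SINGULARITYENV_ prefix into the container environment — this is a generic Singularity mechanism rather than something specific to this workflow, so treat it as one more thing to try:

export SINGULARITYENV_NUMBA_CACHE_DIR='/ceph/project/cribbslab/shared/proj005/analyses/10X_ONT_PBMC_stimVSunstim/epi2me/2un2'
nextflow run ...
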
lscdanson commented 1 year ago

Hi @nrhorner

It still returned the same error. I tried running the above line both directly at the prompt and within the Slurm job script.

nextflow_5.log

nrhorner commented 1 year ago

Hi @lscdanson

I'm not sure why this is still happening. Could you try one more thing, please? Would you be able to download the code from the repo and run it locally to see if that makes a difference? Go to https://github.com/epi2me-labs/wf-single-cell, click the green Code button, then Download ZIP, and unzip it. You should see the directory wf-single-cell.

Edit the wf-single-cell/nextflow.config by placing the NUMBA_CACHE_DIR entry at the bottom of the config so it looks like:

env {
    PYTHONNOUSERSITE = 1
    NUMBA_CACHE_DIR='/your_dir/'
}

Then run the workflow without supplying any -c configs:

nextflow run wf-single-cell/ {the rest of your args}   
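For reference, the same steps from the command line — the archive URL follows GitHub's standard layout and the master branch name is an assumption, so the Download ZIP button is the safer route if in doubt:

# download and unpack the repository (equivalent to Code > Download ZIP)
wget https://github.com/epi2me-labs/wf-single-cell/archive/refs/heads/master.zip
unzip master.zip
mv wf-single-cell-master wf-single-cell

# add the env block above to the bottom of wf-single-cell/nextflow.config, then run:
nextflow run wf-single-cell/ {the rest of your args}
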
PT806 commented 1 year ago

Hi @nrhorner, I ran into the same issue and tried the solutions you mentioned above, but none of them worked. What else could we try?

danielsj1-chop commented 12 months ago

I fixed this by adding runOptions = '--writable-tmpfs' to my singularity profile.

Thanks, John

cihaterdogan commented 11 months ago

Hi @danielsj1-chop, I'm trying to run the pipeline as a job (with Slurm) on my institution's clusters as well, but I'm getting different errors. I would be grateful if you could share the script you used here. I'm specifically wondering where you added the "--writable-tmpfs" command. Thanks!

nrhorner commented 11 months ago

I fixed this by adding runOptions = '--writable-tmpfs' to my singularity profile.

Thanks, John

Thanks @danielsj1-chop That's good to know, I might put this in a troubleshooting docs page

nrhorner commented 11 months ago

Hi @danielsj1-chop, I'm trying to run the pipeline as a job (with Slurm) on my institution's clusters as well, but I'm getting different errors. I would be grateful if you could share the script you used here. I'm specifically wondering where you added the "--writable-tmpfs" command. Thanks!

Hi @cihaterdogan

Create a new config file, let's call it singularity.config, containing the following:

profiles {
    singularity {
        singularity {
            enabled = true
            autoMounts = true
            runOptions = '--writable-tmpfs'
        }
    }
}

Then refer to it with -c like nextflow run .... -c singularity.config
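Putting that together with the invocation used earlier in this thread, the run would look something like the following (all paths are placeholders):

nextflow run epi2me-labs/wf-single-cell \
    -profile singularity \
    -c singularity.config \
    --max_threads 32 \
    --fastq /path/to/sample.fastq.gz \
    --kit_name 3prime \
    --kit_version v3 \
    --ref_genome_dir /path/to/refdata-gex-GRCh38-2020-A \
    --out_dir /path/to/output/dir/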

nrhorner commented 8 months ago

@lscdanson Did you manage to fix this issue with any of the suggestions in this thread?

cihaterdogan commented 8 months ago

Hi @nrhorner ,

I'm writing this here, maybe it will help someone. About a month ago, I was able to run the pipeline as a job (with Slurm) on my institution's clusters without any problems with the following parameters.

module load singularity
module load java

### nextflow is installed to  "/path/for/nextflow"
### wf-single-cell-demo data is downloaded to the same folder "/path/for/nextflow"

mkdir -p /path/for/nextflow/singularity

export NXF_SINGULARITY_CACHEDIR=/path/for/nextflow/singularity
export SINGULARITY_TMPDIR=/path/for/nextflow/singularity
export NXF_HOME=/path/for/nextflow

OUTPUT=wf-single-cell-demo_results

./nextflow run epi2me-labs/wf-single-cell \
    -w ${OUTPUT}/workspace \
    -profile singularity \
    --max_threads 16 \
    --fastq wf-single-cell-demo/chr17.fq.gz \
    --kit_name 3prime \
    --kit_version v3 \
    --expected_cells 100 \
    --ref_genome_dir wf-single-cell-demo/ \
    --out_dir ${OUTPUT} \
    --merge_bam

However, when I try to analyze the demo data with the same parameters, I get the following error now.

Command error:
  WARNING: passwd file doesn't exist in container, not updating
  WARNING: group file doesn't exist in container, not updating
  .command.sh: line 3: fastcat: command not found
  .command.sh: line 3: bgzip: command not found
  .command.sh: line 3: bgzip: command not found

I can't figure out if this is due to an update to the wf-single-cell tool or the HPC system. I would be grateful for any advice.

nrhorner commented 6 months ago

Hi @cihaterdogan thanks for sharing your commands. Your latest error, if you're still experiencing it, might be fixed by cloning the repository and doing nextflow run absolute/path/to/downloaded_repo
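A minimal sketch of that suggestion, assuming git is available on the cluster and using a placeholder clone path:

git clone https://github.com/epi2me-labs/wf-single-cell.git /absolute/path/to/downloaded_repo
nextflow run /absolute/path/to/downloaded_repo \
    -profile singularity \
    {the rest of your args}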

If you're still having issues, please open another ticket

nrhorner commented 6 months ago

Closing this ticket as the original issue was fixed.