PGScatalog / pgsc_calc

The Polygenic Score Catalog Calculator is a nextflow pipeline for polygenic score calculation
https://pgsc-calc.readthedocs.io/en/latest/
Apache License 2.0

The error of MATCH_VARIANTS in profile test #383

Open yambirj opened 1 week ago

yambirj commented 1 week ago

Description of the bug

Hello! Firstly, thank you for the tool. I used it a long time ago and have now tried to install the newest version (v2.0.0-beta.3) using

nextflow run pgscatalog/pgsc_calc -profile test,singularity -r v2.0.0-beta.3

but ran into an error. I tried removing pgsc_calc from the Nextflow assets and then running nextflow run pgscatalog/pgsc_calc -profile test,singularity again, but that did not work either.

Interestingly, this error appears in every version above v2.0.0-alpha.5 (which was also the only version for which the test profile worked without errors). The versions below v2.0.0-alpha.5 seem to have an error with SCORE_REPORT, but I am more interested in the newest version.

Command used and terminal output

$ nextflow run pgscatalog/pgsc_calc -profile test,singularity

 N E X T F L O W   ~  version 24.04.4

Pulling pgscatalog/pgsc_calc ...
 downloaded from https://github.com/PGScatalog/pgsc_calc.git
Launching `https://github.com/pgscatalog/pgsc_calc` [high_newton] DSL2 - revision: 9bd9c431e7 [main]

INFO: The test profile is used to install the workflow and verify the software is working correctly on your system.
INFO: Test input data and results are are only useful as examples of outputs, and are not biologically meaningful.

------------------------------------------------------
  pgscatalog/pgsc_calc v2.0.0-beta.3-g9bd9c43
------------------------------------------------------
Core Nextflow options
  revision       : main
  runName        : high_newton
  containerEngine: singularity
  launchDir      : /mnt/beegfs2/home/ivanna01/Apps/failed_pgsc
  workDir        : /mnt/beegfs2/home/ivanna01/Apps/failed_pgsc/work
  projectDir     : /home_beegfs/ivanna01/.nextflow/assets/pgscatalog/pgsc_calc
  userName       : ivanna01
  profile        : test,singularity
  configFiles    :

!! Only displaying parameters that differ from the pipeline defaults !!
------------------------------------------------------
If you use pgscatalog/pgsc_calc for your analysis please cite:

* The Polygenic Score Catalog
  https://doi.org/10.1101/2024.05.29.24307783
  https://doi.org/10.1038/s41588-021-00783-5

* The nf-core framework
  https://doi.org/10.1038/s41587-020-0439-x

* Software dependencies
  https://github.com/pgscatalog/pgsc_calc/blob/main/CITATIONS.md

executor >  local (3)
[af/1e1b79] PGSCATALOG_PGSCCALC:PGSCCALC:INPUT_CHECK:COMBINE_SCOREFILES (1)                        [100%] 1 of 1 ✔
[-        ] PGSCATALOG_PGSCCALC:PGSCCALC:MAKE_COMPATIBLE:PLINK2_RELABELBIM                         -
[1c/dbeabd] PGSCATALOG_PGSCCALC:PGSCCALC:MAKE_COMPATIBLE:PLINK2_RELABELPVAR (cineca chromosome 22) [100%] 1 of 1 ✔
[-        ] PGSCATALOG_PGSCCALC:PGSCCALC:MAKE_COMPATIBLE:PLINK2_VCF                                -
[17/6a3a10] PGSCATALOG_PGSCCALC:PGSCCALC:MATCH:MATCH_VARIANTS (cineca chromosome 22)               [100%] 1 of 1, failed: 1 ✘
[-        ] PGSCATALOG_PGSCCALC:PGSCCALC:MATCH:MATCH_COMBINE                                       -
[-        ] PGSCATALOG_PGSCCALC:PGSCCALC:APPLY_SCORE:PLINK2_SCORE                                  -
[-        ] PGSCATALOG_PGSCCALC:PGSCCALC:APPLY_SCORE:SCORE_AGGREGATE                               -
[-        ] PGSCATALOG_PGSCCALC:PGSCCALC:REPORT:SCORE_REPORT                                       -
[-        ] PGSCATALOG_PGSCCALC:PGSCCALC:DUMPSOFTWAREVERSIONS                                      -
Pulling Singularity image oras://ghcr.io/pgscatalog/pygscatalog:pgscatalog-utils-1.3.1-singularity [cache /mnt/beegfs2/home/ivanna01/Apps/failed_pgsc/work/singularity/ghcr.io-pgscatalog-pygscatalog-pgscatalog-utils-1.3.1-singularity.img]
Pulling Singularity image oras://ghcr.io/pgscatalog/plink2:2.00a5.10-singularity [cache /mnt/beegfs2/home/ivanna01/Apps/failed_pgsc/work/singularity/ghcr.io-pgscatalog-plink2-2.00a5.10-singularity.img]
Execution cancelled -- Finishing pending tasks before exit
-[pgscatalog/pgsc_calc] Pipeline completed with errors-
WARN: Singularity cache directory has not been defined -- Remote image will be stored in the path: /mnt/beegfs2/home/ivanna01/Apps/failed_pgsc/work/singularity -- Use the environment variable NXF_SINGULARITY_CACHEDIR to specify a different location
ERROR ~ Error executing process > 'PGSCATALOG_PGSCCALC:PGSCCALC:MATCH:MATCH_VARIANTS (cineca chromosome 22)'

Caused by:
  Process `PGSCATALOG_PGSCCALC:PGSCCALC:MATCH:MATCH_VARIANTS (cineca chromosome 22)` terminated with an error exit status (1)

Command executed:

  export POLARS_MAX_THREADS=2

  pgscatalog-match                  --dataset cineca         --scorefile scorefiles.txt.gz         --target GRCh37_cineca_22.pvar.zst         --only_match         --chrom 22                           --outdir $PWD         -v

  cat <<-END_VERSIONS > versions.yml
  MATCH_VARIANTS:
      pgscatalog.match: $(echo $(python -c 'import pgscatalog.match; print(pgscatalog.match.__version__)'))
  END_VERSIONS

Command exit status:
  1

Command output:
  (empty)

Command error:
  pgscatalog.match.cli.match_cli: 2024-10-14 10:52:52 WARNING  No output format specified, writing to combined scoring file
  pgscatalog.match.cli.match_cli: 2024-10-14 10:52:52 DEBUG    Verbose logging enabled
  pgscatalog.match.cli.match_cli: 2024-10-14 10:52:52 INFO     --cleanup set (default), temporary files will be deleted
  pgscatalog.match.lib.scoringfileframe: 2024-10-14 10:52:52 DEBUG    Converting ScoringFileFrame(NormalisedScoringFile('scorefiles.txt.gz')) to feather format
  pgscatalog.match.lib.scoringfileframe: 2024-10-14 10:52:52 DEBUG    ScoringFileFrame(NormalisedScoringFile('scorefiles.txt.gz')) feather conversion complete
  pgscatalog.match.lib._match.preprocess: 2024-10-14 10:52:52 DEBUG    Complementing column effect_allele
  pgscatalog.match.lib._match.preprocess: 2024-10-14 10:52:52 DEBUG    Complementing column other_allele
  pgscatalog.match.lib.scoringfileframe: 2024-10-14 10:52:52 DEBUG    Filtering scoring file to chromosome 22
  pgscatalog.match.lib.variantframe: 2024-10-14 10:52:52 DEBUG    Converting VariantFrame(path='GRCh37_cineca_22.pvar.zst', dataset='cineca', chrom='22', cleanup=True, tmpdir=PosixPath('tmp')) to feather format
  pgscatalog.match.lib.variantframe: 2024-10-14 10:52:52 DEBUG    VariantFrame(path='GRCh37_cineca_22.pvar.zst', dataset='cineca', chrom='22', cleanup=True, tmpdir=PosixPath('tmp')) feather conversion complete
  pgscatalog.match.lib._match.preprocess: 2024-10-14 10:52:52 DEBUG    Filtering target to include chromosomes 1 - 22, X, Y
  pgscatalog.match.lib._match.preprocess: 2024-10-14 10:52:52 DEBUG    No multiallelic variants detected
  pgscatalog.match.lib._match.match: 2024-10-14 10:52:52 DEBUG    Getting matches for scores with effect allele and other allele
  pgscatalog.match.lib._match.match: 2024-10-14 10:52:52 DEBUG    Matching strategy: refalt
  pgscatalog.match.lib._match.match: 2024-10-14 10:52:52 DEBUG    Matching strategy: altref
  pgscatalog.match.lib._match.match: 2024-10-14 10:52:52 DEBUG    Matching strategy: refalt_flip
  pgscatalog.match.lib._match.match: 2024-10-14 10:52:52 DEBUG    Matching strategy: altref_flip
  pgscatalog.match.lib._match.match: 2024-10-14 10:52:52 DEBUG    Getting matches for scores with effect allele only
  pgscatalog.match.lib._match.match: 2024-10-14 10:52:52 DEBUG    Matching strategy: no_oa_ref
  pgscatalog.match.lib._match.match: 2024-10-14 10:52:52 DEBUG    Matching strategy: no_oa_alt
  pgscatalog.match.lib._match.match: 2024-10-14 10:52:52 DEBUG    Matching strategy: no_oa_ref_flip
  pgscatalog.match.lib._match.match: 2024-10-14 10:52:52 DEBUG    Matching strategy: no_oa_alt_flip
  pgscatalog.match.cli.match_cli: 2024-10-14 10:52:52 INFO     Renaming matchtmp/tmpvuo7hmjh to 0.ipc.zst
  Traceback (most recent call last):
    File "/app/pgscatalog.utils/.venv/bin/pgscatalog-match", line 8, in <module>
      sys.exit(run_match())
               ^^^^^^^^^^^
    File "/app/pgscatalog.utils/.venv/lib/python3.11/site-packages/pgscatalog/match/cli/match_cli.py", line 107, in run_match
      ipc_path.rename(out_path)
    File "/usr/local/lib/python3.11/pathlib.py", line 1175, in rename
      os.rename(self, target)
  OSError: [Errno 16] Device or resource busy: 'matchtmp/tmpvuo7hmjh' -> '0.ipc.zst'

Work dir:
  /mnt/beegfs2/home/ivanna01/Apps/failed_pgsc/work/17/6a3a10e6a5edb47a875aadbdf4a768

Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`

 -- Check '.nextflow.log' file for details

Relevant files

nextflow.log

System information

I am working on an HPC with Nextflow 24.04.4 and Singularity 3.11.4.

nebfield commented 4 days ago

I think this is a problem with your local system rather than a bug with the workflow, because:

  OSError: [Errno 16] Device or resource busy: 'matchtmp/tmpvuo7hmjh' -> '0.ipc.zst'

From the Python docs:

This exception is raised when a system function returns a system-related error, including I/O failures such as “file not found” or “disk full” (not for illegal argument types or other incidental errors).
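To make the error code concrete, here is a minimal sketch of how Python exposes the errno carried by an OSError. The helper name `describe_oserror` is made up for illustration and is not part of pgscatalog-match; the symbolic name and message text come from the operating system, so the exact wording may differ between platforms.

```python
import errno
import os


def describe_oserror(exc: OSError) -> str:
    """Return a label like 'EBUSY (<number>): <OS message>' for an OSError."""
    if exc.errno is None:
        # OSError raised without an errno (e.g. constructed with a single argument)
        return "no errno attached"
    # errno.errorcode maps numeric codes to their symbolic names
    name = errno.errorcode.get(exc.errno, "UNKNOWN")
    return f"{name} ({exc.errno}): {os.strerror(exc.errno)}"


# Build an OSError with the same kind of error as the traceback (illustrative only)
example = OSError(errno.EBUSY, os.strerror(errno.EBUSY))
print(describe_oserror(example))
```

The point is that `[Errno 16]` is reported by the operating system when `os.rename` is called, which is why the storage layer is the first thing to check.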

Perhaps there was a temporary problem with the storage system. Does the problem happen consistently if you retry? If you're running on an HPC, your system administrators might be able to provide more support.
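One way to check whether the storage itself is the culprit is to reproduce the same pattern outside the pipeline: write a file into a temporary subdirectory and rename it into the parent directory, as the traceback shows the match step doing. The sketch below is only an approximation of that pattern, not the code pgscatalog-match runs; the function names `rename_with_retry` and `check_rename` and the retry settings are assumptions chosen for illustration.

```python
import errno
import tempfile
import time
from pathlib import Path


def rename_with_retry(src: Path, dst: Path, attempts: int = 3, delay: float = 0.5) -> int:
    """Rename src to dst, retrying only on EBUSY. Return the attempts used."""
    for attempt in range(1, attempts + 1):
        try:
            src.rename(dst)
            return attempt
        except OSError as exc:
            # Retry only the "device or resource busy" case; re-raise anything else,
            # and re-raise EBUSY too once the attempts are exhausted
            if exc.errno != errno.EBUSY or attempt == attempts:
                raise
            time.sleep(delay)
    raise ValueError("attempts must be at least 1")


def check_rename(workdir: Path) -> int:
    """Write a probe file in a temporary subdirectory of workdir and move it up."""
    with tempfile.TemporaryDirectory(dir=workdir) as tmp:
        src = Path(tmp) / "probe.tmp"
        src.write_bytes(b"probe")
        dst = workdir / "probe.out"
        used = rename_with_retry(src, dst)
    # Clean up the probe file so the directory is left as it was found
    dst.unlink()
    return used
```

Calling `check_rename` on a directory on the same file system as the Nextflow work directory returns the number of attempts the rename needed. If it raises `OSError` with errno 16 there too, the problem can be reproduced without the workflow, which would be useful information for the system administrators.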