PGScatalog / pgsc_calc

The Polygenic Score Catalog Calculator is a nextflow pipeline for polygenic score calculation
https://pgsc-calc.readthedocs.io/en/latest/
Apache License 2.0

Pipeline doesn't find compressed PLINK2 data #93

Closed bnwolford closed 1 year ago

bnwolford commented 1 year ago

Description of the bug

My .pvar file is compressed with Zstandard, so it has the suffix .zst, and the pipeline seems unable to find it. Do I need to run the pipeline on uncompressed PLINK2 files?

Command used and terminal output

nextflow run pgscatalog/pgscalc     -profile singularity     --input samplesheet.csv     --pgs_id PGS001229     --trait_efo EFO_0001645     --pgp_id PGP000001     --target_build GRCh37
N E X T F L O W  ~  version 22.10.6
Launching `https://github.com/pgscatalog/pgscalc` [kickass_perlman] DSL2 - revision: c42dd1dee7 [main]

------------------------------------------------------
  pgscatalog/pgsc_calc v1.3.2
------------------------------------------------------
Core Nextflow options
  revision       : main
  runName        : kickass_perlman
  containerEngine: singularity
  launchDir      : /mnt/scratch/brooke/pgs
  workDir        : /mnt/scratch/brooke/pgs/work
  projectDir     : /home/bwolford/.nextflow/assets/pgscatalog/pgscalc
  userName       : bwolford
  profile        : singularity
  configFiles    : /home/bwolford/.nextflow/assets/pgscatalog/pgscalc/nextflow.config

Input/output options
  input          : samplesheet.csv
  pgs_id         : PGS001229
  pgp_id         : PGP000001
  trait_efo      : EFO_0001645
  target_build   : GRCh37
  genotypes_cache: null

Institutional config options
  hostnames      : [:]

Max job request options
  max_cpus       : 2
  max_memory     : 16.GB

!! Only displaying parameters that differ from the pipeline defaults !!
------------------------------------------------------
If you use pgscatalog/pgsc_calc for your analysis please cite:

* The Polygenic Score Catalog
  https://doi.org/10.1038/s41588-021-00783-5

* The nf-core framework
  https://doi.org/10.1038/s41587-020-0439-x

* Software dependencies
  https://github.com/pgscatalog/pgsc_calc/blob/master/CITATIONS.md
------------------------------------------------------
executor >  local (2)
[8a/9c87b4] process > PGSCATALOG_PGSCALC:PGSCALC:DOWNLOAD_SCOREFILES ([pgs_id:PGS0012... [  0%] 0 of 1
[21/5be079] process > PGSCATALOG_PGSCALC:PGSCALC:INPUT_CHECK:SAMPLESHEET_JSON (sample... [100%] 1 of 1 ✔
[-        ] process > PGSCATALOG_PGSCALC:PGSCALC:INPUT_CHECK:COMBINE_SCOREFILES          -
[-        ] process > PGSCATALOG_PGSCALC:PGSCALC:MAKE_COMPATIBLE:PLINK2_RELABELBIM       -
[-        ] process > PGSCATALOG_PGSCALC:PGSCALC:MAKE_COMPATIBLE:PLINK2_RELABELPVAR      -
[-        ] process > PGSCATALOG_PGSCALC:PGSCALC:MAKE_COMPATIBLE:PLINK2_VCF              -
[-        ] process > PGSCATALOG_PGSCALC:PGSCALC:MAKE_COMPATIBLE:MATCH_VARIANTS          -
[-        ] process > PGSCATALOG_PGSCALC:PGSCALC:MAKE_COMPATIBLE:MATCH_COMBINE           -
[-        ] process > PGSCATALOG_PGSCALC:PGSCALC:APPLY_SCORE:PLINK2_SCORE                -
[-        ] process > PGSCATALOG_PGSCALC:PGSCALC:APPLY_SCORE:SCORE_AGGREGATE             -
[-        ] process > PGSCATALOG_PGSCALC:PGSCALC:APPLY_SCORE:SCORE_REPORT                -
[-        ] process > PGSCATALOG_PGSCALC:PGSCALC:DUMPSOFTWAREVERSIONS                    -
No such file: /home/bwolford/archive/pgen/h234_hrc_chr23_chunk1.pvar

 -- Check script '/home/bwolford/.nextflow/assets/pgscatalog/pgscalc/./workflows/../subworkflows/local/input_check.nf' at line: 128 or see '.nextflow.log' file for more details
ERROR: No scores calculated!

WARN: Killing running tasks (1)

Corruption:
    Descriptor does not contain a meta-nextfile entry
    Descriptor does not contain a meta-lognumber entry
    Descriptor does not contain a last-sequence-number entry

Relevant files

No response

System information

No response

nebfield commented 1 year ago

v1.3.2 needs the extra parameter --vzs if you're using compressed variant information files.

We should auto-detect compressed data though!
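
Until the pipeline detects compression itself, a user-side check along these lines might help. This is only a minimal sketch, not how pgsc_calc does or will do it: `vzs_flag` is a hypothetical helper name, and it assumes the samplesheet points at a PLINK2 fileset prefix whose variant file is either `<prefix>.pvar` or `<prefix>.pvar.zst`.

```shell
# Hypothetical helper (not part of pgsc_calc): print "--vzs" when the
# variant file for a PLINK2 fileset prefix exists only in compressed form.
# Assumption: the files are named <prefix>.pvar or <prefix>.pvar.zst.
vzs_flag() {
    prefix="$1"
    if [ ! -f "${prefix}.pvar" ] && [ -f "${prefix}.pvar.zst" ]; then
        printf '%s\n' "--vzs"
    fi
}

# Example, using the prefix from the error message above; the output can be
# appended to the original `nextflow run` command:
#   nextflow run pgscatalog/pgscalc -profile singularity --input samplesheet.csv \
#       --pgs_id PGS001229 --target_build GRCh37 \
#       $(vzs_flag /home/bwolford/archive/pgen/h234_hrc_chr23_chunk1)
```

The helper prints nothing when an uncompressed .pvar is present or when neither file exists, so the command is left unchanged in those cases.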