PGScatalog / pgsc_calc

The Polygenic Score Catalog Calculator is a nextflow pipeline for polygenic score calculation
https://pgsc-calc.readthedocs.io/en/latest/
Apache License 2.0

matched_variants.txt.gz empty when pgsc_calc is run with many scores, resulting in KeyError: 'CHR:POS:A0:A1' and pgscatalog-intersect failure. #361

Open Fiwx opened 1 month ago

Fiwx commented 1 month ago

Description of the bug

reference_variants.txt.gz is empty, containing only the header. This problem did not occur when running ~30 scores, but it does occur when running ~100 scores. It causes the pgscatalog-intersect step to crash.

Command used and terminal output

Aug-13 16:22:29.056 [main] DEBUG nextflow.script.ScriptRunner - > Awaiting termination
Aug-13 16:22:29.056 [main] DEBUG nextflow.Session - Session await
Aug-13 16:22:29.738 [Actor Thread 45] DEBUG nextflow.container.SingularityCache - Singularity found local store for image=oras://ghcr.io/pgscatalog/zstd:2-beta-singularity; path=/home/user/singularity_containers/ghcr.io-pgscatalog-zstd-2-beta-singularity.img
Aug-13 16:22:29.740 [Actor Thread 23] INFO  nextflow.container.SingularityCache - Pulling Singularity image oras://ghcr.io/pgscatalog/pygscatalog:pgscatalog-utils-1.3.1-singularity [cache /home/user/singularity_containers/ghcr.io-pgscatalog-pygscatalog-pgscatalog-utils-1.3.1-singularity.img]
Aug-13 16:22:29.753 [Actor Thread 45] DEBUG nextflow.container.SingularityCache - Singularity found local store for image=oras://ghcr.io/pgscatalog/plink2:2.00a5.10-singularity; path=/home/user/singularity_containers/ghcr.io-pgscatalog-plink2-2.00a5.10-singularity.img
Aug-13 16:22:30.248 [Task submitter] DEBUG n.executor.local.LocalTaskHandler - Launch cmd line: /bin/bash -ue .command.run
Aug-13 16:22:30.263 [Task submitter] INFO  nextflow.Session - [33/dec25d] Submitted process > PGSCATALOG_PGSCCALC:PGSCCALC:MAKE_COMPATIBLE:PLINK2_VCF (file127 chromosome ALL)
Aug-13 16:22:30.288 [Task submitter] DEBUG n.executor.local.LocalTaskHandler - Launch cmd line: /bin/bash -ue .command.run
Aug-13 16:22:30.295 [Task submitter] INFO  nextflow.Session - [66/1acfd2] Submitted process > PGSCATALOG_PGSCCALC:PGSCCALC:ANCESTRY_PROJECT:EXTRACT_DATABASE(1)
Aug-13 16:22:44.159 [Actor Thread 23] DEBUG nextflow.container.SingularityCache - Singularity pull complete image=oras://ghcr.io/pgscatalog/pygscatalog:pgscatalog-utils-1.3.1-singularity path=/home/user/singularity_containers/ghcr.io-pgscatalog-pygscatalog-pgscatalog-utils-1.3.1-singularity.img
Aug-13 16:23:52.176 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 1; name: PGSCATALOG_PGSCCALC:PGSCCALC:ANCESTRY_PROJECT:EXTRACT_DATABASE (1); status: COMPLETED; exit: 0; error: -; workDir: /home/user/runner/core/test/file_127.ancestry/work/66/1acfd293fb7055df2f78d51cb2c65c]
Aug-13 16:23:52.180 [Task monitor] DEBUG nextflow.util.ThreadPoolBuilder - Creating thread pool 'TaskFinalizer' minSize=10; maxSize=12; workQueue=LinkedBlockingQueue[10000]; allowCoreThreadTimeout=false
Aug-13 16:23:52.668 [TaskFinalizer-1] DEBUG nextflow.processor.TaskProcessor - Process PGSCATALOG_PGSCCALC:PGSCCALC:ANCESTRY_PROJECT:EXTRACT_DATABASE > Skipping output binding because one or more optional files are missing: fileoutparam<0:3>
Aug-13 16:23:52.669 [TaskFinalizer-1] DEBUG nextflow.processor.TaskProcessor - Process PGSCATALOG_PGSCCALC:PGSCCALC:ANCESTRY_PROJECT:EXTRACT_DATABASE > Skipping output binding because one or more optional files are missing: fileoutparam<1:1>
Aug-13 16:24:19.838 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 3; name: PGSCATALOG_PGSCCALC:PGSCCALC:MAKE_COMPATIBLE:PLINK2_VCF (file127 chromosome ALL); status: COMPLETED; exit: 0; error: -; workDir: /home/user/runner/core/test/file_127.ancestry/work/33/dec25de8d4611e2a7081fff57c3fe6]
Aug-13 16:24:19.921 [Task submitter] DEBUG n.executor.local.LocalTaskHandler - Launch cmd line: /bin/bash -ue .command.run
Aug-13 16:24:19.929 [Task submitter] INFO  nextflow.Session - [45/5a701e] Submitted process > PGSCATALOG_PGSCCALC:PGSCCALC:INPUT_CHECK:COMBINE_SCOREFILES (1)
Aug-13 16:24:20.184 [Task submitter] DEBUG n.executor.local.LocalTaskHandler - Launch cmd line: /bin/bash -ue .command.run
Aug-13 16:24:20.199 [Task submitter] INFO  nextflow.Session - [a1/025735] Submitted process > PGSCATALOG_PGSCCALC:PGSCCALC:ANCESTRY_PROJECT:INTERSECT_VARIANTS (file127 chromosome ALL)
Aug-13 16:27:26.583 [Task monitor] DEBUG n.processor.TaskPollingMonitor - !! executor local > tasks to be completed: 2 -- submitted tasks are shown below
~> TaskHandler[id: 2; name: PGSCATALOG_PGSCCALC:PGSCCALC:INPUT_CHECK:COMBINE_SCOREFILES (1); status: RUNNING; exit: -; error: -; workDir: /home/user/runner/core/test/file_127.ancestry/work/45/5a701ed3fa296084261a1f939cc266]
~> TaskHandler[id: 4; name: PGSCATALOG_PGSCCALC:PGSCCALC:ANCESTRY_PROJECT:INTERSECT_VARIANTS (file127 chromosome ALL);status: RUNNING; exit: -; error: -; workDir: /home/user/runner/core/test/file_127.ancestry/work/a1/0257359fca6824f76e34134b849344]
Aug-13 16:32:26.659 [Task monitor] DEBUG n.processor.TaskPollingMonitor - !! executor local > tasks to be completed: 2 -- submitted tasks are shown below
~> TaskHandler[id: 2; name: PGSCATALOG_PGSCCALC:PGSCCALC:INPUT_CHECK:COMBINE_SCOREFILES (1); status: RUNNING; exit: -; error: -; workDir: /home/user/runner/core/test/file_127.ancestry/work/45/5a701ed3fa296084261a1f939cc266]
~> TaskHandler[id: 4; name: PGSCATALOG_PGSCCALC:PGSCCALC:ANCESTRY_PROJECT:INTERSECT_VARIANTS (file127 chromosome ALL);status: RUNNING; exit: -; error: -; workDir: /home/user/runner/core/test/file_127.ancestry/work/a1/0257359fca6824f76e34134b849344]
Aug-13 16:37:26.755 [Task monitor] DEBUG n.processor.TaskPollingMonitor - !! executor local > tasks to be completed: 2 -- submitted tasks are shown below
~> TaskHandler[id: 2; name: PGSCATALOG_PGSCCALC:PGSCCALC:INPUT_CHECK:COMBINE_SCOREFILES (1); status: RUNNING; exit: -; error: -; workDir: /home/user/runner/core/test/file_127.ancestry/work/45/5a701ed3fa296084261a1f939cc266]
~> TaskHandler[id: 4; name: PGSCATALOG_PGSCCALC:PGSCCALC:ANCESTRY_PROJECT:INTERSECT_VARIANTS (file127 chromosome ALL);status: RUNNING; exit: -; error: -; workDir: /home/user/runner/core/test/file_127.ancestry/work/a1/0257359fca6824f76e34134b849344]
Aug-13 16:42:26.779 [Task monitor] DEBUG n.processor.TaskPollingMonitor - !! executor local > tasks to be completed: 2 -- submitted tasks are shown below
~> TaskHandler[id: 2; name: PGSCATALOG_PGSCCALC:PGSCCALC:INPUT_CHECK:COMBINE_SCOREFILES (1); status: RUNNING; exit: -; error: -; workDir: /home/user/runner/core/test/file_127.ancestry/work/45/5a701ed3fa296084261a1f939cc266]
~> TaskHandler[id: 4; name: PGSCATALOG_PGSCCALC:PGSCCALC:ANCESTRY_PROJECT:INTERSECT_VARIANTS (file127 chromosome ALL);status: RUNNING; exit: -; error: -; workDir: /home/user/runner/core/test/file_127.ancestry/work/a1/0257359fca6824f76e34134b849344]
Aug-13 16:44:17.528 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 4; name: PGSCATALOG_PGSCCALC:PGSCCALC:ANCESTRY_PROJECT:INTERSECT_VARIANTS (file127 chromosome ALL); status: COMPLETED; exit: 1; error: -; workDir: /home/user/runner/core/test/file_127.ancestry/work/a1/0257359fca6824f76e34134b849344]
Aug-13 16:44:17.539 [TaskFinalizer-3] DEBUG nextflow.processor.TaskProcessor - Handling unexpected condition for
  task: name=PGSCATALOG_PGSCCALC:PGSCCALC:ANCESTRY_PROJECT:INTERSECT_VARIANTS (file127 chromosome ALL); work-dir=/home/user/runner/core/test/file_127.ancestry/work/a1/0257359fca6824f76e34134b849344
  error [nextflow.exception.ProcessFailedException]: Process `PGSCATALOG_PGSCCALC:PGSCCALC:ANCESTRY_PROJECT:INTERSECT_VARIANTS (file127 chromosome ALL)` terminated with an error exit status (1)
Aug-13 16:44:17.620 [TaskFinalizer-3] ERROR nextflow.processor.TaskProcessor - Error executing process > 'PGSCATALOG_PGSCCALC:PGSCCALC:ANCESTRY_PROJECT:INTERSECT_VARIANTS (file127 chromosome ALL)'

Caused by:
  Process `PGSCATALOG_PGSCCALC:PGSCCALC:ANCESTRY_PROJECT:INTERSECT_VARIANTS (file127 chromosome ALL)` terminated with an error exit status (1)

Command executed:

  pgscatalog-intersect --ref GRCh37_1000G_ALL.pvar.zst         --target GRCh37_file127_ALL.pvar.zst         --chrom ALL         --maf_target 0.0         --geno_miss 0.1         --outdir .         -v

  n_matched=$(sed -n '3p' intersect_counts_ALL.txt)

  if [ $n_matched == "0" ]
  then
      echo "ERROR: No variants in intersection"
      exit 1
  else
      mv matched_variants.txt.gz file127_ALL_matched.txt.gz
  fi

  cat <<-END_VERSIONS > versions.yml
  INTERSECT_VARIANTS:
      pgscatalog.match: $(echo $(python -c 'import pgscatalog.match; print(pgscatalog.match.__version__)'))
  END_VERSIONS

Command exit status:
  1

Command output:
  (empty)

Command error:
  pgscatalog.match.cli.intersect_cli: 2024-08-13 16:40:38 INFO     Processed 69000000 REFERENCE variants
  pgscatalog.match.cli.intersect_cli: 2024-08-13 16:40:45 INFO     Processed 69500000 REFERENCE variants
  pgscatalog.match.cli.intersect_cli: 2024-08-13 16:40:52 INFO     Processed 70000000 REFERENCE variants
  pgscatalog.match.cli.intersect_cli: 2024-08-13 16:40:59 INFO     Processed 70500000 REFERENCE variants
  pgscatalog.match.cli.intersect_cli: 2024-08-13 16:41:07 INFO     Processed 71000000 REFERENCE variants
  pgscatalog.match.cli.intersect_cli: 2024-08-13 16:41:13 INFO     Processed 71500000 REFERENCE variants
  pgscatalog.match.cli.intersect_cli: 2024-08-13 16:41:20 INFO     Processed 72000000 REFERENCE variants
  pgscatalog.match.cli.intersect_cli: 2024-08-13 16:41:27 INFO     Processed 72500000 REFERENCE variants
  pgscatalog.match.cli.intersect_cli: 2024-08-13 16:41:33 INFO     Processed 73000000 REFERENCE variants
  pgscatalog.match.cli.intersect_cli: 2024-08-13 16:41:40 INFO     Processed 73500000 REFERENCE variants
  pgscatalog.match.cli.intersect_cli: 2024-08-13 16:41:47 INFO     Processed 74000000 REFERENCE variants
  pgscatalog.match.cli.intersect_cli: 2024-08-13 16:41:53 INFO     Processed 74500000 REFERENCE variants
  pgscatalog.match.cli.intersect_cli: 2024-08-13 16:42:00 INFO     Processed 75000000 REFERENCE variants
  pgscatalog.match.cli.intersect_cli: 2024-08-13 16:42:07 INFO     Processed 75500000 REFERENCE variants
  pgscatalog.match.cli.intersect_cli: 2024-08-13 16:42:15 INFO     Processed 76000000 REFERENCE variants
  pgscatalog.match.cli.intersect_cli: 2024-08-13 16:42:21 INFO     Processed 76500000 REFERENCE variants
  pgscatalog.match.cli.intersect_cli: 2024-08-13 16:42:28 INFO     Processed 77000000 REFERENCE variants
  pgscatalog.match.cli.intersect_cli: 2024-08-13 16:42:35 INFO     Processed 77500000 REFERENCE variants
  pgscatalog.match.cli.intersect_cli: 2024-08-13 16:42:42 INFO     Processed 78000000 REFERENCE variants
  pgscatalog.match.cli.intersect_cli: 2024-08-13 16:42:49 INFO     Processed 78500000 REFERENCE variants
  pgscatalog.match.cli.intersect_cli: 2024-08-13 16:42:55 INFO     Processed 79000000 REFERENCE variants
  pgscatalog.match.cli.intersect_cli: 2024-08-13 16:43:03 INFO     Processed 79500000 REFERENCE variants
  pgscatalog.match.cli.intersect_cli: 2024-08-13 16:43:09 INFO     Processed 80000000 REFERENCE variants
  pgscatalog.match.cli.intersect_cli: 2024-08-13 16:43:17 INFO     Processed 80500000 REFERENCE variants
  pgscatalog.match.cli.intersect_cli: 2024-08-13 16:43:24 INFO     Processed 81000000 REFERENCE variants
  pgscatalog.match.cli.intersect_cli: 2024-08-13 16:43:31 INFO     Processed 81500000 REFERENCE variants
  pgscatalog.match.cli.intersect_cli: 2024-08-13 16:43:38 INFO     Processed 82000000 REFERENCE variants
  pgscatalog.match.cli.intersect_cli: 2024-08-13 16:43:45 INFO     Processed 82500000 REFERENCE variants
  pgscatalog.match.cli.intersect_cli: 2024-08-13 16:43:52 INFO     Processed 83000000 REFERENCE variants
  pgscatalog.match.cli.intersect_cli: 2024-08-13 16:43:58 INFO     Processed 83500000 REFERENCE variants
  pgscatalog.match.cli.intersect_cli: 2024-08-13 16:44:05 INFO     Processed 84000000 REFERENCE variants
  pgscatalog.match.cli.intersect_cli: 2024-08-13 16:44:13 INFO     Processed 84500000 REFERENCE variants
  pgscatalog.match.cli.intersect_cli: 2024-08-13 16:44:17 INFO     Processed 84805772 REFERENCE variants
  pgscatalog.match.cli.intersect_cli: 2024-08-13 16:44:17 INFO     Outputting REFERNCE variants -> reference_variants.txt.gz
  Traceback (most recent call last):
    File "/app/pgscatalog.utils/.venv/bin/pgscatalog-intersect", line 8, in <module>
      sys.exit(run_intersect())
               ^^^^^^^^^^^^^^^
    File "/app/pgscatalog.utils/.venv/lib/python3.11/site-packages/pgscatalog/match/cli/intersect_cli.py", line 85, in run_intersect
      for v in heapq.merge(
    File "/usr/local/lib/python3.11/heapq.py", line 376, in merge
      h_append([key(value), order * direction, value, next])
                ^^^^^^^^^^
    File "/app/pgscatalog.utils/.venv/lib/python3.11/site-packages/pgscatalog/match/cli/intersect_cli.py", line 87, in <lambda>
      key=lambda v: (v["CHR:POS:A0:A1"], v["ID_REF"], v["REF_REF"]),
                     ~^^^^^^^^^^^^^^^^^
  KeyError: 'CHR:POS:A0:A1'
  cp: '.command.out' and '.command.out' are the same file
  cp: '.command.err' and '.command.err' are the same file
  cp: '.command.trace' and '.command.trace' are the same file

Work dir:
  /home/user/runner/core/test/file_127.ancestry/work/a1/0257359fca6824f76e34134b849344

Tip: when you have fixed the problem you can continue the execution adding the option `-resume` to the run command line
Aug-13 16:44:17.692 [TaskFinalizer-3] INFO  nextflow.Session - Execution cancelled -- Finishing pending tasks before exit
Aug-13 16:44:17.753 [main] DEBUG nextflow.Session - Session await > all processes finished
Aug-13 16:44:17.867 [Actor Thread 56] DEBUG nextflow.processor.TaskProcessor - Handling unexpected condition for
  task: name=PGSCATALOG_PGSCCALC:PGSCCALC:ANCESTRY_PROJECT:PLINK2_MAKEBED_TARGET; work-dir=null
  error [java.lang.InterruptedException]: java.lang.InterruptedException
Aug-13 16:44:17.895 [Actor Thread 52] DEBUG nextflow.processor.TaskProcessor - Handling unexpected condition for
  task: name=PGSCATALOG_PGSCCALC:PGSCCALC:REPORT:ANCESTRY_ANALYSIS; work-dir=null
  error [java.lang.InterruptedException]: java.lang.InterruptedException
Aug-13 16:44:17.981 [Actor Thread 60] DEBUG nextflow.sort.BigSort - Sort completed -- entries: 2; slices: 1; internal sort time: 0.064 s; external sort time: 0.002 s; total time: 0.066 s
Aug-13 16:44:17.993 [Actor Thread 60] DEBUG nextflow.file.FileCollector - Saved collect-files list to: /home/user/runner/core/test/file_127.ancestry/work/collect-file/a952058c44a348251f3ab72a876aec7b
Aug-13 16:44:18.016 [Actor Thread 60] DEBUG nextflow.file.FileCollector - Deleting file collector temp dir: /tmp/nxf-10103040858170102137
Aug-13 16:47:26.843 [Task monitor] DEBUG n.processor.TaskPollingMonitor - !! executor local > tasks to be completed: 1 -- submitted tasks are shown below
~> TaskHandler[id: 2; name: PGSCATALOG_PGSCCALC:PGSCCALC:INPUT_CHECK:COMBINE_SCOREFILES (1); status: RUNNING; exit: -; error: -; workDir: /home/user/runner/core/test/file_127.ancestry/work/45/5a701ed3fa296084261a1f939cc266]
Aug-13 16:52:26.939 [Task monitor] DEBUG n.processor.TaskPollingMonitor - !! executor local > tasks to be completed: 1 -- submitted tasks are shown below
~> TaskHandler[id: 2; name: PGSCATALOG_PGSCCALC:PGSCCALC:INPUT_CHECK:COMBINE_SCOREFILES (1); status: RUNNING; exit: -; error: -; workDir: /home/user/runner/core/test/file_127.ancestry/work/45/5a701ed3fa296084261a1f939cc266]
Aug-13 16:57:26.979 [Task monitor] DEBUG n.processor.TaskPollingMonitor - !! executor local > tasks to be completed: 1 -- submitted tasks are shown below
~> TaskHandler[id: 2; name: PGSCATALOG_PGSCCALC:PGSCCALC:INPUT_CHECK:COMBINE_SCOREFILES (1); status: RUNNING; exit: -; error: -; workDir: /home/user/runner/core/test/file_127.ancestry/work/45/5a701ed3fa296084261a1f939cc266]
Aug-13 17:00:08.884 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 2; name: PGSCATALOG_PGSCCALC:PGSCCALC:INPUT_CHECK:COMBINE_SCOREFILES (1); status: COMPLETED; exit: 0; error: -; workDir: /home/user/runner/core/test/file_127.ancestry/work/45/5a701ed3fa296084261a1f939cc266]
Aug-13 17:00:08.887 [Task monitor] DEBUG n.processor.TaskPollingMonitor - <<< barrier arrives (monitor: local) - terminating tasks monitor poll loop
Aug-13 17:00:08.889 [main] DEBUG nextflow.Session - Session await > all barriers passed
Aug-13 17:00:08.919 [main] DEBUG nextflow.util.ThreadPoolManager - Thread pool 'TaskFinalizer' shutdown completed (hard=false)
Aug-13 17:00:08.988 [main] INFO  nextflow.Nextflow - -[pgscatalog/pgsc_calc] Pipeline completed with errors-
Aug-13 17:00:09.021 [main] DEBUG n.trace.WorkflowStatsObserver - Workflow completed > WorkflowStats[succeededCount=3; failedCount=1; ignoredCount=0; cachedCount=0; pendingCount=0; submittedCount=0; runningCount=0; retriesCount=0; abortedCount=0; succeedDuration=1h 17m 52s; failedDuration=19m 57s; cachedDuration=0ms;loadCpus=0; loadMemory=0; peakRunning=2; peakCpus=4; peakMemory=24 GB; ]
Aug-13 17:00:09.021 [main] DEBUG nextflow.trace.TraceFileObserver - Workflow completed -- saving trace file
Aug-13 17:00:09.027 [main] DEBUG nextflow.trace.ReportObserver - Workflow completed -- rendering execution report
Aug-13 17:00:10.088 [main] DEBUG nextflow.trace.TimelineObserver - Workflow completed -- rendering execution timeline
Aug-13 17:00:10.480 [main] DEBUG nextflow.cache.CacheDB - Closing CacheDB done
Aug-13 17:00:10.532 [main] INFO  org.pf4j.AbstractPluginManager - Stop plugin 'nf-prov@1.2.2'
Aug-13 17:00:10.532 [main] DEBUG nextflow.plugin.BasePlugin - Plugin stopped nf-prov
Aug-13 17:00:10.532 [main] INFO  org.pf4j.AbstractPluginManager - Stop plugin 'nf-schema@2.0.0'
Aug-13 17:00:10.532 [main] DEBUG nextflow.plugin.BasePlugin - Plugin stopped nf-schema
Aug-13 17:00:10.533 [main] DEBUG nextflow.util.ThreadPoolManager - Thread pool 'FileTransfer' shutdown completed (hard=false)
Aug-13 17:00:10.555 [main] DEBUG nextflow.script.ScriptRunner - > Execution complete -- Goodbye

Code: https://github.com/PGScatalog/pygscatalog/blob/main/pgscatalog.match/src/pgscatalog/match/cli/intersect_cli.py

    with xopen(outdir / "reference_variants.txt.gz", "wt") as outf:
        outf.write("CHR:POS:A0:A1\tID_REF\tREF_REF\tIS_INDEL\tSTRANDAMB\tIS_MA_REF\n")
        for v in heapq.merge(
            *[read_var_general(x) for x in o_tmp_r],
            key=lambda v: (v["CHR:POS:A0:A1"], v["ID_REF"], v["REF_REF"]),
        ):
            outf.write("\t".join(v.values()) + "\n")

Since reference_variants is empty, the error makes sense.
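To illustrate why only the header ends up in the output, here is a minimal sketch of the same merge pattern. The `rows` helper is a hypothetical stand-in for `read_var_general` (whose real behaviour is not shown in this report), and the mislabelled column in `bad` is an assumption made purely for demonstration. With a `key` function, `heapq.merge` reads the first item of every input and computes its key before yielding anything, which is consistent with the traceback ending at the `h_append([key(value), ...])` call. If any one input yields a row without the `CHR:POS:A0:A1` key, the KeyError is raised before a single data row is written.

```python
import heapq
import io

HEADER = ["CHR:POS:A0:A1", "ID_REF", "REF_REF"]


def rows(header, lines):
    """Yield one dict per tab-separated line, keyed by `header`.

    Hypothetical stand-in for read_var_general; not the real implementation.
    """
    for line in lines:
        yield dict(zip(header, line.split("\t")))


# One well-formed input and one whose first column has a different name
# (an assumed failure mode, chosen only to trigger the same KeyError).
good = rows(HEADER, ["1:100:A:G\t1:100:A:G\tA"])
bad = rows(["ID", "ID_REF", "REF_REF"], ["1:200:C:T\t1:200:C:T\tC"])

out = io.StringIO()
out.write("\t".join(HEADER) + "\n")  # the header is written before merging starts

error = None
try:
    for v in heapq.merge(
        good,
        bad,
        key=lambda v: (v["CHR:POS:A0:A1"], v["ID_REF"], v["REF_REF"]),
    ):
        out.write("\t".join(v.values()) + "\n")
except KeyError as exc:
    # Raised while heapq.merge primes its heap, so no row has been yielded yet.
    error = exc

written = out.getvalue()
```

In this sketch `written` holds only the header line, even though `good` contains a valid row. This does not say which temporary file (if any) was malformed in the failing run; it only shows that a single bad input is enough to produce a header-only output.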

Data Processing: The script successfully processed 84,805,772 reference variants and wrote them to temporary files. The temporary files were correctly written with the expected structure, including the "CHR:POS:A0:A1" column.
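The claim that every temporary file has the expected structure could be checked with something like the sketch below. It assumes the temporary files are tab-separated text (plain or gzip-compressed) with a header row; the function name `chunks_missing_key` and the demo file names are hypothetical and not part of pgscatalog-utils.

```python
import csv
import gzip
import tempfile
from pathlib import Path

EXPECTED_KEY = "CHR:POS:A0:A1"


def open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt", newline="")
    return open(path, "rt", newline="")


def chunks_missing_key(paths, key=EXPECTED_KEY):
    """Return the paths whose header row is absent or lacks `key`."""
    bad = []
    for path in paths:
        with open_text(path) as handle:
            reader = csv.DictReader(handle, delimiter="\t")
            # fieldnames is None when the file is completely empty
            if not reader.fieldnames or key not in reader.fieldnames:
                bad.append(Path(path))
    return bad


# Small demonstration with made-up files.
with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "good.tsv"
    good.write_text("CHR:POS:A0:A1\tID_REF\tREF_REF\n1:100:A:G\t1:100:A:G\tA\n")
    empty = Path(tmp) / "empty.tsv"
    empty.write_text("")
    headerless = Path(tmp) / "headerless.tsv"
    headerless.write_text("1:200:C:T\t1:200:C:T\tC\n")
    bad_names = [p.name for p in chunks_missing_key([good, empty, headerless])]
```

Running this over every file in the temporary directory, rather than inspecting a single one, would show whether an empty or truncated chunk is present.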

Idea: When attempting to merge these temporary files and write to the final output, the system ran out of available RAM. This memory exhaustion caused the heapq.merge operation to fail silently, right after the file is opened and the header is written. As a result, only the header was written to the reference_variants.txt.gz file before the process was interrupted. Inspecting the Nextflow run did not confirm or reject this idea. Perhaps it needs more than 4 GB?

  1. PGSCATALOG_PGSCCALC:PGSCCALC:MAKE_COMPATIBLE:PLINK2_VCF task:

    • Allocated memory: 16 GB (17179869184 bytes)
    • Peak memory usage: 2.5 GB (2715578368 bytes) RSS, 16.3 GB (17425510400 bytes) VMEM
  2. PGSCATALOG_PGSCCALC:PGSCCALC:ANCESTRY_PROJECT:EXTRACT_DATABASE task:

    • Allocated memory: 8 GB (8589934592 bytes)
    • Peak memory usage: 9.6 MB (10066944 bytes) RSS
  3. PGSCATALOG_PGSCCALC:PGSCCALC:INPUT_CHECK:COMBINE_SCOREFILES task:

    • Allocated memory: 16 GB (17179869184 bytes)
    • Peak memory usage: 9.5 GB (10200547328 bytes) RSS, 9.7 GB (10415800320 bytes) VMEM
  4. PGSCATALOG_PGSCCALC:PGSCCALC:ANCESTRY_PROJECT:INTERSECT_VARIANTS task:

    • Allocated memory: 4 GB (4294967296 bytes)
    • Memory usage: Not available (task failed)
    • This task failed with exit code 1, so no memory usage was recorded.

Relevant files

reference_variants.txt contains only the following:

CHR:POS:A0:A1   ID_REF  REF_REF IS_INDEL        STRANDAMB       IS_MA_REF

=== Contents of GRCh37_1000G_ALL.psam ===

IID PAT MAT SEX SuperPop Population
HG00096 0 0 1 EUR GBR
HG00097 0 0 2 EUR GBR
HG00099 0 0 2 EUR GBR
HG00100 0 0 2 EUR GBR

=== Contents of GRCh37_file127_ALL.afreq.gz ===

CHROM ID REF ALT ALT_FREQS OBS_CT
1 1:10642:G:A G A 0 2
1 1:11008:C:G C G 0 2
1 1:11012:C:G C G 0 2
1 1:11063:T:G T G 0 2

=== Contents of GRCh37_file127_ALL.vmiss.gz ===

ID F_MISS_DOSAGE F_MISS
1:10642:G:A 0 0
1:11008:C:G 0 0
1:11012:C:G 0 0
1:11063:T:G 0 0

=== Contents of GRCh37_1000G_ALL.pvar.zst ===

reference=ftp://ftp.1000genomes.ebi.ac.uk//vol1/ftp/technical/reference/phase2_reference_assembly_sequence/hs37d5.fa.gz

contig=

contig=

contig=

contig=

=== Contents of GRCh37_file127_ALL.pvar.zst ===

CHROM POS ID REF ALT
1 10642 1:10642:G:A G A
1 11008 1:11008:C:G C G
1 11012 1:11012:C:G C G
1 11063 1:11063:T:G T G

--- Contents of GRCh37_file127_ALL.psam ---

IID SEX

file_127.ancestry.txt NA

=== Contents of /tmp/tmpjo6asoom/tmpchbm3n41 ===

CHR:POS:A0:A1 ID_REF REF_REF IS_INDEL STRANDAMB IS_MA_REF
1:10000006:A:G 1:10000006:G:A G False False False
1:10000020:A:T 1:10000020:T:A T False True False
1:10000072:C:T 1:10000072:C:T C False False False
1:10000143:C:T 1:10000143:C:T C False False False
1:10000160:C:G 1:10000160:G:C G False True False
1:10000179:A:AAAAAAAC 1:10000179:AAAAAAAC:A AAAAAAAC True False False
1:10000185:A:C 1:10000185:A:C A False False False
1:10000186:C:G 1:10000186:C:G C False True False
1:10000228:C:T 1:10000228:T:C T False False False
1:10000236:C:T 1:10000236:T:C T False False False
1:10000283:A:G 1:10000283:G:A G False False False
1:10000302:A:T 1:10000302:T:A T False True False
1:10000320:C:T 1:10000320:C:T C False False False
1:10000327:C:T 1:10000327:C:T C False False False
1:10000354:C:T 1:10000354:C:T C False False False
1:10000371:A:T 1:10000371:A:T A False True False
1:1000037:A:G 1:1000037:A:G A False False False
1:10000396:A:G 1:10000396:A:G A False False False
1:10000400:A:T 1:10000400:T:A T False True False

System information

pgscatalog/pgsc_calc v2.0.0-beta.3
profile: singularity
CPUs: 4 - Mem: 31 GB (3.3 GB) - Swap: 0 (0)
Nextflow version: 24.04.4

Fiwx commented 1 month ago

I believe there may be a problem with heapq or something similar in intersect_cli.py. Running the same command with ~20 scores worked fine.

smlmbrt commented 1 month ago

> I believe there may be a problem with heapq or something similar in intersect_cli.py. Running the same command with ~20 scores worked fine.

This step with intersect_cli.py shouldn't be dependent on the number of scores; it sounds like it was just a random error or an out-of-memory bug?

> When attempting to merge these temporary files and write to the final output, the system ran out of available RAM. This memory exhaustion caused the heapq.merge operation to fail silently, right after the file is opened and the header is written. As a result, only the header was written to the reference_variants.txt.gz file before the process was interrupted. Inspecting the Nextflow run did not confirm or reject this idea. Perhaps it needs more than 4 GB?

We will look into that (cc @nebfield)

Fiwx commented 4 weeks ago

I don't know why, but I can reproduce the error when running many scores, and it goes away with fewer scores. If this step is truly not dependent on the number of scores, perhaps some other step is using more resources in the background and this task is the one that fails as a result?

smlmbrt commented 4 weeks ago

If you run the .command.run script in the failed job's work directory on its own, does it run to completion or fail?

Fiwx commented 3 weeks ago

The same error/failure occurs with .command.run.

smlmbrt commented 3 weeks ago

> The same error/failure occurs with .command.run.

If you edit that script to request more memory does it solve the problem?

Fiwx commented 3 weeks ago

Is there a way to do this at the beginning of the run? Such as a configuration file that can be modified?

smlmbrt commented 3 weeks ago

https://github.com/PGScatalog/pgsc_calc/blob/96fbb2346978c12917d35dc68520a86a9f7b7cc0/modules/local/ancestry/intersect_variants.nf#L1-L5

You could replace the process_low label with process_high_memory.

Fiwx commented 3 weeks ago

Thank you; I will try that. I see the memory label in conf/base.config.

https://github.com/PGScatalog/pgsc_calc/blob/96fbb2346978c12917d35dc68520a86a9f7b7cc0/conf/base.config#L43

Then, in modules/local/ancestry/intersect_variants.nf, I will change:

process INTERSECT_VARIANTS {
    // labels are defined in conf/modules.config
    label 'process_single'
    label 'pgscatalog_utils' // controls conda, docker, + singularity options

to:

process INTERSECT_VARIANTS {
    // labels are defined in conf/modules.config
    label 'process_single'
    label 'pgscatalog_utils' // controls conda, docker, + singularity options
    label 'process_high_memory'
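
As an alternative to editing the module file, Nextflow also allows process resources to be overridden from a separate configuration file supplied at launch with the `-c` option. The fragment below is a sketch only: the file name `custom.config` and the `16.GB` value are assumptions, and any resource limits the pipeline's own configuration applies may still cap the request.

```groovy
// custom.config (hypothetical file name)
process {
    // withName matches the process by its name
    withName: 'INTERSECT_VARIANTS' {
        memory = 16.GB  // assumed value; choose one that fits the machine
    }
}
```

The file would then be added to the usual run command with `-c custom.config`, leaving the pipeline source untouched.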