nf-core / hic

Analysis of Chromosome Conformation Capture data (Hi-C)
https://nf-co.re/hic
MIT License

get_valid_interaction fails with exitcode 137 with --digestion arima #113

Open heuermh opened 2 years ago

heuermh commented 2 years ago

There isn't much context provided with the following error:

$ nextflow run main.nf \
  -profile docker \
  --genome GRCh38 \
  --digestion arima \
  --input '/home/ec2-user/*{1,2}.fastq.gz'

...

Error executing process > 'get_valid_interaction (sample)'

Caused by:
  Process `get_valid_interaction (sample)` terminated with an error exit status (137)

Command executed:

  mapped_2hic_fragments.py -f restriction_fragments.bed -r sample_bwt2pairs.bam --all
  sort -k2,2V -k3,3n -k5,5V -k6,6n -o sample_bwt2pairs.validPairs sample_bwt2pairs.validPairs

Command exit status:
  137

Command output:
  (empty)

Command error:
  .command.sh: line 2:
    27 Killed                  mapped_2hic_fragments.py -f restriction_fragments.bed -r sample_bwt2pairs.bam --all

...

$ cat .exitcode
137

$ cat .command.err
/home/ec2-user/hic/work/19/3e1cb12b33b4f95171a7c8141e2566/.command.sh: line 2:
    27 Killed                  mapped_2hic_fragments.py -f restriction_fragments.bed -r sample_bwt2pairs.bam --all

Might you happen to know of any test datasets that use the ARIMA protocol, so that I may create a reproducible case?
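
For context on the exit status above: by shell convention, a status greater than 128 means the process was terminated by a signal, and the signal number is the status minus 128. The following is a minimal Python sketch of that decoding. The helper name `describe_exit_status` is hypothetical and is not part of the pipeline.

```python
import signal


def describe_exit_status(status: int) -> str:
    """Decode a shell-style exit status (hypothetical helper, not pipeline code)."""
    if status > 128:
        # Statuses above 128 encode "terminated by signal (status - 128)".
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not known on this platform.
            name = f"signal {signum}"
        return f"killed by {name}"
    return f"exited with code {status}"


print(describe_exit_status(137))
```

On Linux this reports SIGKILL for 137. A SIGKILL with no Python traceback is consistent with the kernel or the container runtime stopping the process for exceeding a memory limit, although other causes are possible.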

heuermh commented 2 years ago

The previous error occurred with 100k and 1 million reads. With 10 million reads I get the same exit code, and then on retry:

Error executing process > 'get_valid_interaction (sample)'

Caused by:
  Process `get_valid_interaction (sample)` terminated with an error exit status (1)

Command executed:

  mapped_2hic_fragments.py -f restriction_fragments.bed -r sample_bwt2pairs.bam --all
  sort -k2,2V -k3,3n -k5,5V -k6,6n -o sample_bwt2pairs.validPairs sample_bwt2pairs.validPairs

Command exit status:
  1

Command output:
  (empty)

Command error:
  Traceback (most recent call last):
    File "/home/ec2-user/hic/bin/mapped_2hic_fragments.py", line 566, in <module>
      resFrag = timing(load_restriction_fragment, fragmentFile, minFragSize, maxFragSize, verbose)
    File "/home/ec2-user/hic/bin/mapped_2hic_fragments.py", line 68, in timing
      result = function(*args)
    File "/home/ec2-user/hic/bin/mapped_2hic_fragments.py", line 206, in load_restriction_fragment
      for line in bed_handle:
  OSError: [Errno 12] Cannot allocate memory
  .command.run: fork: Cannot allocate memory
  .command.sh: line 2:    27 Killed                  mapped_2hic_fragments.py -f restriction_fragments.bed -r sample_bwt2pairs.bam --all
  .command.run: line 155: kill: (25) - No such process

Work dir:
  /home/ec2-user/hic/work/4c/decc189d16592544d8c91d6615afa7

Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`

nservant commented 2 years ago

Hi, thanks for the report. The get_valid_interaction process is configured with 4 GB of RAM, and in practice only the restriction fragments file is loaded into memory.

According to the error, it seems that it cannot allocate memory! Do you have enough RAM on your machine? Otherwise, what is the size of the reference fragment file? Thanks
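
One way to answer the file-size question is to measure the fragment file directly in the process work dir. The sketch below is only an illustration: `summarize_bed` is a hypothetical helper, and it assumes the file is plain tab-separated BED text with one fragment per line.

```python
import os


def summarize_bed(path: str) -> dict:
    """Return the on-disk size and record count of a BED file (hypothetical helper)."""
    n_records = 0
    with open(path) as handle:
        for line in handle:
            # Skip blank lines and the optional BED header/comment lines.
            if line.strip() and not line.startswith(("#", "track", "browser")):
                n_records += 1
    return {"bytes": os.path.getsize(path), "records": n_records}


# Example call, run from the process work dir:
# print(summarize_bed("restriction_fragments.bed"))
```

The record count matters more than the byte size here, since the in-memory representation of each fragment is typically larger than its line of text on disk.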

heuermh commented 2 years ago

> According to the error, it seems that it cannot allocate memory! Do you have enough RAM on your machine?

This was running in Nextflow local mode on an EC2 instance with 32G RAM.

> Otherwise, what is the size of the reference fragment file?

Didn't catch the size of this file; these runs were using the iGenomes GRCh38 reference.

I'll try running on AWS Batch next and see if increasing the process requirements for get_valid_interaction helps.
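
One thing that may be worth checking, offered as an assumption rather than a diagnosis: with -profile docker the task runs inside a container, so the memory available to it can be capped by a cgroup limit well below the host's 32G. The sketch below reads the standard Linux cgroup v2 and v1 limit files from inside the container. The function name `container_memory_limit` is hypothetical.

```python
from pathlib import Path
from typing import Iterable, Optional

# Standard Linux locations of the memory limit for cgroup v2 and cgroup v1.
CGROUP_LIMIT_FILES = (
    "/sys/fs/cgroup/memory.max",
    "/sys/fs/cgroup/memory/memory.limit_in_bytes",
)


def container_memory_limit(
    paths: Iterable[str] = CGROUP_LIMIT_FILES,
) -> Optional[int]:
    """Return the cgroup memory limit in bytes, or None if unlimited or unknown."""
    for candidate in paths:
        try:
            text = Path(candidate).read_text().strip()
        except OSError:
            continue  # File absent or unreadable: try the next location.
        if text == "max":  # cgroup v2 spelling of "no limit"
            return None
        if text.isdigit():
            # Note: cgroup v1 reports "no limit" as a very large number.
            return int(text)
    return None


limit = container_memory_limit()
print("no cgroup limit found" if limit is None else f"{limit / 2**30:.1f} GiB")
```

Running this inside the task's container (for example from the work dir's environment) shows the limit the process actually sees, which can then be compared against the memory requested for get_valid_interaction.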

heuermh commented 2 years ago

I could not get it running in Tower with the default (4G RAM) or with 16G RAM

   withName:get_valid_interaction {
      memory = 16.GB
   }

nor with hg19 instead of GRCh38 as the reference.

heuermh commented 2 years ago

Both -profile test,docker and -profile test_full,docker run fine for me both in local mode and on Tower, so I'm thinking there is either an issue with arima digestion or with the input reads I'm trying to use.

Might you happen to know of any test datasets that use the ARIMA protocol?

nservant commented 2 years ago

None of the tests use the ARIMA protocol, but the only difference would be the digestion and the resolution/number of restriction fragments. I can try to run more tests on my side too. Would you have any public Hi-C dataset using ARIMA kits in mind?