marbl / merqury

k-mer based assembly evaluation

meryl-lookup error #98

Closed SaiReddy-A closed 1 year ago

SaiReddy-A commented 1 year ago

Hi, I used trimmed 10X reads to build the meryl dbs with the command: meryl k=21 count $reads output prefix-db.meryl

I then ran merqury to generate stats for two assemblies: $MERQURY/merqury.sh prefix-db.meryl $asm1 $asm2 prefix_merqury

I ran these commands on a SLURM job node with 32 CPUs and 250G memory.

The spectra-cn.log shows an error during the meryl-lookup step that writes the bed files. The output bed files are empty, but the wig files are not.


Estimating memory usage for 'asm1.0.meryl'.

 p         blocks   ent/blk             bits gigabytes (allowed: 250 GB)
-- -------------- --------- ---------------- ---------
10           1024         0     137462077125    16.003
11           2048         0     137461519524    16.003
12           4096         0     137461025411    16.003
13           8192         0     137460656226    16.003
14          16384         0     137460532801    16.003  (smallest)
15          32768         0     137460892704    16.003
16          65536         0     137462202879    16.003
17         131072         0     137465380830    16.003
18         262144         0     137472228797    16.004
-- -------------- --------- ---------------- ---------
           623137 total kmers

For 623137 distinct 21-mers (with 14 bits used for indexing and 28 bits for tags):
    0.000 GB memory for block indices -        16384 elements 20 bits wide)
    0.000 GB memory for block lengths -        16384 elements 10 bits wide)
    0.002 GB memory for kmer tags     -       623137 elements 28 bits wide)
    0.000 GB memory for kmer values   -       623137 elements  5 bits wide)
   16.000 GB memory for buffers
   16.003 GB memory

Memory required:  16.003 GB
Memory limit:     250.000 GB

Loading kmers from 'asm1.0.meryl' into lookup table.

 p         blocks   ent/blk             bits gigabytes (allowed: 250 GB)
-- -------------- --------- ---------------- ---------
10           1024         0     137462077125    16.003
11           2048         0     137461519524    16.003
12           4096         0     137461025411    16.003
13           8192         0     137460656226    16.003
14          16384         0     137460532801    16.003  (smallest)
15          32768         0     137460892704    16.003
16          65536         0     137462202879    16.003
17         131072         0     137465380830    16.003
18         262144         0     137472228797    16.004
-- -------------- --------- ---------------- ---------
           623137 total kmers

For 623137 distinct 21-mers (with 14 bits used for indexing and 28 bits for tags):
    0.000 GB memory for block indices -        16384 elements 20 bits wide)
    0.000 GB memory for block lengths -        16384 elements 10 bits wide)
    0.002 GB memory for kmer tags     -       623137 elements 28 bits wide)
    0.000 GB memory for kmer values   -       623137 elements  5 bits wide)
   16.000 GB memory for buffers
   16.003 GB memory

Counting size of buckets.

Summing bucket sizes.
  block indices are 20 bits wide -- sum lengths 639265 (including 16128 empty pointers)
  block lengths are 13 bits wide -- max length  4181

Setting pointers.

Will load 623137 kmers.  Skipping 0 (too low) and 0 (too high) kmers.

Allocating space for 639265 kmer positions.
  suffixes of  28 bits each ->     17899420 bits (  0.002 GB) in blocks of  32.000 MB
  values   of   5 bits each ->      3196325 bits (  0.000 GB) in blocks of  32.000 MB

Filling buckets.

Loaded 623137 kmers.  Skipped 0 (too low) and 0 (too high) kmers.

Opening inputs:
  'asm1.fa'

Opening outputs:
  '-'
setBit()--  ERROR: position=15778 > maximum available=0
meryl-lookup: utility/src/bits/bitArray-v1.H:86: void merylutil::bits::v1::bitArray::setBit(uint64, bool): Assertion `position < _maxBitAvail' failed.

Failed with 'Aborted'; backtrace (libbacktrace):
utility/src/system/system-stackTrace-v1.C::82 in _Z17AS_UTL_catchCrashiP9siginfo_tPv()
(null)::0 in (null)()
(null)::0 in (null)()
(null)::0 in (null)()
(null)::0 in (null)()
(null)::0 in (null)()
utility/src/bits/bitArray-v1.H::86 in _ZN9merylutil4bits2v18bitArray6setBitEmb()
meryl-lookup/dump.C::128 in processSequence()
utility/src/system/sweatShop-v1.C::308 in _ZN9sweatShop6workerEP15sweatShopWorker()
(null)::0 in (null)()
(null)::0 in (null)()
(null)::0 in (null)()
/home/reddy7/software/merqury/eval/spectra-cn.sh: line 71: 16856 Aborted                 (core dumped) meryl-lookup -bed -sequence $asm_fa -mers ${asm}.0.meryl > ${asm}_only.bed

Estimating memory usage for 'asm1.0.meryl'.

 p         blocks   ent/blk             bits gigabytes (allowed: 250 GB)
-- -------------- --------- ---------------- ---------
10           1024         0     137462077125    16.003
11           2048         0     137461519524    16.003
12           4096         0     137461025411    16.003
13           8192         0     137460656226    16.003
14          16384         0     137460532801    16.003  (smallest)
15          32768         0     137460892704    16.003
16          65536         0     137462202879    16.003
17         131072         0     137465380830    16.003
18         262144         0     137472228797    16.004
-- -------------- --------- ---------------- ---------
           623137 total kmers

For 623137 distinct 21-mers (with 14 bits used for indexing and 28 bits for tags):
    0.000 GB memory for block indices -        16384 elements 20 bits wide)
    0.000 GB memory for block lengths -        16384 elements 10 bits wide)
    0.002 GB memory for kmer tags     -       623137 elements 28 bits wide)
    0.000 GB memory for kmer values   -       623137 elements  5 bits wide)
   16.000 GB memory for buffers
   16.003 GB memory

Memory required:  16.003 GB
Memory limit:     250.000 GB

Loading kmers from 'asm1.0.meryl' into lookup table.

 p         blocks   ent/blk             bits gigabytes (allowed: 250 GB)
-- -------------- --------- ---------------- ---------
10           1024         0     137462077125    16.003
11           2048         0     137461519524    16.003
12           4096         0     137461025411    16.003
13           8192         0     137460656226    16.003
14          16384         0     137460532801    16.003  (smallest)
15          32768         0     137460892704    16.003
16          65536         0     137462202879    16.003
17         131072         0     137465380830    16.003
18         262144         0     137472228797    16.004
-- -------------- --------- ---------------- ---------
           623137 total kmers

For 623137 distinct 21-mers (with 14 bits used for indexing and 28 bits for tags):
    0.000 GB memory for block indices -        16384 elements 20 bits wide)
    0.000 GB memory for block lengths -        16384 elements 10 bits wide)
    0.002 GB memory for kmer tags     -       623137 elements 28 bits wide)
    0.000 GB memory for kmer values   -       623137 elements  5 bits wide)
   16.000 GB memory for buffers
   16.003 GB memory

Counting size of buckets.

Summing bucket sizes.
  block indices are 20 bits wide -- sum lengths 639265 (including 16128 empty pointers)
  block lengths are 13 bits wide -- max length  4181

Setting pointers.

Will load 623137 kmers.  Skipping 0 (too low) and 0 (too high) kmers.

Allocating space for 639265 kmer positions.
  suffixes of  28 bits each ->     17899420 bits (  0.002 GB) in blocks of  32.000 MB
  values   of   5 bits each ->      3196325 bits (  0.000 GB) in blocks of  32.000 MB

Filling buckets.

Loaded 623137 kmers.  Skipped 0 (too low) and 0 (too high) kmers.

Opening inputs:
  'asm1.fa'

Opening outputs:
  '-'

Bye!  (0 seconds to initialize and 47 seconds to compute)
asm1_only.wig generated.

I get the same error for asm2.

Could you please look into this? Thank you!!

Sai
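A failure like the one in the log above is easy to miss when the pipeline keeps running after it. The following is a minimal sketch, not part of merqury, of one way to flag it afterwards: scan the log text for the failure markers that appear in this log and list output files that are missing or empty. The function names and the marker list are assumptions made for illustration.

```python
import os
from typing import Iterable, List

# Substrings taken from the failing lines in the log above; other failures
# may use different wording, so treat this list as a starting point only.
FAILURE_MARKERS = ("ERROR:", "Assertion", "Aborted")


def failure_lines(log_text: str) -> List[str]:
    """Return the log lines that contain any of the failure markers."""
    return [
        line
        for line in log_text.splitlines()
        if any(marker in line for marker in FAILURE_MARKERS)
    ]


def empty_files(paths: Iterable[str]) -> List[str]:
    """Return the paths that do not exist or have a size of zero bytes."""
    return [
        path
        for path in paths
        if not os.path.exists(path) or os.path.getsize(path) == 0
    ]
```

For example, `failure_lines(open("spectra-cn.log").read())` and `empty_files(["asm1_only.bed", "asm1_only.wig"])` would report the crash lines and the empty bed file, assuming those file names match the run.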

brianwalenz commented 1 year ago

That's a confusing one! It is simultaneously reporting the sequence being processed is both length 0 and length >= 15778.
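To make the contradiction concrete: the message reads like a bounds check on a bit array that was sized for 0 bits and then asked to set bit 15778. The toy Python model below is only a sketch of that kind of check. It is not meryl's code, and the class and method names are hypothetical; the real implementation lives in bitArray-v1.H and is not shown in this thread.

```python
class BitArray:
    """Toy bounds-checked bit array (illustrative only, not meryl's code)."""

    def __init__(self, max_bits: int = 0) -> None:
        self.max_bits = max_bits
        # One byte holds eight bits; round up so every bit has storage.
        self.bits = bytearray((max_bits + 7) // 8)

    def _check(self, position: int) -> None:
        # Same shape as the assertion in the log: position < maximum available.
        if not 0 <= position < self.max_bits:
            raise IndexError(
                f"position={position} > maximum available={self.max_bits}"
            )

    def set_bit(self, position: int, value: bool) -> None:
        """Set or clear one bit, refusing positions outside the array."""
        self._check(position)
        byte, offset = divmod(position, 8)
        if value:
            self.bits[byte] |= 1 << offset
        else:
            self.bits[byte] &= ~(1 << offset) & 0xFF

    def get_bit(self, position: int) -> bool:
        """Return the value of one bit."""
        self._check(position)
        byte, offset = divmod(position, 8)
        return bool((self.bits[byte] >> offset) & 1)
```

In this model, `BitArray(20000).set_bit(15778, True)` succeeds, while `BitArray(0).set_bit(15778, True)` raises with a message of the same form as the one in the log. That is the situation described above: an array allocated as if the sequence had length 0, indexed as if it had at least 15778 positions.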

The command being run is, I think, meryl-lookup -bed -sequence asm1.fa -mers asm1.0.meryl. Does it fail when run manually? Can you isolate the sequence in asm1.fa that is causing the crash?

Are asm1.fa and asm1.0.meryl small enough to share, and can you share them? If not, the best I can do is add debug logging to meryl-lookup, then let you compile a debug version and run the tests for me.
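One possible way to isolate the offending sequence is to write each FASTA record to its own file and run the same meryl-lookup command on it. The sketch below assumes meryl-lookup is on the PATH and uses only the options quoted in this thread; the helper names (`read_fasta`, `run_lookup`, `find_failing_records`) are hypothetical. If the crash depends on the order of records rather than on a single record, running them one at a time may not reproduce it.

```python
import subprocess
import tempfile
from pathlib import Path
from typing import Callable, Iterator, List, Tuple


def read_fasta(path: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from a FASTA file."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif header is not None:
                chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def run_lookup(fasta_path: str, mers: str) -> bool:
    """Return True if meryl-lookup exits cleanly on this FASTA file."""
    cmd = ["meryl-lookup", "-bed", "-sequence", fasta_path, "-mers", mers]
    result = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return result.returncode == 0


def find_failing_records(
    fasta: str, mers: str, runner: Callable[[str, str], bool] = run_lookup
) -> List[str]:
    """Run the lookup on each record separately; return failing headers."""
    failing = []
    with tempfile.TemporaryDirectory() as tmp:
        for index, (header, sequence) in enumerate(read_fasta(fasta)):
            single = Path(tmp) / f"record_{index}.fa"
            single.write_text(f">{header}\n{sequence}\n")
            if not runner(str(single), mers):
                failing.append(header)
    return failing
```

A call such as `find_failing_records("asm1.fa", "asm1.0.meryl")` would then return the headers of the records on which the command fails. The `runner` parameter exists so the splitting logic can be exercised without meryl installed.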

SaiReddy-A commented 1 year ago

I ran it manually and it failed again. How do I share the data with you?

brianwalenz commented 1 year ago

If they are small enough, you can attach the files to the issue. If they are too big for that, follow the directions at https://canu.readthedocs.io/en/latest/faq.html#how-can-i-send-data-to-you to upload via FTP.

SaiReddy-A commented 1 year ago

I've shared the data via FTP (merqury_issue98.tar)

brianwalenz commented 1 year ago

Thanks! I have reproduced the crash here.

brianwalenz commented 1 year ago

Fixed! It was an embarrassingly trivial mistake, sorry.

I have NOT yet made a new release with the bugfix, so you'll need to compile from a GitHub clone; either the master or the v1.4-maintenance branch will work.

SaiReddy-A commented 1 year ago

Thanks!! It solved the problem.