dnbaker / dashing

Fast and accurate genomic distances using HyperLogLog
GNU General Public License v3.0

Union command failing #68

Closed by aryakaul 3 years ago

aryakaul commented 3 years ago

Hello! I'm trying to use the tool and was interested in the union subcommand.

I first create sketches of two read files using -k31 -S10. Then I run:

dashing union ./achromobacter_xylosoxidans__01/DRR015625.fa.gz.w.31.spacing.10.hll ./achromobacter_xylosoxidans__01/DRR015626.fa.gz.w.31.spacing.10.hll -o test.hll

I then receive the following error:

terminate called after throwing an instance of 'std::runtime_error'
  what():  Could not open file at '@r' for reading
[1]    27344 abort      /n/data1/hms/dbmi/baym/arya/tools/dashing/dashing union

If I keep rerunning the same command, the string reported in place of '@r' changes each time, and sometimes I receive the following backtrace instead:

*** Error in `/n/data1/hms/dbmi/baym/arya/tools/dashing/dashing': double free or corruption (out): 0x0000000001e9d250 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x81679)[0x7f3ad2c52679]
/n/data1/hms/dbmi/baym/arya/tools/dashing/dashing[0x4f78b6]
/n/data1/hms/dbmi/baym/arya/tools/dashing/dashing[0x4eac84]
/n/data1/hms/dbmi/baym/arya/tools/dashing/dashing[0x40c605]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x7f3ad2bf3505]
/n/data1/hms/dbmi/baym/arya/tools/dashing/dashing[0x41d0ae]
======= Memory map: ========
00400000-009ae000 r-xp 00000000 00:34 52331274504                        /n/data1/hms/dbmi/baym/arya/tools/dashing/dashing
00bad000-00bae000 r--p 005ad000 00:34 52331274504                        /n/data1/hms/dbmi/baym/arya/tools/dashing/dashing
00bae000-00bb1000 rw-p 005ae000 00:34 52331274504                        /n/data1/hms/dbmi/baym/arya/tools/dashing/dashing
00bb1000-00bb2000 rw-p 00000000 00:00 0
01e6d000-01eaf000 rw-p 00000000 00:00 0                                  [heap]
7f3acc000000-7f3acc021000 rw-p 00000000 00:00 0
7f3acc021000-7f3ad0000000 ---p 00000000 00:00 0
7f3ad2bd1000-7f3ad2d94000 r-xp 00000000 fd:00 12583521                   /usr/lib64/libc-2.17.so
7f3ad2d94000-7f3ad2f94000 ---p 001c3000 fd:00 12583521                   /usr/lib64/libc-2.17.so
7f3ad2f94000-7f3ad2f98000 r--p 001c3000 fd:00 12583521                   /usr/lib64/libc-2.17.so
7f3ad2f98000-7f3ad2f9a000 rw-p 001c7000 fd:00 12583521                   /usr/lib64/libc-2.17.so
7f3ad2f9a000-7f3ad2f9f000 rw-p 00000000 00:00 0
7f3ad2f9f000-7f3ad2fb6000 r-xp 00000000 fd:00 12583547                   /usr/lib64/libpthread-2.17.so
7f3ad2fb6000-7f3ad31b5000 ---p 00017000 fd:00 12583547                   /usr/lib64/libpthread-2.17.so
7f3ad31b5000-7f3ad31b6000 r--p 00016000 fd:00 12583547                   /usr/lib64/libpthread-2.17.so
7f3ad31b6000-7f3ad31b7000 rw-p 00017000 fd:00 12583547                   /usr/lib64/libpthread-2.17.so
7f3ad31b7000-7f3ad31bb000 rw-p 00000000 00:00 0
7f3ad31bb000-7f3ad31d1000 r-xp 00000000 00:2b 5898226537                 /n/app/gcc/6.2.0/lib64/libgcc_s.so.1
7f3ad31d1000-7f3ad33d0000 ---p 00016000 00:2b 5898226537                 /n/app/gcc/6.2.0/lib64/libgcc_s.so.1
7f3ad33d0000-7f3ad33d1000 r--p 00015000 00:2b 5898226537                 /n/app/gcc/6.2.0/lib64/libgcc_s.so.1
7f3ad33d1000-7f3ad33d2000 rw-p 00016000 00:2b 5898226537                 /n/app/gcc/6.2.0/lib64/libgcc_s.so.1
7f3ad33d2000-7f3ad33fe000 r-xp 00000000 00:2b 5898355825                 /n/app/gcc/6.2.0/lib64/libgomp.so.1.0.0
7f3ad33fe000-7f3ad35fd000 ---p 0002c000 00:2b 5898355825                 /n/app/gcc/6.2.0/lib64/libgomp.so.1.0.0
7f3ad35fd000-7f3ad35fe000 r--p 0002b000 00:2b 5898355825                 /n/app/gcc/6.2.0/lib64/libgomp.so.1.0.0
7f3ad35fe000-7f3ad35ff000 rw-p 0002c000 00:2b 5898355825                 /n/app/gcc/6.2.0/lib64/libgomp.so.1.0.0
7f3ad35ff000-7f3ad3700000 r-xp 00000000 fd:00 12583529                   /usr/lib64/libm-2.17.so
7f3ad3700000-7f3ad38ff000 ---p 00101000 fd:00 12583529                   /usr/lib64/libm-2.17.so
7f3ad38ff000-7f3ad3900000 r--p 00100000 fd:00 12583529                   /usr/lib64/libm-2.17.so
7f3ad3900000-7f3ad3901000 rw-p 00101000 fd:00 12583529                   /usr/lib64/libm-2.17.so
7f3ad3901000-7f3ad3a72000 r-xp 00000000 00:2b 5898402838                 /n/app/gcc/6.2.0/lib64/libstdc++.so.6.0.22
7f3ad3a72000-7f3ad3c72000 ---p 00171000 00:2b 5898402838                 /n/app/gcc/6.2.0/lib64/libstdc++.so.6.0.22
7f3ad3c72000-7f3ad3c7c000 r--p 00171000 00:2b 5898402838                 /n/app/gcc/6.2.0/lib64/libstdc++.so.6.0.22
7f3ad3c7c000-7f3ad3c7e000 rw-p 0017b000 00:2b 5898402838                 /n/app/gcc/6.2.0/lib64/libstdc++.so.6.0.22
7f3ad3c7e000-7f3ad3c82000 rw-p 00000000 00:00 0
7f3ad3c82000-7f3ad3c97000 r-xp 00000000 fd:00 12584762                   /usr/lib64/libz.so.1.2.7
7f3ad3c97000-7f3ad3e96000 ---p 00015000 fd:00 12584762                   /usr/lib64/libz.so.1.2.7
7f3ad3e96000-7f3ad3e97000 r--p 00014000 fd:00 12584762                   /usr/lib64/libz.so.1.2.7
7f3ad3e97000-7f3ad3e98000 rw-p 00015000 fd:00 12584762                   /usr/lib64/libz.so.1.2.7
7f3ad3e98000-7f3ad3e9a000 r-xp 00000000 fd:00 12583527                   /usr/lib64/libdl-2.17.so
7f3ad3e9a000-7f3ad409a000 ---p 00002000 fd:00 12583527                   /usr/lib64/libdl-2.17.so
7f3ad409a000-7f3ad409b000 r--p 00002000 fd:00 12583527                   /usr/lib64/libdl-2.17.so
7f3ad409b000-7f3ad409c000 rw-p 00003000 fd:00 12583527                   /usr/lib64/libdl-2.17.so
7f3ad409c000-7f3ad40be000 r-xp 00000000 fd:00 12583514                   /usr/lib64/ld-2.17.so
7f3ad42a4000-7f3ad42ac000 rw-p 00000000 00:00 0
7f3ad42bb000-7f3ad42bd000 rw-p 00000000 00:00 0
7f3ad42bd000-7f3ad42be000 r--p 00021000 fd:00 12583514                   /usr/lib64/ld-2.17.so
7f3ad42be000-7f3ad42bf000 rw-p 00022000 fd:00 12583514                   /usr/lib64/ld-2.17.so
7f3ad42bf000-7f3ad42c0000 rw-p 00000000 00:00 0
7fff7bf18000-7fff7bf3a000 rw-p 00000000 00:00 0                          [stack]
7fff7bfc1000-7fff7bfc3000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]
[1]    29275 abort      /n/data1/hms/dbmi/baym/arya/tools/dashing/dashing union

This occurs both with the most recent precompiled release of dashing and with a build compiled from source.

Any help would be appreciated!

dnbaker commented 3 years ago

Hi there!

I wonder if it might be related to the flags. If I put -o before the input arguments, it works (in single-threaded mode) as expected, but if I put it after, it crashes:

./dashing union -p1 `cat fns.txt | tr '\n' ' ' | head -n4` -ofile.txt
Dashing version: v0.5.5-4-gb11d
terminate called after throwing an instance of 'std::runtime_error'
  what():  Could not open file at '-ofile.txt' for reading
Abort trap: 6

So changing the order of these might help.
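To illustrate why argument order can matter, here is a minimal sketch using Python's standard getopt module. It is not dashing's own parsing code, and the file names a.hll and b.hll are hypothetical. It only shows the general behavior that a POSIX-style parser stops at the first non-option argument, so a trailing -ofile.txt ends up among the input paths, while a GNU-style parser still recognizes it.

```python
import getopt

# Hypothetical argument vector mirroring the failing call: inputs first, -o last.
argv = ["-p1", "a.hll", "b.hll", "-ofile.txt"]

# POSIX-style parsing stops at the first non-option argument, so everything
# after "a.hll" is treated as a positional argument (here: an input path).
opts, positional = getopt.getopt(argv, "o:p:")
print(opts)        # [('-p', '1')]
print(positional)  # ['a.hll', 'b.hll', '-ofile.txt']

# GNU-style parsing permutes arguments, so the trailing option is still seen.
gnu_opts, gnu_positional = getopt.gnu_getopt(argv, "o:p:")
print(gnu_opts)        # [('-p', '1'), ('-o', 'file.txt')]
print(gnu_positional)  # ['a.hll', 'b.hll']
```

Under the first behavior, '-ofile.txt' would be opened as an input file, which matches the shape of the error message above. Placing options before the inputs avoids depending on which behavior a given platform provides.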

However, in the process I found another error in the union command when using more than one thread. I've now corrected it in the source code, and I'll have new statically linked binaries for OSX and Linux sometime tomorrow.

Thanks for reporting the issue, and feel free to ask more!

Daniel

aryakaul commented 3 years ago

Thanks Daniel! I tried again with the flags in the correct order and with -p1 added, but I'm still receiving the above error. I can test the new version once it's merged to main.

Also, just a question on usage.

When the help message for a subcommand is condensed (e.g. dashing union -h), is it true that some of the flags listed in the more in-depth help messages of other subcommands (e.g. dashing dist -h) are also available for that subcommand?

dnbaker commented 3 years ago

I've just merged it with this pull request. Would you give it a try, maybe with the new binaries here?

For the union subcommand, there are a lot fewer options than dist, so it should be simpler. Are there some options you're interested in?

aryakaul commented 3 years ago

Just tested and it's working!

I meant that when I run dashing union -h, the help message doesn't list the -p option as available for this subcommand. I took that to mean multithreading wasn't implemented for union yet. Is there some way to know which flags are available for a given subcommand?

dnbaker commented 3 years ago

Hi again!

Sorry it took a while to get back to you on this. I had quickly removed the feature to get the patch out, but I've since added a release (https://github.com/dnbaker/dashing/releases/tag/v0.5.6) that re-introduces the parallel reduction and adds the flag back in.
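For readers unfamiliar with the term, a parallel reduction over HyperLogLog sketches can be sketched roughly as follows. This is an illustration only, not dashing's implementation: it assumes each sketch is a plain list of register values, and the names merge and parallel_union are hypothetical. It relies on the standard property that the union of two HyperLogLog sketches of equal size is the element-wise maximum of their registers.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def merge(a, b):
    """Union of two HyperLogLog register arrays: element-wise maximum."""
    if len(a) != len(b):
        raise ValueError("sketches must have the same number of registers")
    return [max(x, y) for x, y in zip(a, b)]


def parallel_union(sketches, workers=2):
    """Reduce chunks of sketches independently, then merge the partial results."""
    if not sketches:
        raise ValueError("need at least one sketch")
    # Never use more workers than sketches, so every chunk is non-empty.
    workers = max(1, min(workers, len(sketches)))
    chunks = [sketches[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each worker builds its own partial result; no shared state is mutated.
        partials = list(pool.map(lambda chunk: reduce(merge, chunk), chunks))
    return reduce(merge, partials)


# Toy register arrays (real sketches have 2**p registers).
sketches = [[1, 0, 3, 2], [0, 4, 1, 2], [2, 2, 2, 2], [0, 0, 5, 0]]
result = parallel_union(sketches, workers=2)
print(result)  # [2, 4, 5, 2]
```

Because taking the maximum is associative and commutative, the result does not depend on how the sketches are split across workers, which is what makes this kind of reduction safe to parallelize.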

There are some undocumented features, but I wouldn't recommend using them as they're probably experimental.

Thanks!

Daniel