dnbaker / dashing

Fast and accurate genomic distances using HyperLogLog
GNU General Public License v3.0

'sketch::exception::ZlibError' #71

Closed: aryakaul closed this issue 3 years ago

aryakaul commented 3 years ago

Hello again! Thanks for helping with my other issue. I'm now encountering the following exception.

The command I run is:

start=$(date +%s)
log 'Generating union of SRR sketches' '...' 'STEP 2'
$DASHING union -p 6 -o $OUTPUT/unionofSRR.hll $OUTPUT/SRR*.hll
end=$(date +%s)
runtime=$(((end-start)/60))
log "Time Taken" "$runtime minutes"

In my job output I receive the following:

STEP 2 Generating union of SRR sketches ...
Dashing version: v0.5.6
terminate called without an active exception
/var/spool/slurmd/job28030963/slurm_script: line 318: 18202 Aborted   dashing union -p 6 -o ./test/unionofSRR.hll ./test/SRR*.hll
-> Time Taken 0 minutes

And when I view the SRR union file generated, I receive:

➜ dashing view unionofSRR.hll
Dashing version: v0.5.6
terminate called after throwing an instance of 'sketch::exception::ZlibError'
  what():  zlibError [file error][E:sketch/include/sketch/hll.h:1085:void sketch::hll::hllbase_t<HashStruct>::read(z_gzFile) [with HashStruct = sketch::hash::WangHash; z_gzFile = gzFile_s*]] Error reading from file

[1]    27242 abort      dashing view unionofSRR.hll

I've confirmed I can correctly generate unionofSRR.hll by running the same expression on the command line, so I believe it's some issue with calling the command in a job script. I am able to correctly generate sketches in a job through the dashing sketch command (done in a very similar way) so I'm not sure why dashing union would fail. Let me know if you can think of anything for me to try.

Best, Arya

dnbaker commented 3 years ago

Hi Arya,

Thanks for reporting this! I tracked it down: the program wasn't setting the number of threads, but was instead reading the environment variable OMP_NUM_THREADS, which was causing segfaults/wrong data. Can you give this another try?

I've corrected this here (https://github.com/dnbaker/dashing/pull/72) in the source code, and I'll have this updated in releases soon.

Best,

Daniel

aryakaul commented 3 years ago

Hey Daniel,

Been trying the new code but now running into a weird issue. The command and job output remain the same (terminate called without an active exception), but now when I run dashing view unionofSRR.hll I receive the following:

/n/data1/hms/dbmi/baym/arya/tools/dashing/dashing: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.20' not found (required by /n/data1/hms/dbmi/baym/arya/tools/dashing/dashing)
/n/data1/hms/dbmi/baym/arya/tools/dashing/dashing: /lib64/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by /n/data1/hms/dbmi/baym/arya/tools/dashing/dashing)
/n/data1/hms/dbmi/baym/arya/tools/dashing/dashing: /lib64/libstdc++.so.6: version `CXXABI_1.3.9' not found (required by /n/data1/hms/dbmi/baym/arya/tools/dashing/dashing)
/n/data1/hms/dbmi/baym/arya/tools/dashing/dashing: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.22' not found (required by /n/data1/hms/dbmi/baym/arya/tools/dashing/dashing)
/n/data1/hms/dbmi/baym/arya/tools/dashing/dashing: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by /n/data1/hms/dbmi/baym/arya/tools/dashing/dashing)

I thought this might be some difference between the interactive node I was running make dashing on and the job node I actually use to submit the command; however, even when I remake dashing in the job script I still receive that output. Let me know if there's anything I can try to help out.

Best, Arya

dnbaker commented 3 years ago

Hmm, I'm not sure why the problem remains -- have you run make clean before building?

That might help. Outside of that, you might need to compile on the node itself as you mention, something I've had to do on some clusters. (Compiling on a head node would explain the GLIBCXX/CXXABI version differences.)
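To compare nodes, it can help to list which GLIBCXX version tags a given libstdc++ actually provides. Below is a rough sketch that scans the library file for those tags, much like running `strings` on it and filtering for GLIBCXX. The function name `glibcxx_versions` is made up for illustration and is not part of dashing; the path in the usage note is the one from the error output above.

```python
import re


def glibcxx_versions(lib_path):
    """Return the GLIBCXX_* version tags found in a shared library, sorted.

    Hypothetical helper: it only searches the raw bytes for version
    strings and does not parse the ELF symbol version tables.
    """
    with open(lib_path, "rb") as fh:
        data = fh.read()

    # Match tags such as GLIBCXX_3.4 or GLIBCXX_3.4.20.
    tags = set(re.findall(rb"GLIBCXX_\d+(?:\.\d+)+", data))

    def numeric_key(tag):
        # Sort numerically so that 3.4.9 comes before 3.4.20.
        return tuple(int(part) for part in tag.split(b"_")[1].split(b"."))

    return [tag.decode() for tag in sorted(tags, key=numeric_key)]
```

Running `glibcxx_versions("/lib64/libstdc++.so.6")` on both the build node and the job node should show whether the job node's library lacks the versions named in the error (for example GLIBCXX_3.4.20).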

aryakaul commented 3 years ago

make clean did resolve the GLIBCXX/CXXABI issues; however, I'm still running into the same ZlibError as before.

➜  dashing view unionofSRR.hll
Dashing version: v0.5.6-5-g1687
terminate called after throwing an instance of 'sketch::exception::ZlibError'
  what():  zlibError [file error][E:sketch/include/sketch/hll.h:1085:void sketch::hll::hllbase_t<HashStruct>::read(z_gzFile) [with HashStruct = sketch::hash::WangHash; z_gzFile = gzFile_s*]] Error reading from file

[1]    23827 abort      /n/data1/hms/dbmi/baym/arya/tools/dashing/dashing view unionofSRR.hll

I have pulled the updated branch correctly, right? I'm just running make clean, git pull -f, and then make dashing. Not sure if I need to pull the submodules recursively as well.

Thank you!

dnbaker commented 3 years ago

You're right, you might need to update your submodules. git submodule update --init --recursive should do it, if that's the trick.

The error being thrown says that reading from the file is failing, which suggests that the created file was corrupted somehow. It might need to be re-generated? You could also look at the contents of the file in Python with gzip.open(path).read() for a sanity check.
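A minimal sketch of that sanity check, assuming the sketch file was written gzip-compressed (which the zlib read error suggests). The helper name `check_gzip` is hypothetical and not part of dashing. It only tests that the file decompresses completely, not that the contents are a valid HLL sketch, and it treats an empty result as a failure on the assumption that a real sketch is never empty.

```python
import gzip
import zlib


def check_gzip(path, chunk_size=1 << 20):
    """Return (ok, detail) describing whether `path` decompresses cleanly.

    Hypothetical helper, not part of dashing. It validates only the
    compressed stream, not the sketch contents.
    """
    size = 0
    try:
        with gzip.open(path, "rb") as fh:
            # Read in chunks so a large file is not held in memory at once.
            while True:
                chunk = fh.read(chunk_size)
                if not chunk:
                    break
                size += len(chunk)
    except (OSError, EOFError, zlib.error) as exc:
        # OSError covers non-gzip input, EOFError a truncated stream,
        # zlib.error corrupted compressed data.
        return False, f"{type(exc).__name__}: {exc}"
    if size == 0:
        # Assumption: a real sketch always contains some data.
        return False, "file decompressed to 0 bytes"
    return True, f"{size} bytes after decompression"
```

Calling `check_gzip("unionofSRR.hll")` on the file written by the job and on the one generated interactively should show whether the job's output was truncated, for instance because the process aborted mid-write.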

aryakaul commented 3 years ago

Submodule updating fixed it! Thanks a bunch for all the help Daniel!