aidenlab / straw

Extract data quickly from Juicebox via straw
MIT License

Crash when trying to fetch a normalization that's not present #137

Open jonathancasper opened 1 month ago

jonathancasper commented 1 month ago

I'm actually looking for a way to efficiently retrieve the list of normalization options available in a .hic file so that I can present them to a user, but while exploring that task I found this problem.

The simple command-line straw utility, as compiled from the C++ section of this repository, lets you request data from a .hic file. That includes requesting a normalization option that doesn't exist (e.g. "straw observed NOTTHERE file.hic chr1:1:1 chr1:1:1 BP 5000").

With most .hic files, such a command results in a simple error message from the library: File did not contain NOTTHERE normalization vectors for one or both chromosomes at 5000 BP

However, if you issue this command on a .hic file that has no normalization options besides NONE (I have one example file from a user, compiled with Juicer 2.20), straw additionally crashes with the message:

terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
Aborted

It looks like the problem is that at some point the library fails to recognize that it's being told to read 0 entries, goes ahead and retrieves something anyway, and badness compounds from there.

Any suggestions for getting around this, or else for an alternative way to discover which normalization options are available for a file?

Thanks!

jonathancasper commented 1 month ago

I should add: it looks like the footer of the .hic file ends right after the expected values vector. Instead of a value for nNormExpectedValueVectors, there's just the end of the file. Is that intended behavior when no normalization options are present, or can I mark this file as malformed?
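
To illustrate the kind of guard that would avoid the bad_alloc described above, here is a minimal sketch. It is not straw's actual code: readCount is a hypothetical helper, and it assumes the count is stored as a 4-byte little-endian integer. The idea is to report failure when the stream ends before a full count is read, or when the value is negative, so the caller never sizes an allocation from garbage.

```cpp
#include <cstdint>
#include <istream>
#include <limits>
#include <sstream>
#include <string>

// Hypothetical helper (not part of straw): read a 4-byte little-endian count.
// Returns false, leaving `out` untouched, if the stream ends before four bytes
// are available or if the stored value is negative.
bool readCount(std::istream &in, int32_t &out) {
    unsigned char bytes[4];
    in.read(reinterpret_cast<char *>(bytes), sizeof(bytes));
    if (in.gcount() != static_cast<std::streamsize>(sizeof(bytes))) {
        return false;  // end of file reached before a full count was read
    }
    // Assemble the value explicitly so the result does not depend on host byte order.
    uint32_t raw = static_cast<uint32_t>(bytes[0]) |
                   (static_cast<uint32_t>(bytes[1]) << 8) |
                   (static_cast<uint32_t>(bytes[2]) << 16) |
                   (static_cast<uint32_t>(bytes[3]) << 24);
    if (raw > static_cast<uint32_t>(std::numeric_limits<int32_t>::max())) {
        return false;  // would be negative as a signed count, so not a usable size
    }
    out = static_cast<int32_t>(raw);
    return true;
}
```

A caller could treat a false return at the point where nNormExpectedValueVectors is expected as "no normalization vectors present" and fall through to the usual "File did not contain ..." error message instead of trying to allocate.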