Astroua / CARTAvis

Deprecated repository for the CARTA project. Refer to:
https://github.com/cartavis/carta
GNU General Public License v2.0

Both master and develop branches crash with bad_alloc when opening large FITS files #185

Closed by confluence 8 years ago

confluence commented 8 years ago

We have tested the latest master and develop branches on an Ubuntu 14.04 virtual machine with 16G of memory. The largest file that we can open successfully is 1.6G; a 2.1G file causes the application to crash. If we increase the virtual machine's memory to 32G, we can open the 2.1G file, but a 6.4G file crashes.

I think I have tracked this down to DataSource.cpp -- similar recent code in both the master and develop versions makes a copy of the data, and the crash appears to occur during the copy, possibly because the vectors the data is being copied into are repeatedly resized and transiently exceed the available memory. I am attaching two backtraces -- one for the develop build and one for the master build.
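For illustration (a minimal sketch, not the CARTA code itself -- the function and names are hypothetical), this is why an unreserved copy can spike memory: each time a std::vector outgrows its capacity it allocates a larger buffer and copies the old contents across, so the old and new buffers briefly coexist and peak usage can be well above the final size.

```cpp
#include <cstddef>
#include <vector>

// Filling a vector without reserving capacity forces repeated
// reallocations: each growth step allocates a new, larger buffer and
// copies the old contents before freeing the old one, so peak memory
// can transiently reach roughly 1.5-2x the final vector size.
void copyWithoutReserve(const float* src, std::size_t n, std::vector<float>& dst) {
    for (std::size_t i = 0; i < n; ++i) {
        dst.push_back(src[i]); // may trigger an allocate-copy-free cycle
    }
}
```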

I have not yet tried building older versions to find the exact revision where this starts happening, but I can try to narrow it down more tomorrow.

Attachments: develop_backtrace.txt, master_backtrace.txt

daikema commented 8 years ago

Here's a link to a couple of the files that we've been using for testing:

low-sky commented 8 years ago

@daikema @confluence Thanks for working on this particular shortcoming.

confluence commented 8 years ago

Part of the problem appears to be this commit -- adding another vector for the indices pushes the memory usage up enough to make the 2.1G file crash. I have verified that a build of the previous revision displays the 2.1G file -- but it crashes on the 6.4G file, in the same place, so there is a deeper problem with this copying code.

Attachment: 65fd57e_backtrace.txt

confluence commented 8 years ago

Preallocating space for these vectors allows me to open the 2.1G file. If I open the 6.4G file, the process uses up all the available memory and gets killed -- but this is more likely to be hitting the actual memory limit (the original file, plus a copy of the values, plus a copy of the indices, could plausibly add up to more than 16G).
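For a rough sense of scale (purely illustrative -- the actual element and index types in DataSource.cpp may differ): assuming 4-byte pixel values copied into a vector of the same type, plus an 8-byte index per pixel, the 6.4G file needs roughly 6.4G + 6.4G + 12.8G ≈ 25.6G, well over the 16G available, while the 2.1G file comes to roughly 8.4G and fits once reallocation spikes are avoided.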

I am going to make a pull request with the preallocation code, and also look into the possibility of replacing these calculations with more space-efficient algorithms for very large files.
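A minimal sketch of the preallocation idea (illustrative names and types, not the actual patch): reserving the final size up front means each vector makes a single allocation, so the copy proceeds with no intermediate buffers.

```cpp
#include <cstddef>
#include <vector>

// Sketch of the preallocation fix: one up-front allocation per vector,
// so no temporary buffers are created while the data is copied.
void copyWithReserve(const float* src, std::size_t n,
                     std::vector<float>& values,
                     std::vector<std::size_t>& indices) {
    values.reserve(n);   // single allocation for the values copy
    indices.reserve(n);  // single allocation for the index vector
    for (std::size_t i = 0; i < n; ++i) {
        values.push_back(src[i]);
        indices.push_back(i);
    }
}
```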

confluence commented 8 years ago

Since my PR has been merged, and the outstanding problems with the use of this quantile algorithm on large images are outside the scope of this bug, I am closing this issue.
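For anyone following up on the space-efficiency point, one plausible direction (an illustration of the kind of alternative mentioned above, not what was merged) is selecting quantiles with std::nth_element, which runs in average O(n) time on a single mutable copy of the values and needs no index vector:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Select the q-quantile (0 <= q <= 1) via std::nth_element: average
// O(n) time, one extra copy of the values (the by-value parameter),
// and no index vector.
float quantile(std::vector<float> values, double q) {
    if (values.empty()) return 0.0f; // sentinel for empty input
    std::size_t k = static_cast<std::size_t>(q * (values.size() - 1));
    std::nth_element(values.begin(), values.begin() + k, values.end());
    return values[k];
}
```

Even this still costs one full copy per call, so a truly large-file solution would likely need sampling or a streaming estimator.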