ivoflipse / Pawlabeling

Tool for processing and analyzing pressure measurements
18 stars · 1 fork

MemoryError when calculating average paws #58

Closed ivoflipse closed 10 years ago

ivoflipse commented 10 years ago

I was working with just 3 measurements and I already got a MemoryError using 32-bit Python. The app seems to be using nearly 1 GB of RAM, which seems like an awful lot for just three measurements...

[screenshot: the app's memory usage]

I can't imagine the array I'm allocating being the problem, since its size isn't huge compared to, say, a Zebris treadmill measurement. Still, this seems to indicate that there are a bunch of memory leaks or objects that aren't deleted, even though I no longer use them. It would be great if I could find them and eradicate them.

Stacktrace:

Traceback (most recent call last):
  File "C:\Dropbox\Development\Pawlabeling\pawlabeling\widgets\processing\processingwidget.py", line 233, in select_left_hind
    self.next_contact()
  File "C:\Dropbox\Development\Pawlabeling\pawlabeling\widgets\processing\processingwidget.py", line 304, in next_contact
    self.update_current_contact()
  File "C:\Dropbox\Development\Pawlabeling\pawlabeling\widgets\processing\processingwidget.py", line 186, in update_current_contact
    contacts=self.contacts)
  File "C:\Anaconda\lib\site-packages\pubsub\core\kwargs\publisher.py", line 30, in sendMessage
    topicObj.publish(**kwargs)
  File "C:\Anaconda\lib\site-packages\pubsub\core\kwargs\publishermixin.py", line 24, in publish
    self._publish(msgKwargs)
  File "C:\Anaconda\lib\site-packages\pubsub\core\topicobj.py", line 340, in _publish
    self.__sendMessage(data, self, iterState)
  File "C:\Anaconda\lib\site-packages\pubsub\core\topicobj.py", line 359, in __sendMessage
    self._mix_callListener(listener, data, iterState)
  File "C:\Anaconda\lib\site-packages\pubsub\core\kwargs\publishermixin.py", line 64, in _mix_callListener
    listener(iterState.filteredArgs, self, msgKwargs)
  File "C:\Anaconda\lib\site-packages\pubsub\core\kwargs\listenerimpl.py", line 27, in __call__
    cb(**kwargs)
  File "C:\Dropbox\Development\Pawlabeling\pawlabeling\models\model.py", line 298, in update_current_contact
    self.calculate_average()
  File "C:\Dropbox\Development\Pawlabeling\pawlabeling\models\model.py", line 316, in calculate_average
    normalized_data = utility.calculate_average_data(data)
  File "C:\Dropbox\Development\Pawlabeling\pawlabeling\functions\utility.py", line 60, in calculate_average_data
    padded_data = np.zeros((num_contacts, mx, my, mz))
MemoryError
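For scale: the failing line pre-allocates a 4-D array, and `np.zeros` defaults to float64 (8 bytes per element), so the size grows very quickly. A back-of-the-envelope check is cheap to do before allocating; the shape values below are hypothetical stand-ins for `num_contacts`, `mx`, `my` and `mz`, not numbers taken from the app:

```python
import numpy as np

# Hypothetical stand-ins for num_contacts, mx, my, mz from the traceback.
num_contacts, mx, my, mz = 40, 100, 100, 200

# np.zeros defaults to float64, i.e. 8 bytes per element.
itemsize = np.dtype(np.float64).itemsize
nbytes = num_contacts * mx * my * mz * itemsize

print(f"{nbytes / 1024**2:.0f} MB")  # prints 610 MB
```

A few hundred MB for a single temporary buffer easily exhausts a 32-bit process, which is limited to 2 GB of address space on Windows.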
ivoflipse commented 10 years ago

When the app opens, it uses a meager 60 MB, but as soon as I load the measurements, it skyrockets to 550 MB.

Disabling the call to model.load_contacts seems to prevent this, which indicates it (logically) has to do with the contacts. By turning off model.calculate_average, I only get a modest bump in memory, indicating there's no problem with the contacts per se, but with the average I create out of them.

Ironically, this is exactly what the stack trace indicates, but that could have just been the straw that broke the camel's back.

Obviously, duplicating whatever data I have isn't the culprit either; doubling something small isn't the issue. Hence the problem must be in utility.calculate_average_data, which is to be expected, given that I made it horribly, horribly inefficient.

Since I didn't feel like calculating the max size over an entire session, updating it after assigning every step, I made it pre-allocate something horrendously huge (100, 100, 200), so it would fit just about anything. (As an aside: this fits almost the entire measurement!) So I guess there's no other way than to bite the bullet and trim this down to a more sensible size, and/or use a scheme that grows the array sensibly instead of pre-allocating the hell out of it.
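The trimmed-down approach can be sketched roughly like this (not the project's actual code; the contact layout and function shape are assumptions for illustration): scan the contacts once for the maximum extent along each axis, then allocate exactly that much before padding and averaging.

```python
import numpy as np

def calculate_average_data(contacts):
    """Average a list of 3-D contact arrays (rows, cols, frames) of varying size.

    Instead of pre-allocating a fixed (100, 100, 200) slot per contact,
    allocate only the maximum extent actually present in this session.
    """
    mx = max(c.shape[0] for c in contacts)
    my = max(c.shape[1] for c in contacts)
    mz = max(c.shape[2] for c in contacts)

    # Tight allocation: just big enough for the largest contact.
    padded = np.zeros((len(contacts), mx, my, mz))
    for i, c in enumerate(contacts):
        x, y, z = c.shape
        padded[i, :x, :y, :z] = c  # zero-pad the smaller contacts

    return padded.mean(axis=0)

# Two dummy contacts of different sizes.
contacts = [np.ones((10, 12, 30)), np.ones((14, 9, 25))]
avg = calculate_average_data(contacts)
print(avg.shape)  # prints (14, 12, 30)
```

The extra pass over the contacts to find `mx`, `my`, `mz` is trivially cheap compared to the memory saved by not padding everything out to a worst-case shape.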

ivoflipse commented 10 years ago

Letting it recalculate, I get something around 90 MB of memory versus 900+ MB before; talk about saving memory... Guess that makes it case closed.