umyelab / LabGym

Quantify user-defined behaviors.
GNU General Public License v3.0

killed process despite memory upgrade #121

Closed vzimmern closed 3 months ago

vzimmern commented 4 months ago

Good morning,

I'm trying to train a categorizer for a mouse behavior project, but I keep running into a memory problem that leads to the process being killed. I upgraded my computer's memory from 32 GB to 64 GB, but the excessive memory demand during the augmented-example phase of categorizer training persisted. I then increased the memory to 128 GB and the same problem happens: the augmented example count reaches 400,000 - 440,000, then all the memory (swap, cache, and drive) gets used up, leading to a hang and the process being killed.

I see from the details of your paper that you tested your code on machines with 32 - 64 GB of memory. Why am I having all these difficulties with a 128 GB machine? I've tried different neural network complexity levels (2 and 5) and also tried not augmenting the validation data, but the result is the same: memory use rises progressively until there's no memory left and the process crashes.

I would really appreciate some help. For reference, I am using Ubuntu 22.04 LTS with the most recent release of LabGym, on a machine with an NVIDIA GeForce RTX 3090.

yujiahu415 commented 4 months ago

Hi,

400,000 - 440,000 seems like a LOT of examples. How many behavior categories are there in total? Approximately how many pairs of examples (one pair contains one animation and one pattern image) are there per category before augmentation? What is the input shape for the Animation Analyzer and Pattern Recognizer? And what is the duration (how many frames) of each behavior example? This information will help me provide feedback on how to address this issue.

In our previous paper, the behavior examples were mostly 15-20 frames, the input shape was mostly 32 or 64, the number of behavior examples for one category was around 200, and the total number of augmented examples was about 10,000-50,000, roughly tenfold less than yours. That is why 32 / 64 GB of RAM, with some additional paging on the hard disk, could handle it.
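A back-of-envelope estimate shows why that scale fits. (This assumes the examples are held in memory as float32 arrays, which is my simplification rather than LabGym's documented internals.)

```python
# Rough memory estimate for the scale used in the paper (assumed values:
# 50,000 augmented pairs, 20 frames, 64x64x1 animations, 64x64x3 pattern
# images, held in memory as float32 -- an assumption, not documented internals).
BYTES = 4  # float32
pairs, frames, side = 50_000, 20, 64

animations = pairs * frames * side * side * 1 * BYTES
patterns = pairs * side * side * 3 * BYTES
total_gb = (animations + patterns) / 1e9
print(f"~{total_gb:.0f} GB")  # ~19 GB, which 32 / 64 GB of RAM can hold
```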

vzimmern commented 4 months ago

There are 8 behavior categories in total. Before augmentation, the number of example pairs varies widely across categories. Here's the exact breakdown for our mouse recordings:

- Grooming - 2,150 pairs
- Rearing - 120 pairs
- In place - 11,025 pairs
- Myoclonus of the head - 161 pairs
- Myoclonus of the body - 79 pairs
- Sniffing - 3,368 pairs
- Orientating - 416 pairs
- Unknown - 4,817 pairs

The input shape is 32 x 32 x 1 for the Animation Analyzer and 32 x 32 x 3 for the Pattern Recognizer. Each behavior example is 60 frames (2 seconds at 30 frames/second). The STD used was 50, and the complexity level of the categorizer was 2.

After sending you my first message, I read through the methods very closely and saw that augmentation can increase the total data by a factor of 47, so I got rid of augmentation and the memory problem went away.

Nonetheless, I would appreciate your advice here. Not surprisingly, the categorizer is really struggling to reliably identify the behaviors with the smallest number of pairs, which happen to be the behaviors of interest in our research project: myoclonus of the head and body. I suppose there's no way to improve the performance other than to increase the number of pairs for those behaviors.

Here's the breakdown after testing of this categorizer:

| | precision | recall | f1-score | support |
| --- | --- | --- | --- | --- |
| grooming | 0.73 | 0.91 | 0.81 | 2150 |
| inplace | 0.97 | 0.95 | 0.96 | 11025 |
| myoclonus-body | 0.50 | 0.02 | 0.04 | 89 |
| myoclonus-head | 0.00 | 0.00 | 0.00 | 161 |
| orientating | 0.71 | 0.56 | 0.63 | 416 |
| rearing | 0.79 | 0.41 | 0.54 | 120 |
| sniffing | 0.74 | 0.83 | 0.78 | 3368 |
| unknown | 0.98 | 0.91 | 0.95 | 4817 |
| accuracy | | | 0.90 | 22146 |
| macro avg | 0.68 | 0.58 | 0.59 | 22146 |
| weighted avg | 0.90 | 0.90 | 0.90 | 22146 |

yujiahu415 commented 4 months ago

All your settings are good. The 60-frame duration of the examples is the major reason for the huge memory consumption. How long are the videos you analyze?
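To see why, here is a rough sketch (assuming the examples are held in memory as float32 arrays, which is a simplification): with ~440,000 augmented pairs at 60 frames and the shapes you listed, the raw arrays alone approach your 128 GB.

```python
# Why 60-frame examples at ~440,000 augmented pairs exhaust 128 GB
# (assumed float32 in-memory arrays; shapes taken from this thread:
# 32x32x1 animations, 32x32x3 pattern images).
BYTES = 4  # float32
pairs, frames, side = 440_000, 60, 32

animations = pairs * frames * side * side * 1 * BYTES
patterns = pairs * side * side * 3 * BYTES
total_gb = (animations + patterns) / 1e9
print(f"~{total_gb:.0f} GB")  # ~114 GB before any training overhead
```

Halving the frame count roughly halves the animation term, which dominates the total.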

I think the main problem here is the huge variation in the number of examples across categories: if you choose augmentation, your system's memory cannot handle it; if you don't, the categories with few examples will not be learned well because the example count is too low.

So I suggest discarding some of the examples in "grooming", "in place", "sniffing", and "unknown" to bring each of those categories down to around 500 pairs or even fewer. This would be an easy fix. Keep only examples that look very different from one another. That way, your system's memory should be able to handle augmentation, which will significantly increase the amount and diversity of the categories with limited examples, and the Categorizer training should improve.

vzimmern commented 3 months ago

Is there any way to convert my library of 60-frame avi movies and jpg images into 15-frame avi and jpg files? I ran into the same memory problem despite using 500 pairs or fewer in each category, so I assume the problem is my 60-frame examples.

yujiahu415 commented 3 months ago

The only option is to use the LabGym preprocessing module to downscale the fps of the original videos (or of the 60-frame movies). The generated jpg files cannot be downsampled, and they cannot be excluded from training.

But there are still several ways (they can be combined) you can try to address the memory problem (from easy to difficult):

  1. Make sure you're using paging to allocate hard disk space as virtual memory, and free up as much hard drive space as possible.

  2. When doing augmentation, don't select all of the augmentation methods. Just use the default methods, and choose NOT to augment the validation data.

  3. Reduce the input shape of the Animation Analyzer to 16 and keep the input shape of the Pattern Recognizer at 32. Increase the complexity level of the Animation Analyzer to 3 and that of the Pattern Recognizer to 4.

  4. Train a Categorizer with only the Pattern Recognizer, with complexity level 4 and input shape 32. This may or may not reduce the classification accuracy; sometimes it actually increases it, depending on which behaviors you want to classify, so you can try. If the accuracy is not good, try an input shape of 64 or even 96; the complexity level can also be increased to 5. A Categorizer with only a Pattern Recognizer takes far less memory than one with both an Animation Analyzer and a Pattern Recognizer.

  5. Further cut down the number of training examples, especially for categories that are easy to distinguish or that are not behaviors of interest (background), to ~300 pairs. Typically, ~200 well-selected pairs per category can train a good Categorizer. Well-selected means the examples in each category are diverse enough to cover the different scenarios of that behavior in the videos to analyze, and the examples in different categories can be easily distinguished just by watching the animations / pattern images.

  6. Generate new behavior examples at a duration of 30 frames or even fewer, and sort them again. If you want to keep the duration at 2 seconds of actual time, downscale the fps of the original videos to 15 using the preprocessing module.

  7. Add more system memory.
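On point 6, the preprocessing module does the actual fps downscaling; conceptually, going from 30 fps to 15 fps just means keeping every other frame. A minimal sketch of that frame selection (not LabGym's actual implementation):

```python
def frames_to_keep(n_frames, src_fps, dst_fps):
    """Indices of frames to keep when downscaling src_fps to dst_fps.
    A conceptual sketch, not LabGym's actual implementation."""
    step = src_fps / dst_fps  # e.g. 30 / 15 -> keep every 2nd frame
    kept = []
    i = 0.0
    while round(i) < n_frames:
        kept.append(round(i))
        i += step
    return kept

# A 60-frame, 2-second example at 30 fps becomes 30 frames at 15 fps:
print(len(frames_to_keep(60, 30, 15)))  # 30
```

This also shows why the trick halves memory use: the per-example animation array shrinks in direct proportion to the frame count.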

vzimmern commented 3 months ago

This advice is excellent and was very helpful. Using suggestions #2 and #3, I was able to get it to work without any problems. If this work leads to publication, we would like to add you (Yujia Hu) as an author on the paper, for all your hard work developing this software and helping us get through these computational hurdles.

yujiahu415 commented 3 months ago

Thank you so much for kindly offering me authorship! But I don't think I qualify for that; helping users troubleshoot is my responsibility. If LabGym helps your research and publication, just cite LabGym and I'll be very happy. Always happy to help! Thanks again!