umyelab / LabGym

Quantify user-defined behaviors.
GNU General Public License v3.0

killed process with full video analysis #87

Closed vzimmern closed 5 months ago

vzimmern commented 8 months ago

Good afternoon,

Three questions about this excellent software:

  1. How should I interpret the table that emerges when training and testing categorizers? Specifically, the table reports that one of my behaviors has a precision of 0.37 after the training process and that number changes to 0.5 when testing the categorizers. How should I interpret these numbers? How should I interpret the f1-score and the support score?

  2. I am trying to identify a rare behavior in a mouse. For this rare behavior, can I combine behaviors from multiple different videos (with different mice) to get a higher number of behaviors for the categorizer to train on?

  3. I have created a categorizer and am evaluating an hour of video recording of a single mouse. The process seems to die whenever the categorizer starts being used. The last time the process was killed it was because of lack of memory -- but this time, I have checked the memory and there's plenty of RAM left (~ 25 GB left) while SWAP memory is maxed out at 2 GB.

Here's a screenshot of the terminal output preceding the process kill.

Thanks again for the help!

Screenshot from 2023-12-24 16-42-33

yujiahu415 commented 8 months ago

Hi,

  1. The training and testing metrics can differ because they are computed on different samples. The more samples used to compute them, the more consistent the training and testing metrics will be. Precision is the fraction of the examples predicted as a behavior that truly are that behavior, so a precision of 0.5 means half of those predictions were correct. The support is the number of samples of each behavior, and the f1-score is the harmonic mean of precision and recall (sensitivity).

  2. Yes, the Categorizer should be able to generalize on different videos with different mice.

  3. The error indicates that your system ran out of memory. The Categorizer needs additional memory when it classifies behaviors, and because the video you are analyzing is long, 25 GB might not be enough. You can either add system memory or trim the video into shorter clips.
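To make the metrics in point 1 concrete, here is a minimal sketch of how precision, recall, f1-score, and support can be computed per behavior from true and predicted labels. It assumes the table uses the usual definitions of these columns; the function name `per_class_report` and the behavior names are hypothetical, and this is not LabGym's own code.

```python
from collections import Counter


def per_class_report(y_true, y_pred):
    """Return precision, recall, f1-score and support for each label."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correctly predicted as this behavior
        else:
            fp[pred] += 1    # predicted as `pred`, but it was something else
            fn[truth] += 1   # a true `truth` example that was missed
    support = Counter(y_true)  # number of true examples per behavior

    report = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        # f1 is the harmonic mean of precision and recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {
            "precision": precision,
            "recall": recall,
            "f1-score": f1,
            "support": support[label],
        }
    return report


if __name__ == "__main__":
    # Hypothetical labels, for illustration only.
    y_true = ["rear", "rear", "walk", "walk", "walk", "groom"]
    y_pred = ["rear", "walk", "walk", "walk", "rear", "groom"]
    for label, row in per_class_report(y_true, y_pred).items():
        print(label, row)
```

With a small support (a rare behavior with only a handful of examples), a single misclassified example moves precision and recall a lot, which is one reason the training and testing numbers can differ.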

vzimmern commented 8 months ago

Thanks for these answers.

I used a smaller video size (10 minutes) and I get the following error. My debugging didn't get me very far.

Screenshot 2023-12-25 at 10 42 19 AM

Thanks again for all the help.

yujiahu415 commented 8 months ago

This error seems weird. It might be caused by an outdated tensorflow/keras. You may first update tensorflow to v2.10.0 by typing: python3 -m pip install tensorflow==2.10.0, and then see if you still get this error. Thanks!

vzimmern commented 8 months ago

TensorFlow/Keras is version 2.10 in the conda environment. In the base environment, TensorFlow is version 2.12.0.

Any other ideas?

Screenshot 2023-12-27 at 8 01 01 PM

yujiahu415 commented 7 months ago

Hi @vzimmern, sorry, I missed your message earlier. What are your CUDA toolkit version and your cuDNN library version? I have tried tensorflow==2.10.0 with CUDA toolkit 11.7 or 11.8 and a cuDNN library that supports CUDA 11.x, and it worked well. These CUDA / cuDNN versions are also compatible with PyTorch==2.0.1 if you want to use the Detector module in LabGym. Apologies again for the late response.
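Since the conda and base environments here have different TensorFlow versions, it may help to check which versions the interpreter that runs LabGym actually sees. Below is a minimal sketch, not part of LabGym; the helper names `installed_versions` and `tensorflow_cuda_build` are hypothetical. Run it with the same Python you use to launch LabGym.

```python
import sys
from importlib import metadata

PACKAGES = ("tensorflow", "keras", "torch")


def installed_versions(packages=PACKAGES):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def tensorflow_cuda_build():
    """Return the CUDA/cuDNN versions TensorFlow was built against.

    These are build-time versions, not the versions installed on the
    system. Returns None if TensorFlow cannot be imported; the values
    are None for CPU-only builds.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None
    info = tf.sysconfig.get_build_info()
    return {
        "cuda_version": info.get("cuda_version"),
        "cudnn_version": info.get("cudnn_version"),
    }


if __name__ == "__main__":
    # Shows which environment this interpreter belongs to.
    print("Python executable:", sys.executable)
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Calling `tensorflow_cuda_build()` in the same session shows which CUDA and cuDNN versions that TensorFlow build expects, which can then be compared with the toolkit versions installed on the machine.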