evaluation problems with cropped trainingset when individual1 is empty

backyardbiomech commented 3 years ago

This could be a bug if the behavior can be confirmed.

OS: Windows 10

DeepLabCut Version: 2.1.9

Anaconda env used: DLC-GPU, running in a custom Jupyter notebook.

Describe the problem

I'm working on a difficult multianimal data set and am in an analyze-refine-train loop. I have 1-3 individuals present in the training videos. When I create the training set, I crop each labeled image into 10 crops of 600x600. After one round of training, the evaluation results and analysis were worse than in the previous iteration, so I looked through the evaluation images (I can make them now thanks to the more efficient plotting!).

Something fails without clear error reporting if the first individual is cropped out of a specific cropped image, and only labels for individual2 are present. I imagine the same problem could occur if the first individual is deliberately left blank.
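A minimal sketch of how such crops could be flagged before training, assuming the labels have already been loaded into a plain dictionary. The layout (`image -> individual -> list of (x, y)`), the file names, and the helper names are all hypothetical and are not part of DeepLabCut's API:

```python
import math

NAN = float("nan")


def is_unlabeled(points):
    """True if every (x, y) pair is NaN, or if there are no points at all."""
    return all(math.isnan(x) and math.isnan(y) for x, y in points)


def crops_missing_individual(labels, individual="individual1"):
    """Names of crops where `individual` has no labels but another individual does."""
    flagged = []
    for image, per_individual in labels.items():
        target_empty = is_unlabeled(per_individual.get(individual, []))
        others_labeled = any(
            not is_unlabeled(points)
            for name, points in per_individual.items()
            if name != individual
        )
        if target_empty and others_labeled:
            flagged.append(image)
    return flagged


# Hypothetical labels: NaN marks a bodypart that is absent from the crop.
labels = {
    "img0001c0.png": {"individual1": [(10.0, 20.0)], "individual2": [(30.0, 40.0)]},
    "img0001c1.png": {"individual1": [(NAN, NAN)], "individual2": [(55.0, 60.0)]},
    "img0002c0.png": {"individual1": [(NAN, NAN)], "individual2": [(NAN, NAN)]},
}

print(crops_missing_individual(labels))
```

Crops where nobody is labeled are deliberately not flagged, since those correspond to empty crops rather than to the "only individual2" case described here.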

The problems I'm having are in the test images, but I'm also concerned about how training may be affected.

Details

The image below from the test set shows detected points all "roughly" where they are supposed to be. However, there are no "+" markers to indicate where the manual labels should be. Specifically, the orange marker in the lower right is clearly too low and to the right of the manual marker (second screenshot), and the blue marker is above and left of the manual marker (second screenshot). This pattern is not consistent, however; most images have all expected markers ("dots", "+", and "X"), but if a "+" is missing, they are all missing.

This image was luckily included in the test images from the previous iteration (screenshot below), and it had the same problem, so it's clearly some problem with the data.

With the last iteration, the multianimal crossvalidate returned NaN for all of the test parameters:

Saving optimal inference parameters...
   train_iter  train_frac  shuffle  rmse_train  hits_train  misses_train  falsepos_train  ndetects_train  pck_train  rpck_train  rmse_test  hits_test  misses_test  falsepos_test  ndetects_test  pck_test  rpck_test
0     50000.0        90.0      1.0    2.738722         7.0           0.0             0.0             1.0   0.928571     0.87303        NaN        NaN          NaN            NaN            NaN       NaN        NaN
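Inside an analyze-refine-train loop it can help to check such a row programmatically instead of reading it by eye. The sketch below copies the values from the row above into a plain dictionary (the variable and function names are hypothetical) and lists every metric that came back as NaN:

```python
import math

NAN = float("nan")

# Values copied from the crossvalidation row printed above.
row = {
    "train_iter": 50000.0, "train_frac": 90.0, "shuffle": 1.0,
    "rmse_train": 2.738722, "hits_train": 7.0, "misses_train": 0.0,
    "falsepos_train": 0.0, "ndetects_train": 1.0,
    "pck_train": 0.928571, "rpck_train": 0.87303,
    "rmse_test": NAN, "hits_test": NAN, "misses_test": NAN,
    "falsepos_test": NAN, "ndetects_test": NAN,
    "pck_test": NAN, "rpck_test": NAN,
}


def nan_columns(results):
    """Return the names of all metrics whose value is NaN."""
    return [name for name, value in results.items() if math.isnan(value)]


missing = nan_columns(row)
test_metrics = [name for name in row if name.endswith("_test")]

if missing and set(missing) == set(test_metrics):
    print("Every test metric is NaN:", ", ".join(missing))
```

A check like this only detects the symptom. It says nothing about why the test metrics are empty.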

My best guess as to the cause is that this animal is marked as individual2, and individual1 is completely empty in this crop. I can confirm that in an earlier frame from this movie, when individual1 bodyparts are marked, all markers are present.

I'm wondering if this "marker" issue is also causing havoc during cross-validation, and even during training. However, in the previous iteration, when the above frame was also in the test group, I got actual values for test during crossvalidate (but pafthreshold was still validating an order of magnitude too high to get good tracklets for these data).

The only messages I get that may or may not signal a problem are a bunch of all-NaN slice runtime warnings (e.g., below), but I had assumed most of those come from crops where no animal is visible:

128it [00:26,  4.99it/s]C:\Users\jacksonbe3\Miniconda3\envs\DLC\lib\site-packages\deeplabcut\pose_estimation_tensorflow\evaluate_multianimal.py:275: RuntimeWarning: All-NaN slice encountered
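The wording of this warning matches what NumPy's NaN-aware reductions such as `np.nanmin` emit when every input value is NaN. To find out which images trigger it, rather than assuming they are all empty crops, the warnings can be recorded per image. The sketch below uses only the standard library; `nan_min` is a hypothetical stand-in for the real reduction, and the per-image distances are made-up data, not what `evaluate_multianimal.py` actually computes:

```python
import math
import warnings

NAN = float("nan")


def nan_min(values):
    """Stand-in for a NaN-aware minimum: warns and returns NaN if all values are NaN."""
    finite = [v for v in values if not math.isnan(v)]
    if not finite:
        warnings.warn("All-NaN slice encountered", RuntimeWarning)
        return NAN
    return min(finite)


def images_triggering_warning(distances_per_image):
    """Return the images for which the reduction emitted a RuntimeWarning."""
    flagged = []
    for image, distances in distances_per_image.items():
        with warnings.catch_warnings(record=True) as caught:
            warnings.simplefilter("always")  # record every warning, even repeats
            nan_min(distances)
        if any(issubclass(w.category, RuntimeWarning) for w in caught):
            flagged.append(image)
    return flagged


# Hypothetical per-image distances between predictions and manual labels.
distances_per_image = {
    "img0115c0.png": [NAN, NAN, NAN],
    "img0115c1.png": [3.2, NAN, 1.7],
    "img0200c4.png": [NAN],
}

print(images_triggering_warning(distances_per_image))
```

Comparing the flagged names against the list of crops known to contain a labeled animal would show whether the warnings really come only from empty crops.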

In this data set, with that size crop, it's possible that up to ~5% of the cropped images have only "individual2", so I imagine that whether those images get shuffled into test or train could have a big impact on things.
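How much the split can vary is easy to estimate with a hypergeometric calculation. This is a rough sketch under stated assumptions: the total of 200 cropped images is a made-up number, 5% of them are taken to be "individual2 only", and 10% are held out for testing (reading the `train_frac` of 90.0 above as a 90% training fraction):

```python
from math import comb


def prob_k_in_test(n_images, n_special, n_test, k):
    """Probability that exactly k of the n_special images land in a random test set."""
    return (
        comb(n_special, k)
        * comb(n_images - n_special, n_test - k)
        / comb(n_images, n_test)
    )


n_images = 200   # hypothetical total number of cropped images
n_special = 10   # ~5% of crops contain only individual2
n_test = 20      # 10% held out for testing

for k in range(5):
    p = prob_k_in_test(n_images, n_special, n_test, k)
    print(f"{k} individual2-only images in test: p = {p:.3f}")
```

With these assumed numbers the expected count in the test set is one image, so a given shuffle can easily contain none or several of them, which is consistent with results changing noticeably from one shuffle to the next.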

backyardbiomech commented 3 years ago

Actually, images where all points for individual1 were marked still show no "+" for individual2, so maybe that's just a plotting issue if you can confirm that individual2 is included in evaluation. But that still leaves the question of why I got all NaN for test parameters following evaluation.

AlexEMG commented 3 years ago

Dear @backyardbiomech -- thanks for raising this!

I could reproduce the plotting error. But, fortunately (at least in my example) the data are not wrongly removed when cropping (so I don't think that the labels are removed, even in your case?). So, indeed I do believe this is "just" a plotting issue. Will look into it tomorrow.

backyardbiomech commented 3 years ago

@AlexEMG, thanks for looking into it. I figured it was just a plotting issue, but getting all NaNs for the evaluation of the test images was disconcerting.

After recreating the training set, retraining, and evaluating, the all-NaN test results changed to reasonable numerical values, but the plots show the same problem.

AlexEMG commented 3 years ago

Tested on some example data, where I had the same problem.

[Image: Test-montblanc_cropped-img0115c0]

Best wishes from Mont Blanc!

DeepLabCut / DeepLabCut

evaluation problems with cropped trainingset when individual1 is empty #1084