sgoldenlab / simba

SimBA (Simple Behavioral Analysis), a pipeline and GUI for developing supervised behavioral classifiers
https://simba-uw-tf-dev.readthedocs.io/
GNU General Public License v3.0
272 stars 137 forks

SIMBA INDEX WARNING: Some frames appears to be missing in the dataframe #353

Open MohamedAlyEbraheemZahran opened 3 months ago

MohamedAlyEbraheemZahran commented 3 months ago

I am using a 2-animal project with user-defined labeling, 8 body-parts per animal, and a side-view camera. When I run the model and then run the interactive probability plot (or run models after model setup), I see the following message: BodypartColumnNotFoundWarning

[screenshot]

However, the interactive probability plot works, and after running the models and seeing the error I can still analyze machine predictions and get CSV files. The problem is that when I try to create a video of the sklearn_results or a Gantt plot, I see the following error:

("None of [Index(['Ear_1_x', 'Ear_1_y', 'Ear_1_p'], dtype='object')] are in the [index]",)
"None of [Index(['Ear_1_x', 'Ear_1_y', 'Ear_1_p'], dtype='object')] are in the [index]"
SIMBA INDEX WARNING: Some frames appears to be missing in the dataframe and could not be created
Video TT ETH2 CTR4_basal_males saved...

And the saved video is corrupted (1 KB in size).

[screenshot] [screenshot]

OS: Windows 11
Python Version: [e.g. 3.6.0]
Are you using Anaconda? Yes

sronilsson commented 3 months ago

Hey @MohamedAlyEbraheemZahran! Thanks for the screengrabs, very helpful. It looks like we are having trouble finding the data for the Ear body-part, for the first animal at least, in the project_folder/csv/machine_results/TT ETH2 CTR4_basal_males.csv file. I saw this happen recently when the animal names had changed across sequential runs.

If you open that file, what headers do you see? E.g., are the Ear body-part columns actually named, say, Animal_1_name_Ear_1_x, or something else?
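A quick way to answer this without scrolling through a wide spreadsheet is to print the header row and filter it for the Ear columns. The following is only a minimal sketch using the Python standard library; the helper names are made up for illustration and are not part of SimBA, and the sample header is invented rather than copied from the real file.

```python
import csv
import io


def read_header(file_obj):
    """Return the first row (the column names) of an open CSV file."""
    return next(csv.reader(file_obj))


def columns_for_bodypart(header, bodypart):
    """Return the columns whose name contains the body-part name."""
    return [col for col in header if bodypart in col]


# Invented sample standing in for a machine_results CSV file.
sample = io.StringIO(
    "frame,Left_Ear_1_x,Left_Ear_1_y,Left_Ear_1_p,Right_Nose_1_x\n"
    "0,1.0,2.0,0.9,3.0\n"
)
header = read_header(sample)
ear_columns = columns_for_bodypart(header, "Ear_1")
print(ear_columns)
```

For a real file, the in-memory sample would be replaced by `open(path, newline="")` pointing at the CSV in question.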

Simon

sronilsson commented 3 months ago

PS. Which version of SimBA do you have (pip show simba-uw-tf-dev)? I'll see if I can recreate it.

MohamedAlyEbraheemZahran commented 3 months ago

Hi Simon,

I am using SimBA 1.87.2 (I tried 1.55.9 and 1.87.5 and the problem is the same). I have 2 animals in a tube test, one on the left and one on the right, and I named the animals in the project Left and Right. In the machine results CSV files, the ear body-part is named Left_Ear_1_x.

NB: other projects with top view, one animal, 8 body-parts, or top view, 2 animals, 8 body-parts, are fine, so it seems this problem only occurs with user-defined labeling.

sronilsson commented 3 months ago

@MohamedAlyEbraheemZahran Yes. What should happen when you import the data with a two-animal user-defined body-part config, and name the animals Left and Right, is that SimBA opens the project_folder/logs/measures/pose_configs/bp_names/project_bp_names.csv file and adds prefixes representing your chosen animal names, Left_ and Right_, to the appropriate rows.

i) For troubleshooting, if you add the prefixes Left_ or Right_ to each row, so e.g., the first row reads Left_Ear_1 etc, does it work?

ii) For the files representing the same data prior to machine learning inference, e.g., project_folder/csv/features_extracted/TT ETH2 CTR4_basal_males.csv, do the headers still read Left_Ear_1_x, or is it Ear_1_x in those files?
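Step i) can be done by hand in a text editor, but a short script avoids typos when there are many rows. The sketch below is hypothetical and not a SimBA function: it assumes the body-part names are held in a plain list, ordered animal by animal with the same number of body-parts per animal, and it leaves reading and writing the CSV file out. The body-part names in the example are invented.

```python
def add_animal_prefixes(bp_names, animal_names):
    """Prefix each body-part name with its animal's name.

    Assumes an equal number of body-parts per animal, listed animal by
    animal. Names that already carry the right prefix are left unchanged.
    """
    if len(bp_names) % len(animal_names) != 0:
        raise ValueError("body-parts cannot be split evenly between animals")
    per_animal = len(bp_names) // len(animal_names)
    prefixed = []
    for i, name in enumerate(bp_names):
        prefix = animal_names[i // per_animal] + "_"
        prefixed.append(name if name.startswith(prefix) else prefix + name)
    return prefixed


names = ["Ear_1", "Nose_1", "Ear_2", "Nose_2"]  # invented, shortened example
prefixed = add_animal_prefixes(names, ["Left", "Right"])
print(prefixed)
# ['Left_Ear_1', 'Left_Nose_1', 'Right_Ear_2', 'Right_Nose_2']
```

Running the function a second time on its own output changes nothing, so it is safe to apply to a file that is already partly prefixed.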

MohamedAlyEbraheemZahran commented 3 months ago

I am attaching one file of the machine results: ETF2 CTR3 _48 Female.csv

MohamedAlyEbraheemZahran commented 3 months ago

CTR1 ETF2 48 Female.csv

This is an example of a features-extracted file (I think it is the same as the machine results). As for project_folder/logs/measures/pose_configs/bp_names/project_bp_names.csv, I usually open it and delete the added prefix, otherwise it gives me an error about a mismatched number of features.

MohamedAlyEbraheemZahran commented 3 months ago

CTR1 ETF3 48 Female.csv

This is an example of the target-inserted files.

sronilsson commented 3 months ago

Thanks @MohamedAlyEbraheemZahran - it seems that in all of the files, the animal body-part columns carry the Left_ and Right_ prefixes. However, in the project_folder/logs/measures/pose_configs/bp_names/project_bp_names.csv file, which lists the body-part names that SimBA expects, there are no such Left_/Right_ prefixes.
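The mismatch can be made concrete by expanding the expected body-part names into x, y, and p column names and checking which of them are absent from a data file's header. This is a sketch of that comparison only, with hypothetical helper names; it is not how SimBA performs the check internally.

```python
def expected_columns(bp_names):
    """Expand each body-part name into its x, y and p column names."""
    return [f"{name}_{suffix}" for name in bp_names for suffix in ("x", "y", "p")]


def missing_columns(bp_names, header):
    """Return the expected body-part columns that are absent from the header."""
    present = set(header)
    return [col for col in expected_columns(bp_names) if col not in present]


# Header as reported for the data files in this thread (shortened).
header = ["Left_Ear_1_x", "Left_Ear_1_y", "Left_Ear_1_p"]

without_prefix = missing_columns(["Ear_1"], header)
with_prefix = missing_columns(["Left_Ear_1"], header)
print(without_prefix)  # ['Ear_1_x', 'Ear_1_y', 'Ear_1_p']
print(with_prefix)     # []
```

The three columns reported missing for the unprefixed name are the same three named in the "None of [Index([...])] are in the [index]" error above.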

One question: When you “open it and delete the added prefix” and it works, do you see any warnings named BodypartColumnNotFoundWarning, like the orange text in the first screen grab you sent?

MohamedAlyEbraheemZahran commented 3 months ago

Actually, the orange text in the first screen grab (BodypartColumnNotFoundWarning) appears when the prefixes are deleted. If I re-add the prefixes, it gives me a mismatched number of features error (probably it adds more columns for the prefixes).

MohamedAlyEbraheemZahran commented 3 months ago

[screenshot] [screenshot]

This is an example of having the prefix in project_bp_names.csv.

sronilsson commented 3 months ago

Got it - did you see the orange warning msg when you trained the model?

What I am thinking is that the model was mistakenly trained using 801 features (with the body-part coordinates included). Next, when you add the Left_ and Right_ prefixes, SimBA drops the body-part columns, which gives you the correct 753 columns, but the model was trained with 801 features, and that's why the error shows?

sronilsson commented 3 months ago

There was a slightly related issue HERE the other day. It's not the same solution, but I described what it means when we drop the body-part coordinates prior to training the model.

MohamedAlyEbraheemZahran commented 3 months ago

and what could be a solution for this?

MohamedAlyEbraheemZahran commented 3 months ago

One thing is that the target-inserted files contained two classifiers that I later deleted from the project because they are not present often enough in the videos. Do you think this could be the reason? And how can I fix it?

MohamedAlyEbraheemZahran commented 3 months ago

I have now added the two classifiers to the project again and reran training for one classifier, and I can see the same orange warning during training.

sronilsson commented 3 months ago

I think this:

i) Add the left and right prefixes again to your project_folder/logs/measures/pose_configs/bp_names/project_bp_names.csv file.

ii) Retrain the models (create new .sav files) for your behaviors.

iii) Run your models again on the new data and see if it persists, or send me any screengrab of any error you see.

I don't think it is related to the deleted columns from the single classifier. It complains about 48 columns (801 - 753), which is exactly the number of body-part columns that you have: 16 body-parts, each with an x, y, and p value column (48 / 3 = 16). This means the body-part columns were mistakenly used to train the classifier (I think).
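The arithmetic behind that estimate can be checked in a few lines, using only the numbers reported in this thread (2 animals, 8 body-parts per animal, and the 801 and 753 feature counts):

```python
n_animals = 2
bodyparts_per_animal = 8
values_per_bodypart = 3  # x, y and p (probability) for every body-part

# Number of raw pose-estimation columns in each file.
bodypart_columns = n_animals * bodyparts_per_animal * values_per_bodypart

features_with_coordinates = 801  # what the model was apparently trained on
features_without_coordinates = features_with_coordinates - bodypart_columns

print(bodypart_columns)              # 48
print(features_without_coordinates)  # 753
```

The difference between the two feature counts equals the number of raw coordinate columns, which is consistent with the coordinates having been included at training time and dropped at inference time.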

MohamedAlyEbraheemZahran commented 3 months ago

Thanks a lot, this is what I have just tried: I retrained one classifier with the added prefix and the orange message didn't appear, and when I ran this classifier the warning again didn't appear. I will do the same for all classifiers, reanalyze the data, and check whether the visualization works, and I will let you know. Do you think the results of analyze machine predictions would be different?

sronilsson commented 3 months ago

I am not sure - but probably not.

A bit of background:

We typically don't want to input the body-part coordinates directly when training a model: we don't want to train models based on the exact location of a body-part in time within the video.

For example, Resistance behavior could potentially happen anywhere in the tube. If you have annotated Resistance behavior only on the left side of the tube, and you train the model including your body-part locations, then the model might pick up on the correlation between body-part location and your annotations, and conclude that Resistance is a behavior that can only happen on the left side. Next, when you have new videos and Resistance happens in the middle of the tube, the model won't score it correctly because it thinks that Resistance only happens on the left.

If your behavior does have spatial associations, e.g., it is a fact that Resistance can only happen on the left side (if it happens elsewhere it is not Resistance), then I recommend drawing ROIs and using those ROIs to build features, as documented HERE.

let me know if this makes sense!

Simon

MohamedAlyEbraheemZahran commented 3 months ago

Thanks for your reply. Actually, I am happy with the classifier results in this project, and the behaviors aren't limited to a location, so I don't need ROIs. But in other projects in which I have an object like a glove or a mirror, it will be beneficial for sure. Did you mean that the ROIs could be used as features in classifier training?

sronilsson commented 3 months ago

Yes.

For example, you could add columns that represent:

i) A 1 or 0 value noting if body-parts are located within, or outside, each of the ROIs.

ii) The millimeter distance between body-parts and the center of each of the ROIs.

iii) A 1 or 0 value noting if the animal is directing towards the center of each of the ROIs or not.

These you can get from the GUI menu documented in the link above.
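To give a feel for what the first two kinds of columns contain, here is a minimal sketch for a single frame and a rectangular ROI. It is an illustration only and not SimBA's implementation: the function names, the ROI, the body-part position, and the pixels-per-millimeter value are all invented.

```python
import math


def inside_rectangle(point, top_left, bottom_right):
    """Return 1 if the point lies inside the rectangle (edges included), else 0."""
    x, y = point
    in_x = top_left[0] <= x <= bottom_right[0]
    in_y = top_left[1] <= y <= bottom_right[1]
    return int(in_x and in_y)


def distance_to_center_mm(point, top_left, bottom_right, pixels_per_mm):
    """Return the distance in millimeters from the point to the rectangle's center."""
    center_x = (top_left[0] + bottom_right[0]) / 2
    center_y = (top_left[1] + bottom_right[1]) / 2
    distance_px = math.hypot(point[0] - center_x, point[1] - center_y)
    return distance_px / pixels_per_mm


roi = ((0, 0), (100, 50))  # invented ROI corners in pixel coordinates
nose = (80, 25)            # invented body-part position in one frame

in_roi = inside_rectangle(nose, *roi)
dist_mm = distance_to_center_mm(nose, *roi, pixels_per_mm=2.0)
print(in_roi)   # 1
print(dist_mm)  # 15.0
```

Computed for every frame and every ROI, values like these become extra feature columns that describe where the animal is relative to the ROI rather than where it is in the image.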

Just a note: users have had other questions that required computing all kinds of different values representing the animal and its relationship with ROIs. I have written these methods HERE, so if there is something specific you want to calculate, it can probably be done.

sronilsson commented 3 months ago

PS. For example, in the tube test it might be useful to know how the animal geometries overlap.

MohamedAlyEbraheemZahran commented 3 months ago

Thanks so much for the information. By the way, I like the way you represent the spontaneous alternation.