sgoldenlab / simba

SimBA (Simple Behavioral Analysis), a pipeline and GUI for developing supervised behavioral classifiers
https://simba-uw-tf-dev.readthedocs.io/
GNU General Public License v3.0
272 stars 137 forks

Error in model evaluation #338

Open Monica9577 opened 4 months ago

Monica9577 commented 4 months ago

Describe the bug: When trying to create an evaluation video, the main console returns an error saying there is a mismatch in the number of features in the input file.

To Reproduce: After successfully labeling the classifier, the error occurs in the "Run machine model" window.

[Screenshot 2024-02-24 191231]

Expected behavior: Once complete, you should see a video file representing the analyzed file inside the project_folder/frames/output/validation directory.

Additional context: This is my first time creating a new model, so maybe I'm doing something wrong, but I followed the tutorial on this GitHub page. I apologize in advance if it was my mistake and not the software's. Thanks in advance for your help.

sronilsson commented 4 months ago

Hi @Monica9577! SimBA takes all the files inside the project_folder/csv/targets_inserted directory and builds a model from them. In each of these files, if you remove the body-part data columns (at the beginning) and your annotations (at the end of the file), you are left with 221 columns of features.

Next, you want to use this model .sav file, which was built using 221 columns of features, on new data inside your project_folder/csv/features_extracted directory. SimBA opens the first file inside the project_folder/csv/features_extracted directory and tries to analyze it. However, it finds 245 columns. It doesn't know what to do, as the model was trained with 221 columns and it now sees 245 columns - what should it do with all these extra columns? So it gives you the error.

One way a mismatch in the number of columns could arise is, for example, if you added ROI features to your new data inside the project_folder/csv/features_extracted directory but did not add them to the files you used to train the model - could this be possible?
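A quick way to see whether some files carry a different number of columns is to count the header columns of every CSV in a directory. The following is only a minimal sketch, not part of SimBA: the helper names `read_header` and `group_files_by_column_count` are hypothetical, and it assumes each file has a single header row. Note that files in targets_inserted also contain annotation columns, so their raw count is expected to be slightly higher than in features_extracted; what matters is whether files within one directory disagree with each other, or whether the gap between directories is larger than the number of classifiers.

```python
import csv
from collections import defaultdict
from pathlib import Path


def read_header(csv_path):
    """Return the first row of a CSV file as a list of column names."""
    with open(csv_path, newline="") as handle:
        return next(csv.reader(handle), [])


def group_files_by_column_count(directory):
    """Map each header column count to the CSV file names that have it.

    Hypothetical helper for illustration; assumes one header row per file.
    """
    groups = defaultdict(list)
    for csv_path in sorted(Path(directory).glob("*.csv")):
        groups[len(read_header(csv_path))].append(csv_path.name)
    return dict(groups)


# Example usage (paths taken from the thread; adjust to your project):
# print(group_files_by_column_count("project_folder/csv/targets_inserted"))
# print(group_files_by_column_count("project_folder/csv/features_extracted"))
```

If one directory returns more than one key, the files in it were probably produced with different settings.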

Monica9577 commented 4 months ago

No, I haven't added any ROIs to any of the files... But I'm going to check whether there's any difference between the features CSV of the video I used to train the model and the one I'm using to evaluate it. Thanks!

sronilsson commented 4 months ago

Thanks!

One more potential reason for this error that I have seen before: within a single SimBA project, be sure you are working with the same animal names and the same body-part names in all files.

The file project_folder/logs/measures/pose_configs/bp_names/project_bp_names.csv stores the names of the body-parts in your SimBA project. Before training models, SimBA drops the data for your body-parts - we don't want to use the locations of the body-parts in any model. However, if the body-part names or animal names for any reason change across data files, then the appropriate columns will not be recognized as body-parts and SimBA will fail to drop them in some files, causing the mismatch in column numbers that you see.
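One way to test this is to check whether every body-part listed in project_bp_names.csv has matching columns in a given data file. The sketch below is not SimBA code and rests on two assumptions that should be verified against your own files: that project_bp_names.csv holds one body-part name per row, and that pose columns are named with the body-part name followed by `_x`, `_y` and `_p`. The function names are hypothetical.

```python
import csv


def load_body_part_names(bp_names_path):
    """Read body-part names, assuming one name per row (an assumption)."""
    with open(bp_names_path, newline="") as handle:
        return [row[0].strip() for row in csv.reader(handle)
                if row and row[0].strip()]


def missing_body_part_columns(header, body_part_names,
                              suffixes=("_x", "_y", "_p")):
    """List expected pose columns that are absent from a file header.

    The suffix convention is an assumption; change `suffixes` if your
    files name their pose columns differently.
    """
    present = set(header)
    expected = [name + suffix
                for name in body_part_names
                for suffix in suffixes]
    return [column for column in expected if column not in present]
```

An empty result for every file in both directories suggests the body-part names are consistent; any non-empty result points to a file whose pose columns would not be recognized under these assumptions.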

Monica9577 commented 4 months ago

Hi, yes, I'm using the same body parts and animal names in every video, but it is still giving problems.

sronilsson commented 4 months ago

@Monica9577 - if you look at a file inside the project_folder/csv/targets_inserted directory and compare it against a file inside the project_folder/csv/features_extracted directory, what differences do you see in the column names and the number of columns? You could also zip up a file from each directory and share them with me, here or through a gdrive link, and I can take a look.
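The comparison described above can be sketched in a few lines of Python. This is an illustrative helper, not a SimBA function: `compare_csv_headers` is a hypothetical name, and it only reads the header row of each file.

```python
import csv


def read_header(csv_path):
    """Return the first row of a CSV file as a list of column names."""
    with open(csv_path, newline="") as handle:
        return next(csv.reader(handle), [])


def compare_csv_headers(targets_path, features_path):
    """Report column counts and the columns unique to each file."""
    targets_header = read_header(targets_path)
    features_header = read_header(features_path)
    targets_set = set(targets_header)
    features_set = set(features_header)
    return {
        "n_targets": len(targets_header),
        "n_features": len(features_header),
        "only_in_targets": [c for c in targets_header
                            if c not in features_set],
        "only_in_features": [c for c in features_header
                             if c not in targets_set],
    }
```

The annotation columns are expected to show up under `only_in_targets`. Anything listed under `only_in_features` is a column the model never saw during training, and the names there should hint at where the extra columns came from.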