sgoldenlab / simba

SimBA (Simple Behavioral Analysis), a pipeline and GUI for developing supervised behavioral classifiers
https://simba-uw-tf-dev.readthedocs.io/
GNU General Public License v3.0
273 stars 137 forks

SIMBA FEATURE NUMBER MISMATCH ERROR #276

Closed 13281306705 closed 10 months ago

13281306705 commented 11 months ago

Hello, I'm sorry to bother you. When I clicked 'Run Model' under the 'Run machine model' option in SimBA, I encountered an error: SIMBA FEATURE NUMBER MISMATCH ERROR: Mismatch in the number of features in input file D:/simbaAndDeepLabCut/123/loxy/models/generated_models/run.sav, and what is expected by the model run. The model expects 153 features. The data contains 155 features. Do you know what caused this?

sronilsson commented 11 months ago

Hi @13281306705! Do you have an error msg from the operating system terminal?

This error happens when you build a model using the data files inside the project_folder/csv/targets_inserted directory, and each file in this directory contains 153 features (all columns minus the annotation columns and the body-part columns).

You then try to run the model on your data inside the project_folder/csv/machine_results directory, and SimBA finds 155 columns in a file. Do you see two columns that may have sneaked in by mistake in a project_folder/csv/machine_results file?
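One way to spot the columns that sneaked in is to compare the header row of a file the model was built from against the header row of the file that triggers the error. The following is a minimal sketch using only the Python standard library; `read_header` and `extra_columns` are hypothetical helper names, not part of SimBA, and the sketch assumes both files are ordinary comma-separated files with the column names in the first row.

```python
import csv


def read_header(csv_path):
    """Return the column names from the first row of a CSV file."""
    with open(csv_path, newline="") as handle:
        return next(csv.reader(handle))


def extra_columns(reference_csv, candidate_csv):
    """List columns found in candidate_csv but not in reference_csv.

    The order of the returned names follows the candidate file.
    """
    reference = set(read_header(reference_csv))
    return [name for name in read_header(candidate_csv) if name not in reference]
```

Calling `extra_columns(file_with_153_features, file_with_155_features)` should then return the two surplus column names, which usually makes it clear where they came from.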

Note: The error msg reads, SIMBA FEATURE NUMBER MISMATCH ERROR: Mismatch in the number of features in input file D:/simbaAndDeepLabCut/123/loxy/models/generated_models/run.sav. That is an error msg bug (it prints the path of the model file rather than the data file), which I will fix now; it shouldn't affect the program or cause the error. It should read SIMBA FEATURE NUMBER MISMATCH ERROR: Mismatch in the number of features in input file TheVideoFileNameYouAreTryingToAnalyze

sronilsson commented 11 months ago

... is it possible you selected a file to use in validation that lives in project_folder/csv/targets_inserted rather than in project_folder/csv/features_extracted ?
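A quick way to rule this out is to check which folder the selected file sits in before running validation. The sketch below is only an illustration: `source_directory` and `is_features_extracted_file` are hypothetical helpers, not SimBA functions, and the file name `Video1.csv` in the usage note is made up. `PureWindowsPath` is used because it accepts both `/` and `\` separators without touching the disk.

```python
from pathlib import PureWindowsPath


def source_directory(file_path):
    """Return the name of the folder that directly contains file_path."""
    return PureWindowsPath(file_path).parent.name


def is_features_extracted_file(file_path):
    """True if the file lives directly inside a features_extracted folder."""
    return source_directory(file_path) == "features_extracted"
```

For example, `is_features_extracted_file("D:/project_folder/csv/targets_inserted/Video1.csv")` evaluates to `False`, which would flag the selection as a likely cause of the mismatch.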

13281306705 commented 11 months ago

... is it possible you selected a file to use in validation that lives in project_folder/csv/targets_inserted rather than in project_folder/csv/features_extracted ?

Hello, this is the detailed error message: During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "d:\anaconda3\envs\simba\lib\tkinter\__init__.py", line 1699, in __call__
    return self.func(*args)
  File "d:\anaconda3\envs\simba\lib\site-packages\simba\SimBA.py", line 372, in <lambda>
    button_runvalidmodel = Button(label_model_validation, text='RUN MODEL', fg='blue', command=lambda: self.validate_model_first_step())
  File "d:\anaconda3\envs\simba\lib\site-packages\simba\SimBA.py", line 566, in validate_model_first_step
    clf_path=self.modelfile.file_path)
  File "d:\anaconda3\envs\simba\lib\site-packages\simba\model\inference_validation.py", line 53, in __init__
    output_df[probability_col_name] = self.clf_predict_proba(clf=clf, x_df=data_df, model_name=classifier_name, data_path=clf_path)
  File "d:\anaconda3\envs\simba\lib\site-packages\simba\mixins\train_model_mixin.py", line 971, in clf_predict_proba
    raise FeatureNumberMismatchError(f'Mismatch in the number of features in input file {data_path}, and what is expected by the model {model_name}. The model expects {str(clf.nfeatures)} features. The data contains {len(x_df.columns)} features.')
simba.utils.errors.FeatureNumberMismatchError: Mismatch in the number of features in input file D:/simbaAndDeepLabCut/123/lxy/models/generated_models/run.sav, and what is expected by the model run. The model expects 153 features. The data contains 155 features.

After reading your explanation, I roughly understand the reason for the error. Indeed, I chose a file in project_folder/csv/targets_inserted for validation. Thank you very much for your answer.

sronilsson commented 11 months ago

Got it! Yes, it's likely that those files contain two additional columns (your behavior annotations), which we don't want when running validation.
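The arithmetic behind the numbers in this thread can be illustrated with a small sketch. The column names below (`feature_0` ... `feature_152`, `behavior_a`, `behavior_b`) are made up for illustration, since the thread does not name the real columns, and `feature_columns` is a hypothetical helper rather than anything in SimBA.

```python
def feature_columns(header, annotation_columns):
    """Return the header with the annotation columns removed, order preserved."""
    drop = set(annotation_columns)
    return [name for name in header if name not in drop]


# Hypothetical header: 153 feature columns followed by two annotation columns.
header = [f"feature_{i}" for i in range(153)] + ["behavior_a", "behavior_b"]
features = feature_columns(header, ["behavior_a", "behavior_b"])

print(len(header), len(features))  # prints "155 153"
```

With the two annotation columns present the file has 155 columns, and without them it has the 153 the model expects.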