sgoldenlab / simba

SimBA (Simple Behavioral Analysis), a pipeline and GUI for developing supervised behavioral classifiers
https://simba-uw-tf-dev.readthedocs.io/
GNU General Public License v3.0

ROI analysis feature extraction #347

Open DanaeNikol opened 3 months ago

DanaeNikol commented 3 months ago

Hello hello!

More of a comprehension question: once the ROI analysis is completed, are the features (for example, the percentage of session time the mouse spends in each ROI) extracted from the CSV files that the ROI analysis produced?

I would like to use the mean of my earL and earR detections for a more reliable estimate (the nose detection does not look so good). For this, I ran the ROI analysis for both earL and earR for both animals (black and white) and created a file in which the body-part column is the concatenation of both body parts (EarLEarR) and the values are the mean of the earL and earR results. I saved this file in the logs folder. I have of course removed the original movement and time_data CSV files from the same folder.

However, when I want to extract the ROI features and append them by animal to the already extracted features, I need to select the body parts that I would like to work with. Hence my question above: will SimBA take into account my mean_csv file in the logs folder, or will it run some part of the ROI analysis again?

And if so, how could I manage to use the mean of the two body parts as I described?

Thanks so much!

sronilsson commented 3 months ago

Hi @DanaeNikol!

The body-parts expected in your project, which are the ones you see in the drop-down, are listed in project_folder/logs/measures/pose_configs/bp_names/project_bp_names.csv inside your SimBA project. You could open that file, add EarLEarR to the list, save the file, and open your project again. You should then see EarLEarR in the dropdown, but please let me know.

Note:

An alternative approach, which avoids running the ROI analysis several times, is to add a new body-part (e.g., "head") that lies halfway between the left and right ear to all the files inside your SimBA project using THIS function.

I can give you some example python code that would do it if you wanted to go this route.

DanaeNikol commented 3 months ago

Hi @sronilsson !

It does work! But does this mean that the CSV file in the logs folder will be used as input for the feature extraction? That is, the mean file I created?

Could you indeed also provide some example python code, if that would be easy for you?

DanaeNikol commented 3 months ago

However, when I try to run "Append ROI features", I run into this error: [screenshot of the error message]

I would guess, then, that the other alternative is actually the only way; otherwise I would have to manipulate the DeepLabCut output before importing it into SimBA, right?

sronilsson commented 3 months ago

Yes, about the "Append ROI features" error first:

When you click "Append ROI features", SimBA reads in all the files inside the project_folder/csv/outlier_corrected_movement_location directory and checks that each file has the correct number of columns. The "correct" number of columns is dictated by how many body-parts you track, which is stored in the project_folder/logs/measures/pose_configs/bp_names/project_bp_names.csv file. If the project_bp_names.csv file lists 18 body-parts, the "Append ROI features" function expects 18 x 3 = 54 columns (an x, y and p column for each body-part) in each file in the project_folder/csv/outlier_corrected_movement_location directory.

The error message above suggests that 3213_Scan11-converted-converted_2.csv fails this check - this file only has 48 columns. Perhaps it is related to adding the EarLEarR or other body-parts to the project_bp_names.csv without the relevant data existing in the 3213_Scan11-converted-converted_2.csv file?
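The check described above can be approximated outside the GUI to find which files would fail before clicking "Append ROI features". This is a minimal sketch, not SimBA's own code. It assumes that project_bp_names.csv holds one body-part name per line without a header, and that each data file starts with one index column that is not a body-part column (the `index_cols` parameter; set it to 0 if your files have no index column). The function names are hypothetical.

```python
import csv
from pathlib import Path


def expected_column_count(bp_names_path):
    """Return body-part count x 3 (one x, y and p column per body-part)."""
    with open(bp_names_path, newline="") as f:
        names = [row[0].strip() for row in csv.reader(f) if row and row[0].strip()]
    return len(names) * 3


def find_mismatched_files(data_dir, bp_names_path, index_cols=1):
    """Compare each CSV header in data_dir against the expected column count.

    Returns (expected, {file_name: columns_found}) for files that differ.
    Assumption: the first `index_cols` header fields are index columns.
    """
    expected = expected_column_count(bp_names_path)
    mismatched = {}
    for path in sorted(Path(data_dir).glob("*.csv")):
        with open(path, newline="") as f:
            header = next(csv.reader(f), [])
        found = len(header) - index_cols
        if found != expected:
            mismatched[path.name] = found
    return expected, mismatched
```

Calling `find_mismatched_files` with the outlier_corrected_movement_location directory and the project_bp_names.csv path returns the expected count plus the files that deviate from it, which makes it easy to see whether the body-part list or the data files are out of sync.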

DanaeNikol commented 3 months ago

Yes, this is absolutely true. There is no such body part in the DeepLabCut CSV input, and everything up to the ROI analysis was run without the EarLEarR body part. So the error is actually correct; I am just not sure whether there is a workaround (?)

sronilsson commented 3 months ago

There are always a few workarounds :) Just so I understand and we are on the same page - do you want to use EarLEarR when Appending ROI features, or have you finished the analysis with this body-part and now want to drop it?

DanaeNikol commented 3 months ago

So my initial body parts (also used in the DeepLabCut processing) are: Ear_left_1, Ear_right_1, Nose_1, Center_1, Lat_left_1, Lat_right_1, Tail_base_1, Tail_end_1, Ear_left_2, Ear_right_2, Nose_2, Center_2, Lat_left_2, Lat_right_2, Tail_base_2, Tail_end_2.

I have run the ROI analysis using Ear_left_1 & Ear_left_2, and Ear_right_1 & Ear_right_2. Because I want a more reliable measurement, I would like to extract the percentage of the session time that each mouse spends in each ROI with the mean of Ear_left and Ear_right for both animals. For this, I manipulated the csv outputs I got from Simba ROI analysis and calculated the mean, composing another csv file saved in the logs folder.

Now, I want to Append the ROI features (to get the percentage of session time in each ROI) using the mean of Ear_left and Ear_right I calculated.

So I guess, to answer your question, I want to use this body part only to Append ROI features (all this so that my analysis is comparable to one previously done by a colleague).

sronilsson commented 3 months ago

Thanks @DanaeNikol, got it - just one last question so I understand - when you say "the mean of Ear_left and Ear_right", you mean the mean of the left ear and right ear pose-estimated coordinates (so the top of the head)?

sronilsson commented 3 months ago

One more thing: you are only after the percent of time spent in each ROI, and not after creating features for later classifiers, correct?

DanaeNikol commented 3 months ago

I calculated the mean at the level of the movement/velocity/time values that the ROI analysis outputs. I have not manipulated anything at the level of the pose-estimated coordinates. That could also be an option, but it seemed trickier. The function you suggested would go in this direction, right? If so, it could be a very good option, if you could provide the example script.

DanaeNikol commented 3 months ago

For now, we are only after the percentage, but later maybe we want to move further than that.

sronilsson commented 3 months ago

Alright, how I would solve it would probably be to create a new body-part which is located half way between the left ear and right ear, and then work with that. I can send some code and instructions in a bit?

DanaeNikol commented 3 months ago

Sounds very good, thanks for your precious help!

sronilsson commented 3 months ago

First things first - let's see if we can create a bunch of files with an additional body-part, with that body-part being half way between the ears, using the attached script?

i) Open the attached compressed file and change the config path, the save directory, and the names of your body-parts, and save the file. You should only have to edit the lines above the “########################” mark. I have added comments to each line, please let me know if something doesn’t make sense.

ii) Activate your SimBA conda environment and navigate to the directory where add_body_part.py is stored, and type python add_body_part.py.

iii) New files, one for each file stored in your outlier_corrected_movement_location directory will be stored in your specified output directory. The files will have additional columns representing your new body-part.

add_body_part.py.zip
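For readers who cannot open the attachment, the idea can be sketched as follows. This is not the attached add_body_part.py, only an illustration under stated assumptions: each body-part is stored as three columns named `<body-part>_x`, `<body-part>_y` and `<body-part>_p`, the files have an index column in first position, and the new probability is taken as the lower of the two ear probabilities (a choice made here for illustration). The function names and the `Head_1` / `Head_2` names are hypothetical.

```python
from pathlib import Path

import pandas as pd


def add_midpoint_body_part(df, bp_a, bp_b, new_bp):
    """Return a copy of df with <new_bp>_x, _y, _p columns appended at the end."""
    out = df.copy()
    # The midpoint of two points is the per-axis mean of their coordinates.
    out[f"{new_bp}_x"] = (df[f"{bp_a}_x"] + df[f"{bp_b}_x"]) / 2
    out[f"{new_bp}_y"] = (df[f"{bp_a}_y"] + df[f"{bp_b}_y"]) / 2
    # Assumption: the midpoint is only as reliable as the weaker detection.
    out[f"{new_bp}_p"] = df[[f"{bp_a}_p", f"{bp_b}_p"]].min(axis=1)
    return out


def process_directory(in_dir, out_dir, pairs):
    """Write one augmented copy per CSV in in_dir; originals are left untouched.

    pairs: iterable of (bp_a, bp_b, new_bp), e.g.
    [("Ear_left_1", "Ear_right_1", "Head_1"),
     ("Ear_left_2", "Ear_right_2", "Head_2")]
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for path in sorted(Path(in_dir).glob("*.csv")):
        df = pd.read_csv(path, index_col=0)  # assumption: first column is an index
        for bp_a, bp_b, new_bp in pairs:
            df = add_midpoint_body_part(df, bp_a, bp_b, new_bp)
        df.to_csv(out_dir / path.name)
```

Writing to a separate output directory, as the instructions above also do, keeps the original outlier-corrected files available in case the new columns need to be regenerated.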

DanaeNikol commented 3 months ago

Hi @sronilsson!

Thanks so much for your help:)

Unfortunately the zip file is not found, there is an error. Could you reupload it?

sronilsson commented 3 months ago

Hmm, it doesn't want to play along for some reason.. can you try this gdrive link? https://drive.google.com/file/d/1RkqYIWMm4WXyIOKkLF5p8T8Up5-nc1Fo/view?usp=sharing

DanaeNikol commented 3 months ago

Everything works perfectly! Thank you so much @sronilsson !

sronilsson commented 3 months ago

Great @DanaeNikol ! Are you OK for the next steps? I was thinking:

i) Move your new files, containing the extra body-part data, into the project_folder/csv/outlier_corrected_movement_location directory of your SimBA project.

ii) Add the new body-part name to your SimBA project body-part list in the project_folder/logs/measures/pose_configs/bp_names/project_bp_names.csv file.

iii) Proceed to compute ROI features by selecting the new body-part name from the dropdown in the GUI?
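Steps i) and ii) above can also be scripted. The sketch below is a hedged illustration, not part of SimBA: it assumes project_bp_names.csv is a plain list with one body-part name per line, and that the new names should be appended at the end in the same order as the new columns appear in the data files. It saves a `.bak` copy of the body-part list first. The function name is hypothetical.

```python
import shutil
from pathlib import Path


def install_new_body_part(new_files_dir, project_folder, new_bp_names):
    """Copy augmented CSVs into the project and append new body-part names."""
    project = Path(project_folder)
    target_dir = project / "csv" / "outlier_corrected_movement_location"
    bp_file = (project / "logs" / "measures" / "pose_configs"
               / "bp_names" / "project_bp_names.csv")

    # Keep a copy of the original list so the change can be undone.
    shutil.copy2(bp_file, bp_file.with_name(bp_file.name + ".bak"))

    # Step i): copy the new files over the files of the same name.
    for src in sorted(Path(new_files_dir).glob("*.csv")):
        shutil.copy2(src, target_dir / src.name)

    # Step ii): append names that are not already listed.
    existing = [ln.strip() for ln in bp_file.read_text().splitlines() if ln.strip()]
    to_add = [name for name in new_bp_names if name not in existing]
    updated = existing + to_add
    bp_file.write_text("\n".join(updated) + "\n")
    return updated
```

Because the files in outlier_corrected_movement_location are overwritten, it may be worth copying that directory elsewhere before running something like this.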

DanaeNikol commented 3 months ago

All clear! I proceeded with everything and it all seems to be going well for now! Thanks a lot for the precious help!

One more short question: Are the ROI definitions saved somewhere? Is it the h5 file in the '/logs/measures/' folder? For example if I want to run another ROI analysis for different ROIs can I backup my currently defined ROIs somewhere to potentially use them again at some point for further analyses?

sronilsson commented 3 months ago

Yes - the ROI definitions are stored in a file at location project_folder/logs/measures/ROI_definitions.h5.
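Since the definitions live in that single file, backing up the current ROIs before drawing a different set can be as simple as copying it somewhere safe. A minimal sketch, with a hypothetical function name and a timestamped file name chosen here for illustration:

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_roi_definitions(project_folder, backup_dir):
    """Copy ROI_definitions.h5 to backup_dir under a timestamped name."""
    src = Path(project_folder) / "logs" / "measures" / "ROI_definitions.h5"
    if not src.is_file():
        raise FileNotFoundError(src)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    dst = backup_dir / f"ROI_definitions_{stamp}.h5"
    shutil.copy2(src, dst)
    return dst
```

To reuse an older set of ROIs, the backed-up file would be copied back to project_folder/logs/measures/ROI_definitions.h5 (after backing up whatever is currently there).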

Also, if you wanted to convert this h5 file to a CSV file and see what is in it, you can do that with this pop-up documented HERE.

Let me know if anything else comes up!

DanaeNikol commented 3 months ago

Thank you @sronilsson !!

One more question then (hehe), would it be possible to make some changes in the csv file of the ROI definitions and then convert it back to the h5?

sronilsson commented 3 months ago

Hi @DanaeNikol ! Yes you could, but again I don't have a graphical interface for it, so it would have to be done in code. We need to chunk up the CSV files into a new h5 file that has three dataframes inside of it, for circles, polygons, and rectangles. The dataframes can be empty if a specific shape-type doesn't exist, but at least an empty entry has to be there. I can give you the code for it if that helps?

DanaeNikol commented 3 months ago

If that would be easy to share, it would be of great help I think!

sronilsson commented 3 months ago

Np - it would be something like this, again you just have to edit the lines above the ##... to your paths.

It takes a bit of wrangling to get the data represented as strings in the CSV into the appropriate formats.

Let me know how it goes please!

roi_definition_csvs_to_h5.py.zip
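The "wrangling" mentioned above can be illustrated roughly as follows. This is not the attached roi_definition_csvs_to_h5.py. It only sketches the two ingredients described in the thread: turning values that the CSV stores as strings back into Python objects, and writing one dataframe per shape type into a single h5 file. The column names to parse and the h5 keys are deliberately left to the caller, since they are not stated in this thread; the keys of an existing ROI_definitions.h5 can be listed with `pd.HDFStore(path).keys()`. Writing requires the PyTables package, and `ast.literal_eval` only handles strings that are valid Python literals (tuples, lists, dicts, numbers), so other formats would need their own parsing.

```python
import ast

import pandas as pd


def parse_literal_columns(df, columns):
    """Turn stringified tuples/lists/dicts in the given columns into objects."""
    out = df.copy()
    for col in columns:
        out[col] = out[col].apply(
            lambda v: ast.literal_eval(v) if isinstance(v, str) else v
        )
    return out


def write_shape_tables(h5_path, tables):
    """Write one dataframe per key into h5_path.

    tables: dict mapping an h5 key (supplied by the caller) to a DataFrame.
    Pass an empty DataFrame for a shape type that has no ROIs, so that
    all three entries exist in the file.
    """
    for i, (key, frame) in enumerate(tables.items()):
        # The first write creates the file; later writes append new keys.
        frame.to_hdf(h5_path, key=key, mode="w" if i == 0 else "a")
```

A typical flow would read each edited CSV with `pd.read_csv`, pass it through `parse_literal_columns` for the columns that hold tuples or dictionaries, and then hand the three resulting dataframes to `write_shape_tables`.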

sronilsson commented 3 months ago

Also let me know if it downloads OK or if you need a gdrive link

DanaeNikol commented 3 months ago

It works! Thanks a lot @sronilsson!

DanaeNikol commented 3 months ago

And one other question: in the ROI analysis, is it possible to have ROIs that overlap with each other in the same ROI analysis round? For example, if I have many square ROIs covering my cage and then one big one for the whole cage, would this be a problem? Or should it be okay?

sronilsson commented 3 months ago

@DanaeNikol that's no problem, ROIs can overlap however you wish, and could contain wholly or partly shared regions
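To see why overlap is harmless for a "percentage of session time" measure, consider that each ROI can be tested independently for every frame, so a body-part inside two ROIs simply counts towards both. The sketch below illustrates this with axis-aligned rectangles. It is a toy example and not SimBA's implementation; the function name and the ROI coordinates are made up.

```python
def percent_time_in_rois(points, rois):
    """Percentage of frames in which a point falls inside each rectangle.

    points: list of (x, y), one per frame.
    rois: dict mapping name -> (x_min, y_min, x_max, y_max).
    """
    counts = {name: 0 for name in rois}
    for x, y in points:
        # Every ROI is checked on its own, so overlapping ROIs do not compete.
        for name, (x0, y0, x1, y1) in rois.items():
            if x0 <= x <= x1 and y0 <= y <= y1:
                counts[name] += 1
    n = len(points)
    return {name: (100.0 * c / n if n else 0.0) for name, c in counts.items()}


frames = [(1, 1), (6, 6), (9, 1), (20, 20)]
rois = {"cage": (0, 0, 10, 10), "corner": (0, 0, 5, 5)}
print(percent_time_in_rois(frames, rois))  # {'cage': 75.0, 'corner': 25.0}
```

Note that with overlapping ROIs the percentages no longer have to sum to 100, since the same frame can be counted in several ROIs.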