sgoldenlab / simba

SimBA (Simple Behavioral Analysis), a pipeline and GUI for developing supervised behavioral classifiers
https://simba-uw-tf-dev.readthedocs.io/
GNU General Public License v3.0

Importing Boris Data #102

Open neurowookie opened 3 years ago

neurowookie commented 3 years ago

Hello,

We are testing whether we can append pre-scored data from Boris within SimBA, but when we load it, the targets_inserted CSV shows only zeros for our behaviours. Although it would be simple enough to write a script ourselves, I was wondering whether we are doing something wrong. Our behaviours have the same names, and the video names match as well. I see the script skips over behavior category, subject, and comment, so I'm guessing those are not important. I look forward to hearing from you.

neurowookie commented 3 years ago

OK, I found the issue. I used an .avi file, while the script is set up to remove only .mp4 extensions from the base filename. Just changing the extension in my Boris CSV made it work. Maybe it would be better to use os.path.splitext(base)[0].

I also noticed that it currently skips the first 15 rows to get to the header row. If people annotated two videos at the same time (side and top, for example), this adds one more row before the header. I have told my colleagues who are using these files to remove one row for the import to work, but there should be a more elegant solution.
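To illustrate the extension point, here is a minimal sketch comparing a hardcoded `.mp4` removal with `os.path.splitext`. The paths are made up, `video_stem` is a hypothetical helper name, and the `replace` call is only an assumption about how a hardcoded removal might look, not a quote from the SimBA source.

```python
import os


def video_stem(media_path):
    """Return the file name without its directory or its final extension."""
    return os.path.splitext(os.path.basename(media_path))[0]


# Hypothetical media paths, as they might appear in a Boris export
for path in ["/data/videos/trial_01.mp4", "/data/videos/trial_01.avi"]:
    # Assumed hardcoded approach: only strips a literal ".mp4"
    hardcoded = os.path.basename(path).replace(".mp4", "")
    print(f"{path}: replace -> {hardcoded}, splitext -> {video_stem(path)}")
```

With the hardcoded removal, the .avi path keeps its extension and so would not match a video name stored without one, while `splitext` gives the same stem for both files.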

sgoldenlab commented 3 years ago

Thanks for letting me know @neurowookie - I admit I have not seen too many Boris-output formats, so it is rather hardcoded to the situations of the people I've talked to, but this gives me an opportunity to make it more flexible.

Would you mind sharing a single Boris output file?

How do you use the behavior category, subject, and comment fields in your Boris output? Are they important for distinguishing which behavior the animal is doing?

I will insert the mp4/avi fix.

sgoldenlab commented 3 years ago

Do you think this attached script would work? It assumes the data starts when the first column reads Time. Not sure if there are cases when that is not true.
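The idea of locating the header by its first cell rather than by a fixed row count can be sketched as follows. This is not the attached script: `find_header_row` is a hypothetical helper, the sample rows are invented, and it assumes no cell above the header contains a line break.

```python
import csv
import io


def find_header_row(lines, marker="Time"):
    """Return the 0-based index of the first row whose first cell equals `marker`."""
    for row_number, row in enumerate(csv.reader(lines)):
        if row and row[0].strip() == marker:
            return row_number
    raise ValueError(f"No row starting with {marker!r} was found")


# Invented stand-in for a Boris export scored on two videos at once;
# the extra "Player #2" line shifts the header down by one row.
sample = io.StringIO(
    "Observation id,demo\n"
    "Media file(s),\n"
    "Player #1,/videos/top.avi\n"
    "Player #2,/videos/side.avi\n"
    "Time,Media file path,Behavior,Status\n"
    "1.5,/videos/top.avi,Attack,START\n"
)
header_row = find_header_row(sample)
print(header_row)
```

The returned index could then be passed to `pandas.read_csv(file, skiprows=header_row)` so that the Time row becomes the column header, however many metadata rows precede it.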

append_boris.py.zip

neurowookie commented 3 years ago

Thanks for the quick response.

First, to answer your questions: we don't generally use behavior category or comment, so those aren't something to worry about. We do, however, sometimes use subject when we have two animals, to define which animal is doing what. For what we are currently trying to use SimBA for, though, this is unimportant, and it would probably be complicated to import in a useful way.

I had a go with your script and it didn't work. But I really appreciate all of your work, so I took some time and got it working. I'm not sure how pythonic my solution is, though. Essentially, once you import the whole CSV as a data frame, it sets Observation id and the following cell as the headers, so the skip doesn't work when you try to run it. So I just re-import after finding the index of the Time row. Also, for splitext to work properly you should already have the basename. Here is the entire section that is now working on my PC:

    # Assumes `import os` and `import pandas as pd` at the top of the script
    # First pass: the file's first line ('Observation id', ...) becomes the header
    currDf = pd.read_csv(file)
    # The row whose first cell reads 'Time' holds the real column names
    headerIdx = currDf[currDf['Observation id'] == 'Time'].index.values
    # Second pass: skip every line above that row (+1 for the first-pass header line)
    currDf = pd.read_csv(file, skiprows=range(0, int(headerIdx[0]) + 1))
    currDf = currDf.loc[:, ~currDf.columns.str.contains('^Unnamed')]
    currDf.drop(['Behavioral category', 'Comment', 'Subject'], axis=1, inplace=True)
    # Reduce each media path to the bare video name (no directory, no extension)
    for index, row in currDf.iterrows():
        currPath = row['Media file path']
        currBase = os.path.basename(currPath)
        currDf.at[index, 'Media file path'] = os.path.splitext(currBase)[0]
    combinedDf = pd.concat([combinedDf, currDf])

I have attached a Boris output with two videos (BorisOutput.xlsx). These are fake observations solely for testing.

sgoldenlab commented 3 years ago

@neurowookie No, your way is way more pythonic; mine was a crappy for loop lol :) I have updated the SimBA PyPI version to include your code, and also attached the script here. Let me know if it works.

append_boris.py.zip