Thank you for your in-depth documentation of your progress. Given that our workflow wasn't clear to you from the app alone, I will use this opportunity to increase the level of detail of the workflow.
Concerning your issue:
For clarity, the "Refine Dataset" step is an optional step that can increase your classifier's performance on new unlabeled data by giving you the ability to do "manual active learning". Depending on your classifier's performance, it might not be necessary.
The step requires uploading a pose estimation file and the corresponding video; the app then uses the latest active-learning classifier to predict your behavior classes and presents a number of low-confidence samples for you to label (the idea is sketched below).
The current way to select an uploaded video/pose pair is to pick the video name from the drop-down menu.
Because you unintentionally skipped doing that, the next step, "Create New Dataset", is not working.
Concerning project setup:
If the label files have the same names or order as the pose files (and you have many), you can use the folder import to sort them automatically in the selection window.
You are right that the GUI component we use for this can be tricky with long filenames. Unfortunately, it is a required step for importing data from various sources, given the current split between pose and label files.
I would not recommend altering the configuration file; this can have unforeseen side effects. If you really want to do it that way, please also delete the files generated by the corresponding step, or redo the step in the GUI if possible. Otherwise, you might end up with data that was not properly updated.
I appreciate your insight and documentation; if there is any additional uncertainty, please report it.
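To make the "low-confidence samples" idea above concrete, here is a minimal sketch with synthetic data. This is not A-SOiD's actual code; the random-forest classifier and the 0.5 confidence threshold are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Minimal sketch of low-confidence sample selection, not A-SOiD's actual code.
# Synthetic stand-ins for pose-derived features and behavior labels:
rng = np.random.default_rng(0)
X_train = rng.normal(size=(500, 10))       # features from the labeled sessions
y_train = rng.integers(0, 4, size=500)     # four behavior classes
X_new = rng.normal(size=(200, 10))         # features from the new session

clf = RandomForestClassifier(random_state=0).fit(X_train, y_train)
proba = clf.predict_proba(X_new)           # per-class probabilities
confidence = proba.max(axis=1)             # top-class probability per sample
low_conf = np.flatnonzero(confidence < 0.5)  # illustrative threshold
print(f"{low_conf.size} samples queued for manual refinement")
```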
Hi Jens, thanks for looking into this. I am still lost. For "Refine Behaviors", I did choose a (new) DLC-ed video and associated pose data, and have now done it again (.mp4 and pose data). Then I clicked 'Start frame extraction', and it extracted frames as before, putting about 8,000 frames into a '...\videos\' directory and a 'refine_params.sav' file into an '...\iteration-0\' directory (as before). Am I supposed to do something with these frames and the '.sav' file? If so, what? If I do nothing with them and continue on to 'Create New Dataset', it gives me the error I previously described.
Thanks again, kip
We extract the frames to create snippets of the full-length video that represent each bout needing refinement. This is the first step of the refinement process, and you should then be able to start refining in the same tab. Maybe the GUI is not clear enough? Could you send some screenshots with your next post to pinpoint where exactly you are right now? Thanks!
I am appending screenshots here for each step along the way (steps 1-4; step 5 is probably a step too far). I believe it is the step I label as "step 4" ('Refine Behaviors') where you are suggesting I should do more, but I am not sure what 'more' is. Thanks:
Okay, I think I know what the issue is. Thank you for being so thorough here.
The "refine behaviors" step needs you to refresh manually by pressing the 'R' key on your keyboard while A-SOiD is the active window (i.e., you haven't clicked anywhere else prior). This forces the app to reload and then give you the next substep (labelling videos). I will try to recreate this on my side and see if I can tweak the app to do this automatically.
Let me know if this solves your issue!
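For context, A-SOiD's interface runs on Streamlit, where 'R' is the built-in rerun shortcut. A minimal sketch of what an automatic in-app refresh could look like, assuming a recent Streamlit version (st.rerun was st.experimental_rerun in older releases):

```python
import streamlit as st

# Sketch of replacing the manual 'R' refresh with a button; assumes
# Streamlit >= 1.27 (st.rerun); older versions used st.experimental_rerun().
st.write("Frame extraction finished.")
if st.button("Continue to refinement"):
    st.rerun()  # same effect as pressing 'R' in the active Streamlit window
```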
Thanks, but little progress: pressing 'r' (but not 'R') does advance to an updated screen:
But then pressing 'predict labels and create example videos' leads to the error below:
I am on it. I appreciate your patience. For now, I'd say you can already increase the performance by increasing the number of iterations during automatic active learning and/or increasing the number of samples per iteration.
The manual refinement step is optional, and the predict and discover steps work once you have trained a model in the active learning step.
Hi Jens, this was helpful, and I will get to that in a moment. First, I increased the max iterations from 100 to 200 (but it stopped after about 115) and also increased the samples per iteration from 60 to 100 (no idea whether these are reasonable values):
Notice that 'groom' barely moved. There may be only a few grooms in the particular videos chosen. These annotations are really only my first shot at it, to see whether it's worth spending the time in BORIS to do this better. Annotation is a learned sport.
I continued on as before, but then skipped ahead (from refine and its error) to the 'predict' step (I didn't realize I could do that). This gave me the following:
So this is helpful, because it seems to show a problem with the pose CSV files. In the 'upload data' step, A-SOiD correctly found the 'animals to include' as mouse and cricket (I excluded cricket for now) and the keypoints ('nose', 'Lear', 'Rear', 'headbase', 'spine', 'tailbase'; I excluded 'anteriorC' and 'posteriorC' as belonging to the cricket). Here, in the 'predict' tab, it reads the CSV differently (or the CSV is wrong) and lists the keypoints as 'mouse' and 'cricket'. This could also explain the error in the earlier 'refinement' step, but I can only guess there. I am uploading a truncated version of the pose CSV so you can see the header and first few lines of data: 2022-07-27_16-40-55_mouse-1128_DLC_DEMO.csv
Maybe I need to change this CSV file? I needed to smooth, interpolate, and eliminate NaNs from the DLC output. I did this in MATLAB, and maybe that caused a problem, although it looks OK to me (and A-SOiD seemed to work fine with it in the initial steps). Thanks
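As a side note on the header discussion above: multi-animal DLC CSVs carry a four-row header (scorer, individuals, bodyparts, coords), and a reader that picks the wrong header row will report the animal names where the keypoints should be. A sketch of parsing the attached file with pandas, assuming it follows the standard multi-animal DLC layout:

```python
import pandas as pd

# Sketch of reading a multi-animal DLC CSV; assumes the standard four-row
# header (scorer / individuals / bodyparts / coords) of DLC output.
df = pd.read_csv("2022-07-27_16-40-55_mouse-1128_DLC_DEMO.csv",
                 header=[0, 1, 2, 3], index_col=0)

animals = df.columns.get_level_values("individuals").unique()
keypoints = df.columns.get_level_values("bodyparts").unique()
print("animals:  ", list(animals))    # expected: ['mouse', 'cricket']
print("keypoints:", list(keypoints))  # expected: ['nose', 'Lear', 'Rear', ...]
```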
The recent update should fix most of the encountered issues, and it significantly increases the level of in-app documentation.
Class is not increasing in performance:
One potential reason is that the class does not have enough examples. An indication of this is the equally bad performance of the one-shot classifier trained on all training data at once. However, your active learning parameters suggest that this class contains roughly the same number of samples as the others (a quick way to check this is sketched below).
If you have more labeled examples of groom available, your best option will be to add them to the initial data.
You could also use the refinement step to add examples of groom more efficiently. Here you would have to identify bouts that are misclassified and correctly assign them to groom. This might take some time, though, because your classifier seems to miss groom entirely so far.
Alternatively, you could remove the class in a new project and include "other" this time, which will result in a classifier that collectively catches grooming and other unlabeled behaviors.
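If you want to check the sample counts for your own labels, a quick tally over the BORIS "Behaviors Binary Table" export (one 0/1 column per behavior) might look like the sketch below; "labels.csv" is a placeholder name, and the exact export layout can vary with BORIS version and settings.

```python
import pandas as pd

# Sketch: count labeled frames per behavior in a BORIS Behaviors Binary Table
# export (one 0/1 column per behavior). "labels.csv" is a placeholder filename.
labels = pd.read_csv("labels.csv")
behavior_cols = [c for c in labels.columns
                 if set(labels[c].dropna().unique()) <= {0, 1}]
counts = labels[behavior_cols].sum().sort_values()
print(counts)  # a 'groom' total far below the other classes would explain
               # why that class barely improves
```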
Copy the equivalent DLC CSV files to c:/users/.../A-SoiD/DLCposeData. These files must have NO NaNs, so they will likely have to be massaged versions of the DLC output. Note that the first four rows of the DLC output files (for multi-animal models) are string-formatted, whereas the data thereafter are doubles; A-SOiD uses rows 2-4 to get the keypoints and animal IDs. Once you have massaged these copied data to remove the NaNs, again give them filenames that will alphabetize easily alongside the BORIS files.
AFAIK, DLC outputs low confidence values and does not clean or remove data in its raw output. Maybe I missed a functionality; can you elaborate on this? We are already doing likelihood filtering (DLC) and NaN interpolation (SLEAP), so if this is common, I am happy to extend it to DLC data.
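To make that concrete, the combination described above (mask low-likelihood coordinates, then interpolate the gaps) can be sketched for a DLC CSV as follows, assuming the standard multi-animal DLC header; the 0.9 cutoff and the filenames are illustrative assumptions, not A-SOiD's actual values.

```python
import numpy as np
import pandas as pd

# Sketch of likelihood filtering followed by NaN interpolation for a DLC CSV.
# The 0.9 cutoff and filenames are illustrative, not A-SOiD's actual values.
df = pd.read_csv("pose.csv", header=[0, 1, 2, 3], index_col=0)

likelihood = df.xs("likelihood", level="coords", axis=1)
for col in df.columns:
    if col[-1] in ("x", "y"):
        # mask coordinates whose keypoint likelihood falls below the cutoff
        df.loc[likelihood[col[:-1]] < 0.9, col] = np.nan

df = df.interpolate(limit_direction="both")  # fill the masked gaps linearly
df.to_csv("pose_filtered.csv")
```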
Using Windows, Firefox, ... this worked with CalMS21, as far as that goes (not very far). Now I am trying it with our own dataset. Since I really have no clue whether anything here is correct, I include my notes from the start:
Using our own data from BORIS and DLC:
Make four directories within c:/users/.../A-SoiD:
…/BorisData
…/DLCposeData
…/DLCvideos
…/output
USING BORIS: after annotating an observation (video):
Observations -> Export events -> as Behaviors Binary Table
Select all
Select all
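Since the import relies on the label and pose files sorting into matching pairs (per the note above about alphabetizing filenames), a quick sanity check of the two folders could look like this; the directory names follow the notes, and the check itself is just an illustrative heuristic.

```python
from pathlib import Path

# Sketch: verify that BORIS label files and DLC pose files alphabetize into
# matching pairs before using the folder import. Directory names follow the
# notes above; adjust paths as needed.
boris = sorted(Path("BorisData").glob("*.csv"))
poses = sorted(Path("DLCposeData").glob("*.csv"))

assert len(boris) == len(poses), "unequal numbers of label and pose files"
for b, p in zip(boris, poses):
    print(f"{b.name:55s} <-> {p.name}")  # eyeball that each pair matches
```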