Toshea111 opened 1 year ago
Hi @Toshea111! Makes sense. Just a couple questions.
How do you see the anchored ROIs being defined? Is it enough to specify a diameter/width for a circle/rectangle, or do you ever see yourself using polygon anchored ROIs (for which the size is a little trickier to define in a single entry box)?
As you say, for each anchored ROI and frame, we can find which other anchored ROIs and which other animal key-points overlap with it. Is that all the info you need? Those outputs will be in string format, not numerical, and won't immediately fit into any downstream ML algorithm. For ML, I guess they would have to be transformed into counts or some sparse table with categoricals.
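For illustration, one way that transformation could look, using made-up frame/entity names (not SimBA's actual output columns):

```python
import pandas as pd

# Hypothetical per-frame overlap output: for each frame, the names of
# the ROIs/body-parts found inside one animal's anchored ROI (strings).
overlaps = pd.DataFrame({
    "frame": [0, 0, 1, 2, 2, 2],
    "entity": ["animal2_head", "animal2_roi", "animal2_head",
               "animal2_head", "animal3_roi", "animal2_roi"],
})

# Cross-tabulating the categoricals per frame yields a sparse count
# table that can feed a downstream ML pipeline directly.
counts = pd.crosstab(overlaps["frame"], overlaps["entity"])
print(counts)
```

Each row is then a frame and each column a count of how often that entity overlapped the ROI in that frame.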
Hello Simon,
Thank you for the rapid response, to answer your questions:
I am happy to discuss in more detail if any of the above points need further clarification.
Got it - thanks @Toshea111. To find the boundary boxes, the most straightforward approach might be to use four key-points as input (anterior/posterior and the laterals), or two key-points (anterior/posterior) plus some user-defined extra metric space, to get the boxes. Once done, we could do a few revisions after your feedback once you've tried it; I can see how this could be useful for others.
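As a rough sketch of the four-key-point idea (hypothetical coordinates and padding value, just to show the geometry):

```python
import numpy as np

# Hypothetical four key-points for one animal: anterior, posterior,
# and the two laterals, as (x, y) pixel coordinates.
keypoints = np.array([[50.0, 20.0],   # anterior
                      [50.0, 80.0],   # posterior
                      [40.0, 50.0],   # left lateral
                      [60.0, 50.0]])  # right lateral

pad = 5.0  # user-defined extra metric space around the body

# The anchored box is the axis-aligned extent of the key-points,
# expanded by the user-defined padding.
x_min, y_min = keypoints.min(axis=0) - pad
x_max, y_max = keypoints.max(axis=0) + pad
box = (x_min, y_min, x_max, y_max)
print(box)  # (35.0, 15.0, 65.0, 85.0)
```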
Alternatives that could work for you are to extract the black blobs from the white background (e.g., cv2.findContours), find the animal boundaries through motion (e.g., cv2.calcOpticalFlowPyrLK), or find the animals with object segmentation (YOLOv5 is nice). I'm not sure these would generalize to other setups, but they are a few things to try in case this doesn't work out.
Much appreciated, either of those initial setups would work well for what I am after, and I would be happy to 'beta test' any additions and provide feedback. I have some limited experience with YOLOv5s; however, this system would allow me to feed my data into an established pipeline.
Let me know if you have any additional questions in the meantime.
Sounds good. I typed something up based on shapely polygons with user-defined buffers that's relatively quick and seems to work. I'm not sure if I am missing something for non-shape-shifters like your species, though, where all the body-parts lie along a parallel line. Any chance you could share a raw video and some pose-estimation data? Just a snippet, I don't need a lot.
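The core idea, sketched with hypothetical key-points and buffer distance (not the actual SimBA code):

```python
from shapely.geometry import LineString

# Hypothetical pose key-points for one elongate animal, anterior to
# posterior, forming roughly a line as in termites.
keypoints = [(10.0, 10.0), (20.0, 12.0), (30.0, 14.0), (40.0, 15.0)]

# Buffering the key-point "spine" by a user-defined distance turns the
# degenerate line into a proper 2-D anchored ROI polygon, even when the
# body-parts are collinear.
spine = LineString(keypoints)
roi = spine.buffer(5.0)  # buffer distance in pixels

# Overlap between two animals then reduces to roi_a.intersects(roi_b).
print(roi.is_valid, roi.area > 0)
```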
Excellent, attached is a short video and corresponding pose estimation data in '.HDF5' format. I can also provide the data in '.csv' or '.slp' formats, if preferred.
If you need a longer video, I have another with ~8000 frames.
That works, cheers. Draft classes for finding, visualizing, and calculating stats for "anchored" ROIs live here. I still need to clean them up, get them into the GUI, document them, and finalize the method for aggregate stats; maybe some time next week. Speed depends mainly on CPU count and number of animals, but hopefully it's doable.
I look forward to trying it, I will potentially have access to a modelling computer next week, thus I should be able to test it out on both that and my normal machine.
@Toshea111 - do you have a csv version of the termites test.h5 as well to share?
Yes, apologies for the delay, see attached.
Perfect, I really appreciate the rapid turnaround with this.
The user-defined settings will be quite important for termites in particular, as they engage in distinct types of trophallaxis at both ends, so to speak.
I also work with ants and bees, thus I should be able to provide some feedback in terms of performance across species.
@Toshea111 - I've updated the pip package of SimBA to include the class calls in the GUI; you should see it if you do pip install simba-uw-tf-dev --upgrade. I typed up a first-pass doc tutorial here: https://github.com/sgoldenlab/simba/blob/master/docs/anchored_rois.md
I'm not sure if you have got your slp data imported to SimBA, but that needs to happen for this to work. I am also not sure if I have overlooked anything that prevents this from scaling to large datasets, but I will stress test. If you see anything useful/necessary missing in the docs, let me know and we can see how to get it in.
One thing missing is probably visualizations validating the output statistics (e.g., showing intersecting bounding boxes / key-points in some alternative salient colors), like when another animal's ROI or body-part is inside an animal's ROI, that region's outline gets thicker and changes line color or something.
Very impressive, I will conduct a comprehensive trial of the system over the weekend, and let you know if I have any suggestions or feedback.
I can fork and edit the tutorial document as I go, or provide feedback separately here, whichever would be more convenient for you.
Thanks! Not much work; the people developing shapely have written most of what is needed. Whatever works best for you, I don't mind!
Having had a decent run through the new anchored ROI features, they work very well for my use-case. I thought it would make sense to divide my feedback into separate lists for issues and errors that I encountered, and suggestions for additions.
Issues:
Suggestions:
I will go through the tutorial document and make any suggestions or edits that I think will help to clarify things for users. Other than that, I really appreciate you putting this together, as it is already a tractable and robust system for generating interaction data.
Fantastic thank you!!!
All of these are quick fixes, except the SLEAP import, which is a little more involved. A very brief background:
When I first worked with SLEAP I only found one inference/output file per video: a .slp extension file (which is an h5 object). I thought it was a little odd, as this wasn't a user-friendly file. It contained multiple dataframes and dictionaries, with the track identities and body-part coordinates in different tables; missing tracks were not easily observed, and I had to jump through hoops to get it into an interpretable format.
I have long suspected that there must be an alternative output. Then you sent me the multi-index CSV, which is more in line with what I would expect... Can you send me an example of the main SLEAP export format of .h5 together with an associated video, and I will write a function to import it? This week is packed, but I will get this done ASAP.
That makes sense, I generated the ‘.csv’ file from an ‘.h5’ output using the code provided in this Google Colab: SLEAP-IO - Convert to CSV.ipynb - Colaboratory (google.com).
For reference, an outline of the SLEAP ‘.h5’ output format can be found here: Export Data For Analysis — SLEAP (v1.2.9).
Attached is an ‘.h5’ output file and the associated video, I have used the same one that I sent previously, for familiarity. Let me know if you experience any issues, or have further questions.
@Toshea111 FYI, if you upgrade through pip, I inserted an option to get detailed data on each interaction bout (DETAILED INTERACTION TABLE), example here, fixed the int/str mix-up, replaced the None with ROI_ONLY, and you can specify the color/size of each animal's ROI in the visualization. The CSV/H5 import will come!
Excellent, I'll have a go with the new features later this week, and let you know if any issues arise.
Another potential issue that I have encountered is when using tracking data in which the number of tracks varies over time, due to individuals leaving and re-entering the frame. As the config file requires the total number of individuals to be defined, it then returns an error when the number of tracks in frame deviates from this value.
It would be useful to have an option that allowed for such variation in visible tracks, as there are applications where knowing the frequency of interactions or behaviour is useful, even when individual identity cannot be assigned. There may already be a way to accommodate this, but I have not yet found a solution.
Thanks @Toshea111, is it throwing the errors at import, or during the anchored ROI methods?
One question: I tried to use the Colab notebook and your h5, but I was hitting a lot of errors and eventually had to put it aside. Could you help me by sending the entire CSV for the video you sent? I will work with that CSV to write the SLEAP CSV import methods.
I tried using another video with variable track numbers and it worked, so I think the previous issue was my own error rather than anything else. The only suggestion I would make is to remove the tracks for any individuals that are not in frame, because currently they appear as '0,0' coordinates.
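For what it's worth, one way to mask those out-of-frame placeholders (hypothetical column names, not SimBA's actual schema):

```python
import numpy as np
import pandas as pd

# Hypothetical flat tracking table: one x/y pair per body-part, with
# out-of-frame individuals emitted as (0, 0) by the tracker.
df = pd.DataFrame({
    "nose_x": [12.5, 0.0, 14.1],
    "nose_y": [30.2, 0.0, 29.8],
})

# Rows where both coordinates are exactly zero mean the track was
# absent; convert them to NaN so downstream statistics treat them as
# missing rather than as a real position at the origin.
absent = (df["nose_x"] == 0) & (df["nose_y"] == 0)
df.loc[absent, ["nose_x", "nose_y"]] = np.nan
```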
Looking back through the Colab notebook, I can see the problem. I have now made an updated branch with edits that should make it straightforward to use:
In case you continue to experience issues, I have also attached a folder with the original ‘.h5’ output file, converted '.csv' file, and the associated video. Let me know if you need any additional information during the process.
@Toshea111 there is an option to import csv files from SLEAP now in the dropdowns. The caveat is that I haven't had time to challenge it much; it worked on the single file you sent, but that's all I know for now lol. I've got to test it a bit more, maybe next week.
Much appreciated, I'll try inputting some new data to see if I can break it.
I have had a go with the SLEAP '.csv' format option, and I am running into the same error each time, specifically when I try to import the .csv file.
The error message returned is: 'ValueError: Length mismatch: Expected axis has 12 elements, new values have 24 elements'.
A variation of this occurs for different files with different numbers of tracks, although the example above is for a single track. I assume it means that the .csv file I am uploading contains more information or tracks than expected?
One thing to note is that the track is not continuously in frame for the whole video, perhaps that is an issue in itself?
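That ValueError is the generic pandas error raised when a frame is assigned more column labels than it has columns, which fits the "more tracks than expected" reading. A minimal reproduction (not the SimBA import code itself):

```python
import pandas as pd

# A 12-column frame, like the coordinates of a single track, being
# relabelled with 24 names as if two tracks were expected.
df = pd.DataFrame([[0] * 12])
try:
    df.columns = [f"c{i}" for i in range(24)]
    msg = ""
except ValueError as exc:
    msg = str(exc)
print(msg)  # Length mismatch: Expected axis has 12 elements, ...
```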
Thanks for testing @Toshea111! I tried to replicate with the file I have (deleting all or some tracks for subsets of frames) but couldn't, would you mind sharing a CSV like the one with a single track that is causing issues?
Also, I noticed that in the notebook you shared a while back, the output was transposed relative to the CSV I have been troubleshooting with (attached below). Is the SimBA input data still expected to look like the attached file, or is each individual animal body-part represented as three columns without a track index?
No problem, attached is a .csv file with a single track like the one that was causing issues, I can also provide a version with all the tracks, if needed. Note that the track is not present in all frames, as the hornet disappears from view several times.
The format I have been using is the same as that of 'termites_1.csv', do you have a link to the version of the notebook that you mentioned? The one that I am currently using does not appear to transpose the data as you describe, and the attached output is directly from this notebook.
Thanks @Toshea111, I'll test with this file and let you know - THIS is the notebook I saw. In this screengrab it looks like there is a single row index but multi-index headers, while termites_1.csv has the inverse, with multi-index rows and a single header.
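The two layouts are literally transposes of each other, which pandas can move between directly. A sketch with made-up track/coordinate names:

```python
import io
import pandas as pd

# Hypothetical snippet in the "multi-index header" layout: two header
# rows (track, coordinate) over a single frame index.
wide_csv = io.StringIO(
    "track,track_0,track_0,track_1,track_1\n"
    "coord,x,y,x,y\n"
    "0,10.0,20.0,30.0,40.0\n"
    "1,11.0,21.0,31.0,41.0\n"
)
wide = pd.read_csv(wide_csv, header=[0, 1], index_col=0)

# Transposing swaps the layouts: multi-index rows, single column header.
long = wide.T
print(wide.shape, long.shape)
```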
That's strange, as you say it looks to be transposed, which is not the output type that I have been working with.
To stay on the safe side, here is a link to the updated version of the notebook that I am currently using:
https://github.com/Toshea111/sleap/blob/develop/docs/notebooks/Convert_HDF5_to_CSV_updated.ipynb
Let me know if any further issues arise.
Hi @Toshea111 - sorry for the super late reply. I fixed the method so it works with a single animal - in those cases the data doesn't have to be transposed, and I hadn't thought of that. When you've got a chance, could you try it on your end again? If it doesn't work, could you please send me the file it fails on?
Hi again @Toshea111, I have a question. My sleap->df function is slow, so I was looking to take some pointers from your code instead. However, your h5 files have different keys than mine; this is what I see in my SLEAP test files:
frames
instances
metadata
points
pred_points
suggestions_json
tracks_json
videos_json
Do you know if the keys have changed names, or if the format has been completely revamped since I created my tracking files?
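A quick way to compare the two file layouts is to list the top-level keys with h5py. A sketch on a tiny stand-in file, with dataset names mimicking the newer SLEAP analysis export (hypothetical minimal shapes, not a real export):

```python
import h5py
import numpy as np

# Build a tiny stand-in .h5 file to show the inspection pattern.
with h5py.File("example_analysis.h5", "w") as f:
    f.create_dataset("node_names", data=np.array([b"head", b"thorax"]))
    f.create_dataset("tracks", data=np.zeros((1, 2, 2, 10)))

# Listing the top-level keys is enough to tell the formats apart.
with h5py.File("example_analysis.h5", "r") as f:
    keys = sorted(f.keys())
print(keys)
```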
No problem, I'll have a try and let you know if it works.
Regarding your '.h5' files, could you send me an example?
I want to try having a detailed look in HDFView, to compare the two.
Thanks @Toshea111! These are the files I have; I can't see any node_names etc. But I created these soon after SLEAP was released, so chances are they have changed the structure since.
Testing_Video_2.slp.zip
It looks like those are in the old '.slp' format rather than the new '.h5' export format, which, as you say, is likely a result of using an early version.
To answer your question, I think the system has been revised rather than just renamed, as the data structures are not exactly the same between the two file types.
The website details the process for exporting '.h5' files here: https://sleap.ai/tutorials/analysis.html.
Thanks, super helpful!
Is your feature request related to a problem? Please describe.
I am interested in quantifying interactions between several tracked individuals that occur during close contact, with potentially many such interactions occurring at once. This makes it difficult for the behavioural classifier to accurately detect single interactions.

Describe the solution you'd like
By including ROIs that move with an individual or body part (and thus are 'anchored' to them), such interactions could be quantified reliably, either through detection of ROI overlaps or through detection of other individuals' body parts.

Describe alternatives you've considered
I have tried training a classifier, but the issue is that interactions are variable at any one time.

Additional context
I am working with groups of termites tracked in SLEAP; attached is an example video of the tracked data, for reference.
https://user-images.githubusercontent.com/109351104/200376535-4e47a3f7-626e-43df-bd99-081c327ead94.mp4