Open samcunliffe opened 2 years ago
This issue is not urgent for the MVP v0 milestone but it would be useful to have for future milestones.
Registering videos to a common coordinate framework, e.g. via a linear registration to a reference video frame, can greatly reduce the need for defining ROIs for each video (see issue #11).
Here is the sketch of a possible solution:
All of the above could take the form of an "import ROIs from template" button.
I talked today to some developers of the Aeon project, and they suggested an elegant solution to this problem: ArUco markers.
ArUco markers are like QR codes which can be easily detected with computer vision methods. Suppose we embed such a marker at a fixed location in a setup (ensuring it cannot be obstructed or removed by the animals). We could use the marker to automatically translate, rotate, and scale the videos to a standard orientation and zoom factor. The centre of the ArUco marker could be used as the origin of a coordinate system.
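A minimal sketch of the alignment described above. It assumes the four marker corners have already been detected in pixel coordinates, in the order top-left, top-right, bottom-right, bottom-left (the order OpenCV's ArUco detector reports). It also assumes the camera views the marker roughly head-on, so perspective distortion can be ignored. The function name `marker_to_transform` and the `side_length` parameter are made up for illustration and are not part of any existing package.

```python
import math


def marker_to_transform(corners, side_length=1.0):
    """Build a pixel -> marker-centred coordinate mapping from one marker.

    corners: four (x, y) pixel corners ordered top-left, top-right,
             bottom-right, bottom-left (assumed already detected).
    side_length: physical side length of the marker, in whatever unit the
                 output coordinates should use (hypothetical parameter).
    """
    xs = [c[0] for c in corners]
    ys = [c[1] for c in corners]
    # The marker centre becomes the origin of the new coordinate system.
    cx, cy = sum(xs) / 4.0, sum(ys) / 4.0

    # The top edge (top-left -> top-right) defines the x-axis direction
    # and the pixel-to-unit scale.
    ex = corners[1][0] - corners[0][0]
    ey = corners[1][1] - corners[0][1]
    scale = side_length / math.hypot(ex, ey)
    angle = math.atan2(ey, ex)
    # Rotate by -angle so the marker's top edge ends up along +x.
    cos_a, sin_a = math.cos(-angle), math.sin(-angle)

    def transform(point):
        dx, dy = point[0] - cx, point[1] - cy
        return (
            scale * (cos_a * dx - sin_a * dy),
            scale * (sin_a * dx + cos_a * dy),
        )

    return transform


# Example: a 20 px marker centred at (100, 50), physical side length 2 units.
upright = marker_to_transform([(90, 40), (110, 40), (110, 60), (90, 60)], 2.0)
# The same marker seen by a camera rotated by 90 degrees.
rotated = marker_to_transform([(110, 40), (110, 60), (90, 60), (90, 40)], 2.0)
```

In both example views the marker's top-right corner maps to the same marker-centred coordinates, which is the property that would let ROIs defined once be reused across videos.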
Example implementations:
This obviously cannot be applied to the already acquired data, but we could suggest it to the researchers for future experiments.
Any experience with such markers @sfmig ?
Yes actually, I used them in my thesis. We used them to transform the 3D reconstruction we collected with a depth-sensing phone to the coordinate system of the motion capture cameras. Would have to refresh it a bit but yeah, sounds like a good solution :)
I guess for now we can add that as an enhancement? On the webapp side, how would that work? We could have a 'coordinate system registration' tab that transforms the bodyparts' coordinates from DLC (defined in the pixel space of the image) to the coordinate system defined by the ArUco marker.
Another option could be to do the coordinate system registration during data collection, somewhat similar to what we were planning for the event tagging in the future. So with the camera in the position it will be in during the experiment, the researcher would scan the ArUco marker, and the transformation matrix would be computed and added to the video metadata. Then in the webapp we would apply that transform to the pose data. I just did a quick search, and Bonsai seems to have ArUco tools (see Fig 4D in the Bonsai paper, top of page 6/17 in the bioRxiv paper, and this Bonsai package).
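To make the "transform in the video metadata" idea concrete, here is a rough sketch of how a 3x3 matrix could be stored alongside a video and read back later. The JSON layout and the key names (`registration`, `pixel_to_marker`, `marker_id`) are assumptions for illustration only; nothing here reflects an existing metadata schema in Bonsai or in the webapp.

```python
import json


def attach_registration(metadata, matrix, marker_id):
    """Return a copy of the video metadata with a registration entry added.

    matrix: 3x3 homogeneous transform as nested lists (pixel -> marker frame).
    marker_id: ID of the ArUco marker the transform was computed from.
    """
    updated = dict(metadata)
    updated["registration"] = {
        "method": "aruco",
        "marker_id": marker_id,
        "pixel_to_marker": matrix,
    }
    return updated


def read_registration(metadata_json):
    """Extract the stored transform from a JSON metadata string."""
    return json.loads(metadata_json)["registration"]["pixel_to_marker"]


# Example: scale by 0.1 and shift so that pixel (100, 50) becomes the origin.
matrix = [[0.1, 0.0, -10.0], [0.0, 0.1, -5.0], [0.0, 0.0, 1.0]]
metadata = attach_registration({"video": "example.mp4"}, matrix, marker_id=0)
metadata_json = json.dumps(metadata)
restored = read_registration(metadata_json)
```

Storing the matrix rather than a transformed video keeps the acquisition step cheap and leaves the original pixel data untouched.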
I guess a further enhancement could be to add pose tracking with ArUco markers (so no DLC), as they do with the head of the mouse in the paper. It easily provides 3D pose but you do need to stick a marker on the animal (and quite a big one actually). Less important but maybe of interest.
As this solution was suggested to me by a Bonsai developer, I'm sure Bonsai will have the tools to deal with it during data acquisition.
But we have to discuss it with Sanna and T&T, since they are the ones who would have to implement it in the setups, and there might be limits to how much we can modify the Zoo enclosures.
Other than that, I agree with you that the simplest use-case is computing the transform from the video, and then applying it to the keypoint tracks (applying it to the whole video will result in unnecessary data duplication and processing).
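As a sketch of that use-case, the snippet below applies a 3x3 homogeneous transform directly to a keypoint track instead of warping the frames. The function name `transform_track` and the plain list-of-tuples track format are assumptions for illustration; real DLC output would first need to be read into this shape.

```python
def transform_track(matrix, track):
    """Apply a 3x3 homogeneous transform to a list of (x, y) keypoints.

    matrix: nested lists, mapping pixel coordinates to the common frame.
    track: sequence of (x, y) positions of one bodypart over time.
    """
    transformed = []
    for x, y in track:
        xh = matrix[0][0] * x + matrix[0][1] * y + matrix[0][2]
        yh = matrix[1][0] * x + matrix[1][1] * y + matrix[1][2]
        w = matrix[2][0] * x + matrix[2][1] * y + matrix[2][2]
        # Dividing by w keeps this valid for perspective transforms too;
        # for translation/rotation/scale only, w is always 1.
        transformed.append((xh / w, yh / w))
    return transformed


# Example: scale by 0.1 and move pixel (100, 50) to the origin.
pixel_to_common = [[0.1, 0.0, -10.0], [0.0, 0.1, -5.0], [0.0, 0.0, 1.0]]
snout_track = [(100.0, 50.0), (110.0, 40.0), (90.0, 60.0)]
registered = transform_track(pixel_to_common, snout_track)
```

The cost scales with the number of keypoints rather than the number of pixels, so no transformed copy of the video needs to be written.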
I am a big fan of the idea, it will remove the necessity for manual steps to define coordinates, which is always a big plus (provided it works reliably).
In my experience it works quite robustly yeah.
One drawback is that it does add an extra dependency (since you must use ArUco markers to define the coordinate system). What if you want to analyse old data for which you didn't place ArUco markers? For ultimate flexibility™️, we may still want to consider alternative ways of registering a common coordinate system. So I think a simpler solution, like the ones we discussed based on clicking, fisheye corrections, etc., is still worth implementing (in the prototype or in v1) and worth keeping as a fallback in the long-term package.
I agree, we anyhow already have many videos without markers, so we definitely still need the other solutions as well. Plus it's nice if other groups can use our tools for unmarked data. Ultimately, there should be a few registration options (manual, ArUco, etc.)
But given the scale of the Zoo project, ArUco markers are definitely worth considering.
Hi everyone, what creative ideas! I am a big fan of the ArUco markers. As we move 'up' the phylogenetic tree, we will be making less use of Bonsai with 'stable' camera setups and rely more on video cameras with tripods. These video cameras will likely have to be taken down and replaced every time a video is captured (as they are £££ and require protection from the elements & public). I think the QR codes would be great at mitigating my variability in camera positioning. I think the ethics and ability to place stickers in enclosures will be fine. The largest concern would be animals eating the stickers or weather deterioration. In that case, it may be worth checking if the markers can be made of acrylic or metal. However, I think we can cross that bridge if/when we come to it, as I expect paper stickers to work for most species. Let me know if there's anything else to consider.
They can definitely be made out of metal as well. The Aeon project uses metal (I think) plates with ArUco markers engraved on them (for their octagon setup), and they mentioned ordering these from a company. We can ask them when the time comes.
But for small animals, printing it on a sticker will do. Just need to ensure that it's perfectly straight (no bending) and absolutely fixed at a certain position in the setup. The patterns can be programmatically generated by cv2 and exported to image files ready for printing.
I think the ArUco solution is very neat, but I agree it is definitely v1 material.
We did consider implementing a simpler solution as part of v0, and I think @niksirbi started some work on this (I think it is on the frame-registration branch?). I couldn't find an issue for it, so I opened #55 for now and added it to v0.
That's right, I am working on it on that branch, and it's good that you opened a separate issue for it. Let's keep this one as a reminder for a more permanent solution (e.g. ArUco).
Possible considerations: the camera position may move. Heatmaps, for example, will need some common reference/alignment solution.