dattalab / keypoint-moseq

https://keypoint-moseq.readthedocs.io

Determining window size for grid videos #171

Open · ylvabremer opened 1 month ago

ylvabremer commented 1 month ago

Hello, and thank you so much for this well-documented, interesting, and useful tool!

I am having problems with the cropping of my videos when making grid videos. Either the videos turn completely dark (I suspect the crop is zooming in on some small part of the image), or the videos become tiny within each grid cell. When the grid movies turn dark, I get the message "Using window size of 336 pixels". When the videos come out tiny, I get the message "Videos will be downscaled by a factor of 0.19 so that the grid movies are under 1920 pixels. Use max_video_size to increase or decrease this size limit."

The videos I am using are stitched from two rotated 640 × 400 videos, giving a combined resolution of 800 × 640. I assume this stitching is causing the problem, since there was no issue when making grid videos from each separate video before stitching.

I have been experimenting with different window sizes, but the results do not follow any clear trend. So far the largest view I can get is with window_size = 1680. Increasing max_video_size only improves the resolution; it does not make the grid image any larger.
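(For reference, both of these are arguments to grid-movie generation; a hypothetical call with the values mentioned above might look like the sketch below, with results, coordinates, project_dir, and model_name coming from the standard workflow:)

kpms.generate_grid_movies(
    results, project_dir, model_name,
    coordinates=coordinates,
    window_size=1680,      # manually chosen crop window, in pixels
    max_video_size=1920,   # size limit for the assembled grid movie
    **config())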

Any comments would be greatly appreciated!

calebweinreb commented 1 month ago

Hi,

Hmm, that's odd. The videos could be all dark because of extreme zooming, as you mentioned, or because something weird has happened with the model's centroid estimates. The model's centroid estimates are stored in results. Separately, we provide a way to estimate centroids and headings directly from the keypoints (without the intervention of a model) via centroids, headings = kpms.get_centroids_headings(coordinates, **config()), so you can compare the two by plotting.
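In case it helps, a minimal setup sketch, assuming the standard workflow where project_dir, model_name, and the coordinates dict from data loading are already defined:

import keypoint_moseq as kpms

config = lambda: kpms.load_config(project_dir)        # project configuration
results = kpms.load_results(project_dir, model_name)  # model output, including centroid estimates
centroids, headings = kpms.get_centroids_headings(coordinates, **config())

The comparison plot would then look like, e.g.: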

import matplotlib.pyplot as plt

# x-coordinate of the centroid over time, from the model vs. from the keypoints
key = sorted(results.keys())[0]
plt.plot(results[key]["centroid"][:, 0], label="model centroid")
plt.plot(centroids[key][:, 0], label="keypoint-derived centroid")
plt.legend()
plt.show()

The two curves should match very closely. If they don't, then the "full" stage of modeling somehow went awry. It's possible that you still have reasonable syllables anyway. If so, you could generate grid movies by passing the centroids and headings directly:

kpms.generate_grid_movies(..., centroids=centroids, headings=headings)
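(In full, that call might look like the following sketch, with the standard required arguments assumed from the setup above:)

kpms.generate_grid_movies(
    results, project_dir, model_name,
    coordinates=coordinates,
    centroids=centroids,   # keypoint-derived centroids from above
    headings=headings,     # keypoint-derived headings from above
    **config())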

If the syllables don't look reasonable, then you may get better results by running just the ar_only phase of modeling for many more iterations (~500) and then skipping the full modeling stage, as in the sketch below.
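A sketch of that approach, following the standard fitting workflow from the docs (data, metadata, and pca assumed to be set up as in the tutorial):

# initialize and fit only the AR-HMM ("ar_only") stage, with many more iterations
model = kpms.init_model(data, pca=pca, **config())
model, model_name = kpms.fit_model(
    model, data, metadata, project_dir,
    ar_only=True, num_iters=500)

# extract results directly, skipping the full modeling stage
results = kpms.extract_results(model, metadata, project_dir, model_name)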

If it turns out that the centroids do match (i.e., this is not the issue), then I'm not sure what's going on. In that case, could you confirm that the videos still seem super zoomed (in or out) even when you manually set the window size?

ylvabremer commented 1 month ago

Thank you for the rapid answer!

This only rotates the images to the left, probably because I set fix_heading=True. The results[key]["centroid"] differs somewhat from the centroids estimated from the keypoints:

[attached plot comparing results[key]["centroid"] with the keypoint-derived centroids]

I am running ar_only now and will see if it has an impact.

More importantly, when I ran calibration just now (which I did not do for the initial run), I see that the keypoints do not overlap the video as expected. They are placed somewhere outside and below the video frame. This would explain why the grid videos were previously zoomed into the nothingness outside the video. I tried to investigate by making grid videos with keypoint overlay, but the keypoints are not visible, only the gray dot that shows when a syllable is active. The gray dot is also below the video frame, just as during calibration. So I guess my next question is whether there is a practical way of aligning the keypoints to the video? I will be looking into this myself as well.

Again, thanks a lot for this much-appreciated help!

calebweinreb commented 1 month ago

Hmm, the mystery deepens. First, based on the plot above, it seems like the full modeling is actually doing a good job of disregarding noise in the keypoint detections (which presumably causes the spikes in the keypoint-derived estimates).

As for calibration, I wouldn't necessarily trust it, since users have reported weird bugs on certain operating systems. A more reliable way to check whether the keypoints match the videos is this function: https://keypoint-moseq.readthedocs.io/en/latest/viz.html#keypoint_moseq.viz.overlay_keypoints_on_video
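For example, a minimal sketch (the video path and output path are placeholders, coordinates is the same dict used for modeling, and the exact arguments are described in the linked docs):

key = sorted(coordinates.keys())[0]      # one recording to spot-check
kpms.overlay_keypoints_on_video(
    "path/to/video.mp4",                 # hypothetical path to that recording's video
    coordinates[key],                    # keypoints for the same recording
    output_path="overlay_check.mp4")     # assumed output location

If the overlaid keypoints also land outside the frame here, the mismatch is in the loaded coordinates themselves rather than in the model.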