dattalab / keypoint-moseq

downsampling causes errors in calibration #153

Open mshallow opened 3 weeks ago

mshallow commented 3 weeks ago

I have been trying to use the downsampling built into your codebase. The downsampling itself seems to succeed (the initial lines for temporal downsampling don't error), but when I run the calibration step, it loads ~30-50% of the frames needed and then errors. The error message is included below. I am trying to downsample from 200 fps to 40 fps, so I set the downsampling factor to 5. I also tried the example downsampling factor of 2 and ran into the same error. It seems to be indexing something incorrectly after the downsampling occurs.

Code:

# load data (e.g. from DeepLabCut)
keypoint_data_path = '/Users/mollyshallow/Desktop/new_demo_project/data' # can be a file, a directory, or a list of files
coordinates, confidences, bodyparts = kpms.load_keypoints(keypoint_data_path, 'deeplabcut', exclude_individuals='single')

# downsample data
downsample_rate = 5 # keep every 5th frame
coordinates = kpms.downsample_timepoints(coordinates, downsample_rate)
confidences = kpms.downsample_timepoints(confidences, downsample_rate)

# format data for modeling
data, metadata = kpms.format_data(coordinates, confidences, **config())

kpms.noise_calibration(project_dir, coordinates, confidences, **config(), downsample_rate=downsample_rate)

Error Message:

Loading sample frames: 49%|█████▊ | 40/82 [00:02<00:02, 16.49it/s]

ValueError Traceback (most recent call last)
Cell In[11], line 1
----> 1 kpms.noise_calibration(project_dir, coordinates, confidences, **config(), downsample_rate=downsample_rate)

File /opt/anaconda3/envs/keypoint_moseq/lib/python3.9/site-packages/keypoint_moseq/calibration.py:528, in noise_calibration(project_dir, coordinates, confidences, bodyparts, use_bodyparts, video_dir, video_extension, conf_pseudocount, downsample_rate, **kwargs)
    525     annotations = load_annotations(project_dir)
    526     sample_keys.extend(annotations.keys())
--> 528 sample_images = load_sampled_frames(
    529     sample_keys, video_dir, video_extension, downsample_rate
    530 )
    532 return _noise_calibration_widget(
    533     project_dir,
    534     coordinates,
   (...)
    540     **kwargs,
    541 )

File /opt/anaconda3/envs/keypoint_moseq/lib/python3.9/site-packages/keypoint_moseq/calibration.py:110, in load_sampled_frames(sample_keys, video_dir, video_extension, downsample_rate)
    102 readers = {key: OpenCVReader(video) for key, video in zip(keys, videos)}
    103 pbar = tqdm.tqdm(
    104     sample_keys,
    105     desc="Loading sample frames",
   (...)
    108     ncols=72,
    109 )
--> 110 return {
    111     (key, frame, bodypart): readers[key][frame * downsample_rate]
    112     for key, frame, bodypart in pbar
    113 }

File /opt/anaconda3/envs/keypoint_moseq/lib/python3.9/site-packages/keypoint_moseq/calibration.py:111, in <dictcomp>(.0)
    102 readers = {key: OpenCVReader(video) for key, video in zip(keys, videos)}
    103 pbar = tqdm.tqdm(
    104     sample_keys,
    105     desc="Loading sample frames",
   (...)
    108     ncols=72,
    109 )
    110 return {
--> 111     (key, frame, bodypart): readers[key][frame * downsample_rate]
    112     for key, frame, bodypart in pbar
    113 }

File /opt/anaconda3/envs/keypoint_moseq/lib/python3.9/site-packages/vidio/read.py:70, in BaseReader.__getitem__(self, *args, **kwargs)
     62 def __getitem__(self, *args, **kwargs) -> Union[np.ndarray, list]:
     63     """Wrapper around read
     64
     65     Args:
   (...)
     68         frame = reader[10]
     69     """
---> 70     return self.read(*args, **kwargs)

File /opt/anaconda3/envs/keypoint_moseq/lib/python3.9/site-packages/vidio/read.py:107, in OpenCVReader.read(self, framenum)
    102 """Read the frame indicated in framenum from disk
    103
    104 Uses sequential reads where possible if using OpenCV to read
    105 """
    106 # does checks. if framenum is a slice, calls read recursively. In that case, just return
--> 107 output = super().read(framenum)
    108 if output is not None:
    109     return output

File /opt/anaconda3/envs/keypoint_moseq/lib/python3.9/site-packages/vidio/read.py:39, in BaseReader.read(self, framenum)
     37     return [self.read(i) for i in self.slice_to_list(framenum)]
     38 if framenum < 0 or framenum > self.nframes:
---> 39     raise ValueError('frame number requested outside video bounds: {}'.format(framenum))

ValueError: frame number requested outside video bounds: 270590
Loading sample frames: 49%|█████▊ | 40/82 [00:19<00:02, 16.49it/s]

calebweinreb commented 3 weeks ago

Hmm weird. To diagnose, could you pick a recording and then tell me the shape of the corresponding array in coordinates before and after downsampling, and also the number of frames in the corresponding video?

mshallow commented 3 weeks ago

Yea I can do that! Just using the first video in my list of videos as a test, here are the results of testing that out.

full_array = coordinates['Sky_mouse-0897_2022-05-06T08_16_08DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_filtered_mouse']
np.shape(full_array)
(7687, 6, 2)

downsample_rate = 5 # keep every 5th frame
coordinates_down = kpms.downsample_timepoints(coordinates, downsample_rate)
confidences_down = kpms.downsample_timepoints(confidences, downsample_rate)
downsamp_array = coordinates_down['Sky_mouse-0897_2022-05-06T08_16_08DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_filtered_mouse']
np.shape(downsamp_array)
(308, 6, 2)

The length of the original coordinates array should be the full length of frames of the video. The video is ~38s long at 200fps which is around 7600 frames.

calebweinreb commented 3 weeks ago

can you check this systematically?

from vidio.read import OpenCVReader

keys = sorted(coordinates.keys())
videos = kpms.find_matching_videos(keys, video_dir)
for key,video in zip(keys,videos):
    print(len(coordinates[key]), len(OpenCVReader(video)), key)

mshallow commented 3 weeks ago

Using that method, I get these values:

1538 7687 Sky_mouse-0897_2022-05-06T08_16_08DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_filtered_mouse

The first seems to be the downsampled frame count and the second the full length, which doesn't really match the output that I got using the other method, so I'm not quite sure what is going on there.

calebweinreb commented 3 weeks ago

"which doesn't really match the output that I got using the other method".. can you elaborate on that?

also can you do this for all your data? Given the frame number in the error "270590" it seems like the short video you've been testing isn't the one that caused the error.

mshallow commented 3 weeks ago

Here is the output from all the videos:

731 3653 Sky_mouse-0893_2022-07-11T08_36_51DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_mouse
1216 6077 Sky_mouse-0893_2022-07-12T08_47_30DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_mouse
1404 7017 Sky_mouse-0893_2022-07-19T09_45_45DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_mouse
1844 9220 Sky_mouse-0893_2022-07-21T11_59_16DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_mouse
1368 6838 Sky_mouse-0893_2022-07-31T11_31_56DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_mouse
1198 5988 Sky_mouse-0893_2022-08-22T10_55_14DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_mouse
1080 5397 Sky_mouse-0893_2022-08-22T10_56_56DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_mouse
3350 16746 Sky_mouse-0895_2022-07-12T10_42_42DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_mouse
2272 11360 Sky_mouse-0895_2022-07-26T10_45_21DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_mouse
1972 9859 Sky_mouse-0895_2022-07-29T10_27_36DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_mouse
3626 18126 Sky_mouse-0895_2022-07-31T10_26_26DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_mouse
2195 10974 Sky_mouse-0895_2022-08-15T11_55_18DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_mouse
1820 9098 Sky_mouse-0895_2022-08-22T10_16_09DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_mouse
1359 6794 Sky_mouse-0895_2022-08-23T10_49_34DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_mouse
12870 64347 Sky_mouse-0896_2022-04-06T10_01_10DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_mouse
2074 10367 Sky_mouse-0896_2022-04-13T09_03_40DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_mouse
2255 11271 Sky_mouse-0896_2022-04-27T08_35_14DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_mouse
1518 7590 Sky_mouse-0896_2022-05-04T09_33_27DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_mouse
2437 12182 Sky_mouse-0897_2022-04-04T09_15_51DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_mouse
1404 7017 Sky_mouse-0897_2022-04-08T09_03_27DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_mouse
344 1720 Sky_mouse-0897_2022-04-14T08_39_05DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_mouse
1538 7687 Sky_mouse-0897_2022-05-06T08_16_08DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_filtered_mouse
1589 7943 Sky_mouse-0898_2022-04-12T12_42_28DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_mouse
2168 10838 Sky_mouse-0898_2022-04-29T09_16_40DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_mouse
897 4485 Sky_mouse-0898_2022-05-03T09_52_50DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_mouse
11883 59412 Sky_mouse-1337_2023-01-05T11_37_41DLC_resnet50_optopreycapFeb16shuffle1_300000_el_mouse
2656 13277 Sky_mouse-1337_2023-01-13T12_25_28DLC_resnet50_optopreycapFeb16shuffle1_300000_el_mouse
12309 61545 Sky_mouse-1337_2023-01-15T16_46_51DLC_resnet50_optopreycapFeb16shuffle1_300000_el_mouse
13194 65969 Sky_mouse-1337_2023-01-20T11_02_26DLC_resnet50_optopreycapFeb16shuffle1_300000_el_mouse
1518 7588 Sky_mouse-1429_2023-01-04T15_51_08DLC_resnet50_optopreycapFeb16shuffle1_300000_el_mouse
1572 7858 Sky_mouse-1429_2023-01-15T16_15_26DLC_resnet50_optopreycapFeb16shuffle1_300000_el_mouse
1441 7204 Sky_mouse-1429_2023-01-16T18_00_32DLC_resnet50_optopreycapFeb16shuffle1_300000_el_mouse
2027 10131 Sky_mouse-1429_2023-01-24T11_32_56DLC_resnet50_optopreycapFeb16shuffle1_300000_el_mouse
1929 9642 Sky_mouse-1429_2023-01-29T12_54_54DLC_resnet50_optopreycapFeb16shuffle1_300000_el_mouse
2496 12479 Sky_mouse-1430_2023-01-16T16_46_28DLC_resnet50_optopreycapFeb16shuffle1_300000_el_mouse
1190 5949 Sky_mouse-1430_2023-01-17T10_39_22DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_mouse
5683 28411 Sky_mouse-1430_2023-01-23T10_37_18DLC_resnet50_optopreycapFeb16shuffle1_300000_el_mouse
2122 10607 Sky_mouse-1430_2023-01-24T12_15_10DLC_resnet50_optopreycapFeb16shuffle1_300000_el_mouse
1472 7360 Sky_mouse-1430_2023-01-24T12_23_54DLC_resnet50_optopreycapFeb16shuffle1_300000_el_mouse
1699 8494 Sky_mouse-1430_2023-01-29T13_25_16DLC_resnet50_optopreycapFeb16shuffle1_300000_el_mouse

Previously, to figure out the length of the downsampled array, I had just looked at one entry in the coordinates dictionary. After downsampling it was (308, 6, 2), which doesn't match taking every 5th frame as the downsampling factor should be telling it to do. The value of 1538 is the correct downsampled length for the coordinates.
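The two lengths can be reconciled with a little arithmetic. A minimal sketch, assuming downsampling keeps every `rate`-th frame starting from frame 0 (plain slicing; that is an assumption about `kpms.downsample_timepoints`, although it matches the lengths listed above). `downsampled_length` is a hypothetical helper, not part of kpms.

```python
def downsampled_length(n_frames: int, rate: int) -> int:
    """Number of frames kept when taking every `rate`-th frame from frame 0."""
    return len(range(0, n_frames, rate))


full = 7687                           # frames in the video (from the thread)
once = downsampled_length(full, 5)    # downsample one time
twice = downsampled_length(once, 5)   # downsample the result again
print(once, twice)  # 1538 308
```

Applying the rate once gives 1538 and applying it a second time gives 308, so the (308, 6, 2) shape looks like what you would get from downsampling an array that had already been downsampled.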

calebweinreb commented 3 weeks ago

Hmm it's also confusing where "270590" came from since none of the coordinates are even 1/5 that long...

My guess is that there is some accidental failure to downsample or double-downsampling happening here. Can you run the code from a clean starting point and do the following?

1) Load coordinates fresh
2) Run this code block before downsampling

from vidio.read import OpenCVReader

keys = sorted(coordinates.keys())
videos = kpms.find_matching_videos(keys, video_dir)
for key,video in zip(keys,videos):
    if len(coordinates[key]) != len(OpenCVReader(video)):
        print(len(coordinates[key]), len(OpenCVReader(video)), key)

assuming nothing prints, try running calibration.

BTW calibration itself is kind of buggy lately but this troubleshooting will also be useful for the subsequent viz steps

mshallow commented 3 weeks ago

There are no steps in this that downsample at all, were you suggesting just to try this to see if it was a bug with calibration even without the downsampling? When I tried this, there was no output from that code block, and everything for the calibration loaded completely fine. Is there a possibility that something is being concatenated to load the frames for the calibration and that is where the 270590 comes from? I've messed around with a couple of different downsampling rates, and if I set the downsampling rate to 2, the same error occurs, but the value error is then "ValueError: frame number requested outside video bounds: 108236"
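One observation that may help here: both out-of-bounds values are consistent with a single stored frame index being multiplied by the downsampling rate, as in `readers[key][frame * downsample_rate]` from the traceback. A quick check using only the numbers reported in this thread:

```python
# Out-of-bounds frame numbers reported in the thread, keyed by downsample rate.
reported = {5: 270590, 2: 108236}

# Undo the `frame * downsample_rate` multiplication seen in the traceback.
source_frames = {rate: value // rate for rate, value in reported.items()}
remainders = {rate: value % rate for rate, value in reported.items()}

print(source_frames)  # {5: 54118, 2: 54118}
print(remainders)     # {5: 0, 2: 0}
```

Both values reduce to frame 54118 with no remainder. Only the longest full-length videos listed above have that many frames (for example the 64347-frame recording), which would fit a frame index that was saved against non-downsampled data and then multiplied by the rate.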

calebweinreb commented 3 weeks ago

O sorry yeah I forgot the downsampling. Can you try the following?

keypoint_data_path = '/Users/mollyshallow/Desktop/new_demo_project/data' # can be a file, a directory, or a list of files
coordinates, confidences, bodyparts = kpms.load_keypoints(keypoint_data_path, 'deeplabcut', exclude_individuals='single')

downsample_rate = 5 # keep every 5th frame
coordinates = kpms.downsample_timepoints(coordinates, downsample_rate)
confidences = kpms.downsample_timepoints(confidences, downsample_rate)

from vidio.read import OpenCVReader
keys = sorted(coordinates.keys())
videos = kpms.find_matching_videos(keys, video_dir)
for key,video in zip(keys,videos):
    if (len(coordinates[key])-1)*downsample_rate >= len(OpenCVReader(video)):
        print(len(coordinates[key]), len(OpenCVReader(video)), key)

kpms.noise_calibration(project_dir, coordinates, confidences, **config(), downsample_rate=downsample_rate)
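
The condition in the loop above can be read as: the last downsampled frame, mapped back to a video frame, must still be inside the video. A self-contained sketch with plain integers, using two (downsampled length, video length) pairs reported earlier in the thread. `last_video_frame` is a hypothetical helper, and the mapping `frame * downsample_rate` is taken from the traceback.

```python
def last_video_frame(n_downsampled: int, rate: int) -> int:
    """Video frame index that the last downsampled frame maps back to."""
    return (n_downsampled - 1) * rate


rate = 5
# (downsampled length, video length) pairs reported earlier in the thread
pairs = [(1538, 7687), (12870, 64347)]

for n_down, n_video in pairs:
    frame = last_video_frame(n_down, rate)
    print(n_down, n_video, frame, frame < n_video)
# 1538 7687 7685 True
# 12870 64347 64345 True
```

Both pairs pass, so for these recordings the coordinate arrays alone cannot produce an out-of-range frame.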

mshallow commented 3 weeks ago

That still gave the same error.

calebweinreb commented 3 weeks ago

Have you run calibration previously? If so, it may be looking for frames from a non-downsampled instance of calibration. Look for a file called error_annotations.csv in the project directory, delete it if present, and try again?
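
A minimal sketch of that cleanup, moving the file aside instead of deleting it so old annotations can be recovered. Only the file name `error_annotations.csv` comes from this thread; the helper name `set_aside_annotations` and the `.bak` suffix are hypothetical.

```python
from pathlib import Path
from typing import Optional


def set_aside_annotations(project_dir: str) -> Optional[Path]:
    """Rename error_annotations.csv to a .bak file; return the new path, or None if absent."""
    path = Path(project_dir) / "error_annotations.csv"
    if not path.exists():
        return None
    backup = path.with_name(path.name + ".bak")
    # On POSIX this silently replaces an existing backup; on Windows it raises instead.
    path.rename(backup)
    return backup
```

Calling `set_aside_annotations(project_dir)` before rerunning `kpms.noise_calibration` would start calibration without the old annotations.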

mshallow commented 3 weeks ago

Ok that seems to be what it was doing, deleting the file fixed the error! Thanks!

mshallow commented 3 weeks ago

I also don't know if anyone else has encountered general bugginess with the calibration, but I feel like in more recent times I've tried to use it, the loading of the frames seems really glitchy. It'll load the first couple for me to click through smoothly and then after that you have to advance two or three at a time to get it to change frames or load the image and not just the skeleton, unless I wait a long time (around 10+ seconds), before I try to advance.

calebweinreb commented 3 weeks ago

That's different from other bug reports but glitchiness is the consensus. We're planning to change the backend for calibration in the next release.

mshallow commented 2 weeks ago

Following up on downsampling bugs: the initial calibration step is no longer erroring, but after downsampling, there appear to be strange issues with the trajectory plots and grid videos. From what I can tell, something with the scaling used to generate these plots and videos is off. The trajectory plots that are generated after the PCA to initialize the model look like the correct scale (see first attached photo), but after training the model, all of the points are overlaid on top of each other (second attached photo). The same issue seems to apply to the grid movies, where everything is super zoomed out rather than cropped and zoomed in on the mouse like it was previously.

None of these lines of code throw actual errors, just user warnings, but I was curious whether solving these warnings would change the output or whether there is something else going on.

Trajectory plots from PCA, before training the model:

Screenshot 2024-06-18 at 3 09 48 PM

Trajectory plots after training:

Screenshot 2024-06-18 at 3 10 13 PM

Grid movies after training:

Screenshot 2024-06-18 at 3 10 40 PM

Screenshot 2024-06-18 at 3 11 35 PM

I tried running this with two different downsampling rates as well as two different kappa values and encountered the same issues.

calebweinreb commented 2 weeks ago


Hmm that's weird! But it doesn't strike me as related to downsampling per se. Have you ever run kpms without downsampling? Did it work in that case?

mshallow commented 2 weeks ago

Yea it has always worked without downsampling. I just tried it again on the same dataset without downsampling since previous attempts without downsampling were run on a different computer, and did not encounter this issue.

mshallow commented 2 weeks ago

This is the same set of outputs from a run without downsampling: Screenshot 2024-06-19 at 2 41 44 PM

Screenshot 2024-06-19 at 2 39 59 PM Screenshot 2024-06-19 at 2 40 08 PM Screenshot 2024-06-19 at 2 40 28 PM

calebweinreb commented 2 weeks ago

Hmm maybe the fitting got wonky for some reason. Can you try exporting the inferred coordinates and see if they're weird?


mshallow commented 2 weeks ago

The first half of that code runs fine, but when it gets to making the video and overlaying the coordinates, it gives me an issue similar to what I was encountering with the calibration earlier. Error message:

IndexError: index 1495 is out of bounds for axis 0 with size 1495

Just by eye, there are a few coordinates that look a little weird (a bunch of negative values or really high values), but I'm not quite sure how to check that systematically. These are the coordinate outputs from the first video in the dictionary for the downsampled and non-downsampled data.

Downsampled: Screenshot 2024-06-19 at 2 51 13 PM

Not downsampled: Screenshot 2024-06-19 at 2 51 26 PM
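
One way to check the exported coordinates systematically, instead of by eye, is to summarize each recording: count points with non-finite values and points that fall outside the video frame. A sketch, assuming the coordinates are a dict mapping recording names to arrays of shape (frames, keypoints, 2), like the arrays shown earlier in the thread. `summarize_coordinates` and the frame size used in the toy example are hypothetical, so substitute your own video width and height.

```python
import numpy as np


def summarize_coordinates(coordinates, width, height):
    """Per recording: frame count, points with non-finite values, points outside the frame."""
    summary = {}
    for key, arr in coordinates.items():
        arr = np.asarray(arr, dtype=float)
        finite = np.isfinite(arr).all(axis=-1)  # shape (frames, keypoints)
        x, y = arr[..., 0], arr[..., 1]
        with np.errstate(invalid="ignore"):     # NaN comparisons simply evaluate to False
            inside = (x >= 0) & (x < width) & (y >= 0) & (y < height)
        summary[key] = {
            "n_frames": arr.shape[0],
            "n_nonfinite": int((~finite).sum()),
            "n_outside": int((finite & ~inside).sum()),
        }
    return summary


# Toy example: 3 frames, 2 keypoints, with one negative point, one NaN point
# and one point beyond the right edge of a 640x480 frame.
demo = {
    "rec": np.array(
        [
            [[10, 10], [20, 20]],
            [[-5, 10], [20, 20]],
            [[np.nan, 10], [700, 20]],
        ],
        dtype=float,
    )
}
print(summarize_coordinates(demo, width=640, height=480))
# {'rec': {'n_frames': 3, 'n_nonfinite': 1, 'n_outside': 2}}
```

Comparing these counts between the downsampled and non-downsampled exports would show whether the odd values are concentrated in one of the two runs.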