talmolab / sleap

A deep learning framework for multi-animal pose tracking.
https://sleap.ai

Error when running inference on whole video #1508

Closed: rabiesrisk closed this issue 1 year ago

rabiesrisk commented 1 year ago

Hi y'all! I am new to SLEAP and am following the tutorial. Training ran successfully, but I am stuck on running inference.

When I go to Predict -> Run inference and choose an option other than 'user labeled frames', inference hits an error. This is what is in the terminal.

Started inference at: 2023-09-18 18:06:21.801654

Args:
{
│   'data_path': '/home/conda_from_source/Documents/Jessica/labels.v002.slp',
│   'models': [
│   │   '/home/conda_from_source/Documents/Jessica/models/230918_172450.single_instance.n=20/initial_config.json'
│   ],
│   'frames': '731,1150,4347,4613,5638,7274,10391,12923,14653,20422,21668,26042,28204,28965,29776,34040,40900,45874,46044,53603',
│   'only_labeled_frames': False,
│   'only_suggested_frames': False,
│   'output': '/home/conda_from_source/Documents/Jessica/predictions/labels.v002.slp.230918_180618.predictions.slp',
│   'no_empty_frames': True,
│   'verbosity': 'json',
│   'video.dataset': None,
│   'video.input_format': 'channels_last',
│   'video.index': '0',
│   'cpu': False,
│   'first_gpu': False,
│   'last_gpu': False,
│   'gpu': 'auto',
│   'max_edge_length_ratio': 0.25,
│   'dist_penalty_weight': 1.0,
│   'batch_size': 4,
│   'open_in_gui': False,
│   'peak_threshold': 0.2,
│   'max_instances': None,
│   'tracking.tracker': 'simple',
│   'tracking.target_instance_count': None,
│   'tracking.pre_cull_to_target': 0,
2023-09-18 18:06:23.259412: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
│   'tracking.pre_cull_iou_threshold': 0.8,
│   'tracking.post_connect_single_breaks': 0,
│   'tracking.clean_instance_count': None,
│   'tracking.clean_iou_threshold': None,
│   'tracking.similarity': 'iou',
│   'tracking.match': 'hungarian',
│   'tracking.robust': None,
│   'tracking.track_window': 5,
│   'tracking.min_new_track_points': None,
│   'tracking.min_match_points': None,
2023-09-18 18:06:23.777966: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 13147 MB memory: -> device: 0, name: NVIDIA A2, pci bus id: 0000:98:00.0, compute capability: 8.6
│   'tracking.img_scale': None,
│   'tracking.of_window_size': None,
│   'tracking.of_max_levels': None,
│   'tracking.save_shifted_instances': None,
│   'tracking.kf_node_indices': None,
│   'tracking.kf_init_frame_count': None
}
Traceback (most recent call last):
  File "/some/prefix/envs/sleap/bin/sleap-track", line 33, in <module>
    sys.exit(load_entry_point('sleap', 'console_scripts', 'sleap-track')())
  File "/home/conda_from_source/sleap/sleap/nn/inference.py", line 5412, in main
    labels_pr = predictor.predict(provider)
  File "/home/conda_from_source/sleap/sleap/nn/inference.py", line 526, in predict
    self._make_labeled_frames_from_generator(generator, data)
  File "/home/conda_from_source/sleap/sleap/nn/inference.py", line 1583, in _make_labeled_frames_from_generator
    for ex in generator:
  File "/home/conda_from_source/sleap/sleap/nn/inference.py", line 455, in _predict_generator
    for ex in self.pipeline.make_dataset():
  File "/home/conda_from_source/sleap/sleap/nn/data/pipelines.py", line 282, in make_dataset
    ds = self.providers[0].make_dataset()
  File "/home/conda_from_source/sleap/sleap/nn/data/providers.py", line 398, in make_dataset
    self.video.get_frame(self.video.last_frame_idx)
  File "/home/conda_from_source/sleap/sleap/io/video.py", line 1104, in get_frame
    return self.backend.get_frame(idx)
  File "/home/conda_from_source/sleap/sleap/io/video.py", line 496, in get_frame
    raise KeyError(f"Unable to load frame {idx} from {self}.")
KeyError: "Unable to load frame 56819 from MediaVideo(filename='/home/conda_from_source/Videos SH/0801box1/DSCF0001.AVI', grayscale=False, bgr=True, dataset='', input_format='')."

INFO:sleap.nn.inference:Auto-selected GPU 0 with 14905 MiB of free memory.
Versions:
SLEAP: 1.3.1
TensorFlow: 2.7.0
Numpy: 1.21.5
Python: 3.7.12
OS: Linux-5.19.0-50-generic-x86_64-with-debian-bookworm-sid

System:
GPUs: 1/2 available
  Device: /physical_device:GPU:0

I tried moving the files from a server to local storage, as this solved a similar issue for someone else, but I am receiving the same KeyError about being unable to load the frame. It works beautifully when I select the user labeled frames, so I am unsure what I am doing wrong. I currently have 4 videos in the project, each ~60k frames. This is a single-animal prediction with 4 nodes per animal.

I am running on Linux.

roomrys commented 1 year ago

Hi @rabiesrisk,

We have seen a few problems like this occur when the video is not reliably seekable. Try reencoding your video and let us know how that works.
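
For reference, the re-encode suggested above can be scripted. This is a minimal sketch, assuming `ffmpeg` is installed and on the PATH. The helper names `build_reencode_cmd` and `reencode` are hypothetical, and the encoder settings (libx264, yuv420p, superfast preset, CRF 23) are common choices for producing a seekable MP4, not values stated in this thread.

```python
import subprocess
from pathlib import Path


def build_reencode_cmd(src, dst=None):
    """Return an ffmpeg argument list that re-encodes `src` to an H.264 MP4.

    Hypothetical helper: the flag values are typical defaults, not settings
    confirmed anywhere in this thread.
    """
    src = Path(src)
    # By default, write the MP4 next to the original with the same stem.
    dst = Path(dst) if dst is not None else src.with_suffix(".mp4")
    if dst == src:
        # Never overwrite the input when it is already an .mp4 file.
        dst = src.with_name(src.stem + "_reencoded.mp4")
    return [
        "ffmpeg", "-y",           # overwrite the output file if it exists
        "-i", str(src),           # input video
        "-c:v", "libx264",        # H.264 video codec
        "-pix_fmt", "yuv420p",    # widely supported pixel format
        "-preset", "superfast",   # favor encoding speed
        "-crf", "23",             # constant quality factor
        str(dst),
    ]


def reencode(src, dst=None):
    """Run ffmpeg and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_reencode_cmd(src, dst), check=True)
```

Calling `reencode("DSCF0001.AVI")` would write `DSCF0001.mp4` alongside the original. The project still references the old file until the video is replaced in the labels file.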


Update: if reencoding doesn't work and results in a "moov atom not found" error

While helping another user, we found that they had copied the contents of a zipped file over without unzipping it first. This produced the KeyError and then, when they attempted to reencode, a "moov atom not found" error. After unzipping first and retrying, everything ran smoothly.
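
One quick way to rule this case out is to check whether the file on disk is still a zip archive. The sketch below uses only the Python standard library. The helper name `looks_like_zip` is hypothetical, and it assumes the problem file is itself an archive, which may not match every variant of this mistake.

```python
import zipfile
from pathlib import Path


def looks_like_zip(path):
    """Return True if `path` exists and is a zip archive, not a plain video."""
    path = Path(path)
    # zipfile.is_zipfile inspects the file contents, so a zip archive that
    # was renamed to .avi or .mp4 is still detected.
    return path.is_file() and zipfile.is_zipfile(path)
```

If this returns True for a video path, extract the archive first and point SLEAP at the extracted file.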


Thanks, Liezl

Related: #767

rabiesrisk commented 1 year ago

Hiii, I re-encoded the videos and it still does not work OTL. It gives the same error. I tried it for each video, and the frame it says it is unable to load is always the second-to-last frame. I did not zip these files or encounter that related error.

Thank you for your patience, Jessica

Started inference at: 2023-09-21 17:11:11.727666
Args:
{
│   'data_path': '/home/conda_from_source/Documents/Jessica/labels.v002.slp',
│   'models': [
│   │   '/home/conda_from_source/Documents/Jessica/models/230918_172450.single_instance.n=20/initial_config.json'
│   ],
│   'frames': '0,-65699',
│   'only_labeled_frames': False,
│   'only_suggested_frames': False,
│   'output': '/home/conda_from_source/Documents/Jessica/predictions/labels.v002.slp.230921_171108.predictions.slp',
│   'no_empty_frames': True,
│   'verbosity': 'json',
│   'video.dataset': None,
│   'video.input_format': 'channels_last',
│   'video.index': '3',
│   'cpu': False,
│   'first_gpu': False,
│   'last_gpu': False,
│   'gpu': 'auto',
│   'max_edge_length_ratio': 0.25,
│   'dist_penalty_weight': 1.0,
│   'batch_size': 4,
│   'open_in_gui': False,
│   'peak_threshold': 0.2,
│   'max_instances': None,
│   'tracking.tracker': 'simple',
│   'tracking.target_instance_count': None,
│   'tracking.pre_cull_to_target': 0,
2023-09-21 17:11:13.242222: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
│   'tracking.pre_cull_iou_threshold': 0.8,
│   'tracking.post_connect_single_breaks': 0,
│   'tracking.clean_instance_count': None,
│   'tracking.clean_iou_threshold': None,
│   'tracking.similarity': 'iou',
│   'tracking.match': 'hungarian',
│   'tracking.robust': None,
│   'tracking.track_window': 5,
│   'tracking.min_new_track_points': None,
│   'tracking.min_match_points': None,
2023-09-21 17:11:13.772148: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 13147 MB memory: -> device: 0, name: NVIDIA A2, pci bus id: 0000:98:00.0, compute capability: 8.6
│   'tracking.img_scale': None,
│   'tracking.of_window_size': None,
│   'tracking.of_max_levels': None,
│   'tracking.save_shifted_instances': None,
│   'tracking.kf_node_indices': None,
│   'tracking.kf_init_frame_count': None
}
Traceback (most recent call last):
  File "/some/prefix/envs/sleap/bin/sleap-track", line 33, in <module>
    sys.exit(load_entry_point('sleap', 'console_scripts', 'sleap-track')())
  File "/home/conda_from_source/sleap/sleap/nn/inference.py", line 5412, in main
    labels_pr = predictor.predict(provider)
  File "/home/conda_from_source/sleap/sleap/nn/inference.py", line 526, in predict
    self._make_labeled_frames_from_generator(generator, data)
  File "/home/conda_from_source/sleap/sleap/nn/inference.py", line 1583, in _make_labeled_frames_from_generator
    for ex in generator:
  File "/home/conda_from_source/sleap/sleap/nn/inference.py", line 455, in _predict_generator
    for ex in self.pipeline.make_dataset():
  File "/home/conda_from_source/sleap/sleap/nn/data/pipelines.py", line 282, in make_dataset
    ds = self.providers[0].make_dataset()
  File "/home/conda_from_source/sleap/sleap/nn/data/providers.py", line 398, in make_dataset
    self.video.get_frame(self.video.last_frame_idx)
  File "/home/conda_from_source/sleap/sleap/io/video.py", line 1104, in get_frame
    return self.backend.get_frame(idx)
  File "/home/conda_from_source/sleap/sleap/io/video.py", line 496, in get_frame
    raise KeyError(f"Unable to load frame {idx} from {self}.")
KeyError: "Unable to load frame 65699 from MediaVideo(filename='/home/conda_from_source/Videos SH/0801box1/DSCF0004.AVI', grayscale=False, bgr=True, dataset='', input_format='')."

INFO:sleap.nn.inference:Auto-selected GPU 0 with 14905 MiB of free memory.
Versions:
SLEAP: 1.3.1
TensorFlow: 2.7.0
Numpy: 1.21.5
Python: 3.7.12
OS: Linux-5.19.0-50-generic-x86_64-with-debian-bookworm-sid

System:
GPUs: 1/2 available
  Device: /physical_device:GPU:0
    Available: True
    Initalized: False
Process return code: 1

talmo commented 1 year ago

Hi @rabiesrisk,

I see from the logs that it's still predicting on /home/conda_from_source/Videos SH/0801box1/DSCF0004.AVI, which indicates that the video is still the original AVI.
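
The same check can be done mechanically on the error text. The sketch below pulls the filename out of a KeyError message shaped like the one in the log above and reports whether it still ends in `.AVI`. The helper names are hypothetical, and the regular expression assumes the `MediaVideo(filename='...')` format shown in this thread.

```python
import re
from pathlib import PurePosixPath

# Matches the filename inside MediaVideo(filename='...') as printed in the log.
FILENAME_RE = re.compile(r"MediaVideo\(filename='([^']+)'")


def video_path_from_error(message):
    """Return the video path named in the KeyError message, or None."""
    match = FILENAME_RE.search(message)
    return match.group(1) if match else None


def still_points_at_avi(message):
    """True if the error message refers to a file with an .avi extension."""
    path = video_path_from_error(message)
    return path is not None and PurePosixPath(path).suffix.lower() == ".avi"
```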

If you reencoded, perhaps you need to replace it in the GUI (File -> Replace videos...) with the MP4 and resave the labels file.

Alternatively, you can try predicting directly on the MP4 file by using the sleap-track CLI (the relevant commands are in the logs above).
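
As a rough sketch of that CLI route, the snippet below assembles a `sleap-track` call on a re-encoded file. It assumes the video path is passed as the positional argument, with `-m` for the model and `-o` for the output, mirroring the `data_path`, `models` and `output` entries in the logged Args. The MP4 path and the output file name are hypothetical; the model path is copied from the logs.

```python
import shlex


def build_track_cmd(video, model, output):
    """Assemble a sleap-track argument list (flags assumed: -m model, -o output)."""
    return ["sleap-track", str(video), "-m", str(model), "-o", str(output)]


cmd = build_track_cmd(
    # Hypothetical path to the re-encoded video.
    "/home/conda_from_source/Videos SH/0801box1/DSCF0004.mp4",
    # Model config path as it appears in the logged Args.
    "/home/conda_from_source/Documents/Jessica/models/"
    "230918_172450.single_instance.n=20/initial_config.json",
    # Hypothetical output file name.
    "DSCF0004.predictions.slp",
)

# Quote each argument so the space in "Videos SH" survives a shell paste.
printable = " ".join(shlex.quote(arg) for arg in cmd)
print(printable)
```

Passing the list to `subprocess.run(cmd, check=True)` avoids shell quoting altogether.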

Cheers,

Talmo

rabiesrisk commented 1 year ago

Well that was silly of me, it works perfectly! Thank you!!