spoonsso / dannce

Dannce-train error and video sync problem #121

Closed SEONGGAP closed 1 year ago

SEONGGAP commented 2 years ago

Hi, I am trying to set up dannce for a rat with five Raspberry Pi HQ cameras controlled via GPIO for start/stop recording. I successfully labeled some frames using Label3D (rat16 skeleton, 16 keypoints), and also ran com-train (finetuning with weights.rat.COM.hdf5) and com-predict without errors. However, I encountered the error below when I tried to run dannce-train (finetuning with weights/weights.rat.MAX/, obtained from the markerless_mouse_1 folder).


Could not load weights for finetune (likely because you are finetuning a previously finetuned network). Attempting to finetune from a full finetune model file.
Traceback (most recent call last):
  File "c:\windows\system32\dannce\dannce\interface.py", line 1120, in dannce_train
    *fargs
  File "c:\windows\system32\dannce\dannce\engine\nets.py", line 1129, in finetune_AVG
    model = renameLayers(model, weightspath)
  File "c:\windows\system32\dannce\dannce\engine\nets.py", line 1348, in renameLayers
    model.load_weights(weightspath, by_name=True)
  File "C:\ProgramData\Anaconda3\envs\dannce\lib\site-packages\tensorflow\python\keras\engine\training.py", line 2209, in load_weights
    f, self.layers, skip_mismatch=skip_mismatch)
  File "C:\ProgramData\Anaconda3\envs\dannce\lib\site-packages\tensorflow\python\keras\saving\hdf5_format.py", line 759, in load_weights_from_hdf5_group_by_name
    layer, weight_values, original_keras_version, original_backend)
  File "C:\ProgramData\Anaconda3\envs\dannce\lib\site-packages\tensorflow\python\keras\saving\hdf5_format.py", line 403, in preprocess_weights_for_loading
    weights[0] = np.transpose(weights[0], (3, 2, 0, 1))
  File "<__array_function__ internals>", line 6, in transpose
  File "C:\ProgramData\Anaconda3\envs\dannce\lib\site-packages\numpy\core\fromnumeric.py", line 651, in transpose
    return _wrapfunc(a, 'transpose', axes)
  File "C:\ProgramData\Anaconda3\envs\dannce\lib\site-packages\numpy\core\fromnumeric.py", line 61, in _wrapfunc
    return bound(*args, **kwds)
ValueError: axes don't match array

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\envs\dannce\Scripts\dannce-train-script.py", line 33, in <module>
    sys.exit(load_entry_point('dannce', 'console_scripts', 'dannce-train')())
  File "c:\windows\system32\dannce\dannce\cli.py", line 66, in dannce_train_cli
    dannce_train(params)
  File "c:\windows\system32\dannce\dannce\interface.py", line 1126, in dannce_train
    *fargs
  File "c:\windows\system32\dannce\dannce\engine\nets.py", line 1201, in finetune_fullmodel_AVG
    compile=False,
  File "C:\ProgramData\Anaconda3\envs\dannce\lib\site-packages\tensorflow\python\keras\saving\save.py", line 182, in load_model
    return hdf5_format.load_model_from_hdf5(filepath, custom_objects, compile)
  File "C:\ProgramData\Anaconda3\envs\dannce\lib\site-packages\tensorflow\python\keras\saving\hdf5_format.py", line 175, in load_model_from_hdf5
    raise ValueError('No model found in config file.')
ValueError: No model found in config file.


Issue #50 covers a very similar problem, but the solutions suggested there (changing the configuration) did not work for me.

I'm not sure whether this is related to the error above or a separate issue, but my 5 video files are out of sync, possibly because of the instability of the Raspberry Pi cameras. In issue #93, @diegoaldarondo mentioned that he could synchronize Pi camera videos by capturing a timestamp for every frame, but I can't find out how to extract those timestamps.

Any ideas about this problem? Thanks!

data-hound commented 2 years ago

Hi @SEONGGAP

The MAX weights in the demo/markerless_mouse_1 folder are for 6 cameras, while you appear to be using 5 (you mentioned 5 video files). You can use the n_views parameter to duplicate one of your cameras (by setting n_views=6).

As for recording timestamps for each frame on Raspberry Pi cameras, I think this thread could help: https://forums.raspberrypi.com/viewtopic.php?t=106930. Alternatively, you can use time.time(), as mentioned in the accepted answer to this Stack Overflow post: https://stackoverflow.com/questions/56274861/save-frames-of-live-video-with-timestamps.
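Once per-frame timestamps are available for every camera, they can be used to line frames up across videos. The following is only a minimal sketch of that idea, not the method used in issue #93: it assumes each camera's timestamps are a sorted list of seconds on a shared clock, and the names `nearest_frame`, `align_to_reference`, and `tolerance` are hypothetical.

```python
from bisect import bisect_left


def nearest_frame(timestamps, t):
    """Return the index of the timestamp closest to t (timestamps must be sorted)."""
    pos = bisect_left(timestamps, t)
    if pos == 0:
        return 0
    if pos == len(timestamps):
        return len(timestamps) - 1
    before, after = timestamps[pos - 1], timestamps[pos]
    return pos if (after - t) < (t - before) else pos - 1


def align_to_reference(reference, others, tolerance):
    """Match every reference frame to the nearest frame of each other camera.

    Reference frames are kept only if every other camera has a frame
    within `tolerance` seconds; the rest are treated as unmatched.
    """
    matched = []
    for ref_idx, t in enumerate(reference):
        picks = [nearest_frame(ts, t) for ts in others]
        if all(abs(ts[i] - t) <= tolerance for ts, i in zip(others, picks)):
            matched.append((ref_idx, picks))
    return matched


# Hypothetical timestamps: the second camera dropped the frame near 0.067 s.
reference = [0.000, 0.033, 0.067, 0.100]
camera_b = [0.001, 0.034, 0.101]
pairs = align_to_reference(reference, [camera_b], tolerance=0.010)
```

Here `pairs` keeps reference frames 0, 1, and 3 and skips frame 2, because the second camera has no frame close enough to it. A tolerance of roughly a third of the frame interval is an arbitrary choice for illustration.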

SEONGGAP commented 2 years ago

Thanks for the reply, @data-hound.

Yes, following issue #50, I had set n_views=5. As you suggested, I will try duplicating one video and changing n_views.

With the picamera library, camera.frame.index does not reflect the actual number of frames. When I recorded a 10 s video at 30 fps, camera.frame.index reached 300, but the video contained only 291 frames. I have not used camera.frame.complete so far, and I am now testing it in my setup.

Many thanks!

data-hound commented 2 years ago

Hi @SEONGGAP

You do not have to duplicate a video manually. The code should take care of it when you set n_views=6.
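To make the idea concrete, padding a five-camera list up to n_views=6 could look like the sketch below. This is an illustration only and not dannce's actual implementation; `pad_camera_list` is a hypothetical helper, and the choice of which camera to repeat is an assumption (the prediction log later in this thread shows Camera1 being loaded twice).

```python
def pad_camera_list(camnames, n_views):
    """Repeat cameras from the start of the list until n_views entries exist."""
    if not camnames:
        raise ValueError("at least one camera is required")
    if n_views < len(camnames):
        raise ValueError("n_views is smaller than the number of cameras")
    padded = list(camnames)
    i = 0
    while len(padded) < n_views:
        # Assumption: duplicates are taken in order, starting with the first camera.
        padded.append(camnames[i % len(camnames)])
        i += 1
    return padded


cameras = ["Camera1", "Camera2", "Camera3", "Camera4", "Camera5"]
views = pad_camera_list(cameras, n_views=6)
```

With five cameras and n_views=6, `views` ends with a second copy of Camera1, so the network still receives six input views.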

As for why fewer frames were recorded, I would suggest checking whether frames were dropped or whether a communication issue reduced the number of output frames.

Thanks, Anshuman

SEONGGAP commented 2 years ago

> You do not have to duplicate a video manually. The code should take care of it when you set n_views=6.

Oh! thanks! I will try it!

> As for why fewer frames were recorded, I would suggest checking whether frames were dropped or whether a communication issue reduced the number of output frames.

I think dropped frames are causing the mismatch. Once I can get the actual frame indices and timestamps, I will check them.
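With per-frame timestamps in hand, dropped frames show up as gaps longer than one frame interval. A rough sketch of such a check follows; it assumes timestamps in seconds and a nominal frame rate, and `find_dropped_frames` and its `slack` parameter are hypothetical names, not part of picamera or dannce.

```python
def find_dropped_frames(timestamps, fps, slack=0.5):
    """Return (index, missing) pairs for gaps longer than one frame interval.

    `index` is the last frame before the gap and `missing` is the estimated
    number of frames lost in it. `slack` is the fraction of an interval by
    which a gap may exceed the nominal spacing before it counts as a drop.
    """
    interval = 1.0 / fps
    gaps = []
    for i in range(1, len(timestamps)):
        delta = timestamps[i] - timestamps[i - 1]
        missing = int(round(delta / interval)) - 1
        if delta > interval * (1 + slack) and missing > 0:
            gaps.append((i - 1, missing))
    return gaps


# Hypothetical recording at 30 fps in which frames 3, 6, and 7 were lost.
captured = [n / 30 for n in (0, 1, 2, 4, 5, 8)]
gaps = find_dropped_frames(captured, fps=30)
total_missing = sum(missing for _, missing in gaps)
```

If dropped frames fully explain the report above (300 expected, 291 recorded), the total from such a check would be expected to come out at 9.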

SEONGGAP commented 2 years ago

Hi @data-hound

As you suggested, I changed n_views to 6 and dannce-train completed successfully. Using the resulting weights for prediction, however, I got an error.

Loading model from .\DANNCE\train_results\AVG\weights.990-51494.32031.hdf5
Predicting on batch 0
c:\windows\system32\dannce\dannce\engine\generator.py:1221: UserWarning: Note: ignoring dimension mismatch in 3D labels
  warnings.warn(msg)
Loading new video: videos\Camera3\0.mp4 for 0_Camera3
Loading new video: videos\Camera4\0.mp4 for 0_Camera4
Loading new video: videos\Camera2\0.mp4 for 0_Camera2
Loading new video: videos\Camera1\0.mp4 for 0_Camera1
Loading new video: videos\Camera5\0.mp4 for 0_Camera5
Loading new video: videos\Camera1\0.mp4 for 0_Camera1
Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\envs\dannce\Scripts\dannce-predict-script.py", line 33, in <module>
    sys.exit(load_entry_point('dannce', 'console_scripts', 'dannce-predict')())
  File "c:\windows\system32\dannce\dannce\cli.py", line 54, in dannce_predict_cli
    dannce_predict(params)
  File "c:\windows\system32\dannce\dannce\interface.py", line 1596, in dannce_predict
    n_chn,
  File "c:\windows\system32\dannce\dannce\engine\inference.py", line 696, in infer_dannce
    ims = generator.__getitem__(i)
  File "c:\windows\system32\dannce\dannce\engine\generator.py", line 966, in __getitem__
    X, y = self.__data_generation(list_IDs_temp)
  File "c:\windows\system32\dannce\dannce\engine\generator.py", line 1258, in __data_generation
    result = self.threadpool.starmap(self.project_grid, arglist)
  File "C:\ProgramData\Anaconda3\envs\dannce\lib\multiprocessing\pool.py", line 276, in starmap
    return self._map_async(func, iterable, starmapstar, chunksize).get()
  File "C:\ProgramData\Anaconda3\envs\dannce\lib\multiprocessing\pool.py", line 657, in get
    raise self._value
  File "C:\ProgramData\Anaconda3\envs\dannce\lib\multiprocessing\pool.py", line 121, in worker
    result = (True, func(*args, **kwds))
  File "C:\ProgramData\Anaconda3\envs\dannce\lib\multiprocessing\pool.py", line 47, in starmapstar
    return list(itertools.starmap(args[0], args[1]))
  File "c:\windows\system32\dannce\dannce\engine\generator.py", line 1028, in project_grid
    extension=self.extension,
  File "c:\windows\system32\dannce\dannce\engine\video.py", line 231, in load_vid_frame
    self.currvideo[camname].close() if self.predict_flag else \
  File "C:\ProgramData\Anaconda3\envs\dannce\lib\site-packages\imageio\core\format.py", line 259, in close
    self._close()
  File "C:\ProgramData\Anaconda3\envs\dannce\lib\site-packages\imageio\plugins\ffmpeg.py", line 343, in _close
    self._read_gen.close()
ValueError: generator already executing

Thanks.

data-hound commented 2 years ago

Hi @SEONGGAP

Are you using the same number of keypoints in your model as in the weights file? I suspect there might be a mismatch between the n_channels_out specified in your dannce_config.yaml and the actual number of keypoints that you have labeled. Can you please check and confirm?
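A quick way to run this check is to compare the channel count in the config with the number of keypoints implied by one labeled frame. The sketch below is hedged on two assumptions that are not confirmed in this thread: that each labeled frame is stored as a flat row of x, y, z values per keypoint, and that new_n_channels_out takes precedence over n_channels_out when finetuning. The helper names and the example values are hypothetical.

```python
def count_keypoints(label_row):
    """Number of keypoints in one labeled frame stored as flat x, y, z triples."""
    if len(label_row) % 3 != 0:
        raise ValueError("row length is not a multiple of 3")
    return len(label_row) // 3


def expected_keypoints(config):
    """Channel count the finetuned network will output.

    Assumption: new_n_channels_out, when present, overrides n_channels_out.
    """
    return config.get("new_n_channels_out") or config["n_channels_out"]


def channels_match(config, label_row):
    """True if the config and the labeled data agree on the keypoint count."""
    return expected_keypoints(config) == count_keypoints(label_row)


# Hypothetical values: a 16-keypoint label row and a config asking for 16 outputs.
config = {"n_channels_out": 20, "new_n_channels_out": 16}
one_frame = [0.0] * (16 * 3)
ok = channels_match(config, one_frame)
```

If `ok` came out false for real data, the config value or the labeling skeleton would be the first things to revisit.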

Thanks

SEONGGAP commented 2 years ago

Thanks for the reply, @data-hound.

I labeled my video using the rat16 skeleton in Label3D. The weights file is the output of dannce-train, that is, the result of finetuning weights.rat.MAX from the markerless_mouse_1 files.

I attach my config file and dannce.mat files here.

dannce_rat_config.txt dannce_mat.zip

I set new_n_channels_out: 16, and my labeled data also has 16 keypoints.

Thanks.

d2ncpp commented 1 year ago

Hello SEONGGAP

I'm also planning to use RPI HQ cams to run dannce. How did you connect the cams to your workstation? Did you use an RPI for each cam or did you use an adapter like this https://thepihut.com/products/hq-camera-usb-webcam-adapter?

Secondly, how did you configure your cams so you controlled them via GPIO for start/stop recordings?

SEONGGAP commented 1 year ago

Hi, @d2ncpp

> How did you connect the cams to your workstation? Did you use an RPI for each cam or did you use an adapter like this https://thepihut.com/products/hq-camera-usb-webcam-adapter?

I am using one HQ cam per RPi. By accessing the RPis remotely from a single computer, I can monitor all of them from that computer. Connecting several USB cameras to one RPi at the same time would make synchronization easier, but I don't know whether it would work well given the limited performance of the RPi.

> how did you configure your cams so you controlled them via GPIO for start/stop recordings?

After connecting one master RPi to the rest of the RPis via GPIO, the master RPi sends a signal to the other RPis over GPIO, and this signal starts and stops the recording. The camera control over GPIO is written in Python, using:
picamera: https://picamera.readthedocs.io/en/release-1.13/
opencv: https://docs.opencv.org/4.6.0/
gpiozero: https://gpiozero.readthedocs.io/en/stable/
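The scripts themselves are not shown in this thread, so the following is only a sketch of how such a trigger scheme might be wired with gpiozero and picamera. The pin numbers, the output path, and the names `TriggeredRecorder`, `run_follower`, and `run_master` are all assumptions for illustration. The hardware imports sit inside the two run functions so that the start/stop logic can be exercised without a Raspberry Pi.

```python
class TriggeredRecorder:
    """Start and stop a recording when a trigger line goes high and low."""

    def __init__(self, start, stop):
        self._start = start
        self._stop = stop
        self.recording = False

    def on_high(self):
        # Ignore repeated rising edges while already recording.
        if not self.recording:
            self._start()
            self.recording = True

    def on_low(self):
        if self.recording:
            self._stop()
            self.recording = False


def run_follower(pin=17, path="video.h264"):
    """Follower RPi: record while the trigger pin is high (hypothetical pin and path)."""
    from signal import pause
    from gpiozero import DigitalInputDevice
    from picamera import PiCamera

    camera = PiCamera()
    recorder = TriggeredRecorder(
        start=lambda: camera.start_recording(path),
        stop=camera.stop_recording,
    )
    trigger = DigitalInputDevice(pin)
    trigger.when_activated = recorder.on_high
    trigger.when_deactivated = recorder.on_low
    pause()  # keep the process alive and wait for edges


def run_master(pin=27, seconds=10):
    """Master RPi: hold the trigger line high for the recording duration."""
    from time import sleep
    from gpiozero import DigitalOutputDevice

    line = DigitalOutputDevice(pin)
    line.on()
    sleep(seconds)
    line.off()
```

Separating the edge handling from the hardware calls also makes it easy to log a timestamp at each start and stop, which may help when checking synchronization afterwards.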

It works, but the limited performance of the RPi causes dropped and missed frames and restricts fps and resolution, so many details need careful attention. For that reason, I am currently ordering a different camera.

Thanks

d2ncpp commented 1 year ago

Thanks for responding, @SEONGGAP. You've been incredibly helpful.

If you don't mind, I have follow-up questions regarding your setup: What specific RPi model(s) did you use? Are the master RPi and the other RPis identical models? And given the limitations you faced, what would you recommend when selecting RPis?

data-hound commented 1 year ago

Hi @SEONGGAP

Apologies for the delay in getting back to you. Are you still stuck with this issue?

Here are a few thoughts on the issue. Going through the stack trace again, it seems that ffmpeg/imageio was not able to read the video. Is it a recurring issue? If so, is there any process that might be accessing the videos in a way that prevents imageio's ffmpeg plugin from reading them?
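For context on the last line of that trace: "generator already executing" is a ValueError raised by Python itself when a generator is resumed or closed while its body is still running. The toy example below reproduces the message with a plain generator; `frame_reader` is a hypothetical stand-in and has nothing to do with imageio's internals. It does not establish the cause in this thread. One possibility worth ruling out, and this is an assumption, is that two worker threads touch the same video reader at once, given that the log shows Camera1 being loaded twice.

```python
def frame_reader():
    """Toy frame generator that tries to advance itself while it is running."""
    try:
        next(reader)  # re-entrant use of the same generator object
    except ValueError as exc:
        yield str(exc)


reader = frame_reader()
message = next(reader)
print(message)  # generator already executing
```

If the error is reproducible, checking whether it disappears when each view gets its own reader, or when frames are loaded without a thread pool, would help narrow this down.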

Thanks