Closed wjkang619 closed 1 year ago
Hello, I have the same issue when certain dance videos from YouTube are used as input. I think changing the get_human_features function to match the same function in the PHALP code of 4D-Humans (a project from the same research team?) could be a clue. Or should we just process the video while skipping frames in which no people are detected? (For example, and in detail, in the code, when the length of 'pred_masks' is zero, as below.)
############## pred_masks, pred_scores, pred_classes, gt_tids, gt_annots = self.get_detections(image_frame, framename, t, additional_data, measurments)
Please let us know whether this is the right way to solve this issue.
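The second option above (skipping frames with no detections) could be sketched roughly as follows. This is only an illustration, not PHALP's actual code: `skip_empty_frames` and the `detect` callable are hypothetical names, with `detect` standing in for whatever returns 'pred_masks' for a frame.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

Frame = TypeVar("Frame")


def skip_empty_frames(
    frames: Sequence[Frame],
    detect: Callable[[Frame], Sequence],
) -> List[Tuple[int, Frame, Sequence]]:
    """Keep only the frames in which at least one person was detected.

    `detect` is a hypothetical stand-in for the detection step; it is
    assumed to return a sequence of masks, empty when nobody is found.
    """
    kept = []
    for t, frame in enumerate(frames):
        pred_masks = detect(frame)
        if len(pred_masks) == 0:
            # No people in this frame: skip it instead of failing later.
            continue
        # Keep the original frame index so results can be mapped back.
        kept.append((t, frame, pred_masks))
    return kept


# Toy usage: the "intro" and "outro" frames contain no people.
toy_frames = ["intro", "dancer_a", "dancer_b", "outro"]
toy_detect = lambda f: [] if f in ("intro", "outro") else ["mask"]
kept_frames = skip_empty_frames(toy_frames, toy_detect)
kept_indices = [t for t, _, _ in kept_frames]
```

Keeping the original frame index alongside each kept frame makes it possible to tell afterwards which frames were dropped.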
Thank you for bringing this to our attention. This was fixed in the original PHALP repo, so if you try to install again, the video you provided should run without issues.
Hello, and thank you for the fast reply.
We inserted the updated PHALP code into our PHALP.py, as you suggested.
if(NPEOPLE==0): return []
So the 'file_name.pkl' file, the result of PHALP, was generated. But another error occurred in the run_opt function of run_opt.py.
We think frames in which no people are detected are still the problem. The targets in the input video were wearing similar clothing, and the ID tracking result was not good: tracked IDs were not maintained, and the tracked ID (the color of the overlaid mesh) in the early frames was different from the one in the last frames. The IDs even changed frequently.
Therefore the dataset has only a few frames for any specific ID. The position of the error in the code and the details of the dataset variable are below.
B is 0 and T is 13. But why is B 0 and T 13? The number of images extracted from the video was 98. If the cause is the tracking accuracy of PHALP, is this a limitation of PHALP?
Is there any way to solve this issue?
Please save us. Thanks!!
Hello again. We figured out why the error occurred! We had not understood the shot division implemented in slahmr. In some shots, such as the intro or outro of a dance video, there are no people, and that was the problem. We therefore changed the shot_idx parameter in the configuration file, and now we can run the project properly on input that contains people.
But we have another question, below. For dance data, the number of input frames is more than 4000, and slahmr cannot compute the whole result at once because of GPU memory limits. In our case, we are using an RTX 3090, and the maximum input size that could be optimized at once was under 1000 frames. Is there any plan to develop code that computes the whole result at once, or that automatically processes the next frames in batches, as in deep learning? Or are there any options we are not aware of that would lower GPU memory usage?
Thank you!!
For longer videos, we recommend breaking them down into smaller sequences (e.g., up to 200-300 frames) and running slahmr on each of them. See also #17.
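A minimal sketch of that splitting step, assuming the frames are indexed 0 to N-1. The helper name `split_into_chunks` and its `overlap` parameter are hypothetical and not part of slahmr. How each range is then passed to slahmr, and how the per-chunk results are stitched together, is not shown here.

```python
from typing import List, Tuple


def split_into_chunks(
    num_frames: int, max_len: int = 300, overlap: int = 0
) -> List[Tuple[int, int]]:
    """Split frame indices [0, num_frames) into (start, end) ranges.

    Each range is half-open and holds at most `max_len` frames.
    `overlap` (an illustrative option) repeats that many frames at the
    start of each following chunk.
    """
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    if not 0 <= overlap < max_len:
        raise ValueError("overlap must satisfy 0 <= overlap < max_len")

    chunks = []
    start = 0
    step = max_len - overlap
    while start < num_frames:
        end = min(start + max_len, num_frames)
        chunks.append((start, end))
        if end == num_frames:
            break  # the last chunk reached the end of the video
        start += step
    return chunks


# Example: a 4000-frame dance video in chunks of at most 300 frames.
chunks = split_into_chunks(4000, max_len=300)
```

With these values the last chunk is shorter than the others, since 4000 is not a multiple of 300.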
Thank you for sharing such a great project.
However, I would like to ask you one thing.
I have noticed that the current demo code stops when it does not detect a person.
It is difficult for me to figure out how to modify the demo code to solve this problem.
Could you please help me?
I will share the video file I used.
Link: https://drive.google.com/drive/folders/1UKngAcv3nVajudup_h9xNZwaOryIjABO?usp=sharing
Description
-'le_crop_1.mp4' : X video without people at the beginning
-'le_crop_2.mp4' : O video with all detected people