mks0601 / 3DMPPE_POSENET_RELEASE

Official PyTorch implementation of "Camera Distance-aware Top-down Approach for 3D Multi-person Pose Estimation from a Single RGB Image", ICCV 2019
MIT License

Different results in demo.py and test.py for customized dataset #97

Closed: KamiCalcium closed this issue 2 years ago

KamiCalcium commented 2 years ago

Hi,

I wanted to apply PoseNet to my own dataset. However, I found that the results from demo.py and test.py are very different.

In demo.py, each time we pass a single processed image that contains only one person. This is the code used for preprocessing:

    # sanitize the raw box (aspect ratio, padding) before cropping
    bbox = process_bbox(np.array(bbox_list[n]), original_img_width, original_img_height)
    # crop the single-person patch: no flip, unit scale, no rotation, no occlusion aug
    img, img2bb_trans = generate_patch_image(original_img, bbox, False, 1.0, 0.0, False)
    img = transform(img).cuda()[None,:,:,:]  # normalize and add a batch dimension

Then we feed "img" to the model:

    # forward
    with torch.no_grad():
        pose_3d = model(img) # x,y: pixel, z: root-relative depth (mm)

The result I get from demo.py is accurate for both the 2D and 3D estimation.

However, when I use test.py (I defined a new class for my dataset, as you did for the other datasets), the result is very different. After debugging, I found that it starts to diverge here:

        for itr, input_img in enumerate(tqdm(tester.batch_generator)):

            # forward
            coord_out = tester.model(input_img)

Then we pass this "coord_out", which has a shape of (128, 21, 3), to evaluate(). But when I take coord_out[0] and compare it with the first person's pose_3d from demo.py (using the same pre-trained model), they are very different. I then found that even img and input_img[0] are different.
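For concreteness, the comparison I did is roughly this (a sketch; coord_out is from test.py and pose_3d is from demo.py, for the same person and the same model):

    # both slices are (joint_num, 3) predictions for the same person
    diff = (coord_out[0] - pose_3d[0]).abs()
    print(diff.max())  # a large value means the two pipelines disagree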

Is that because in test.py we feed the whole batch of 128 people to the network without cropping, and that causes the problem? How can I address this?

Thanks!

mks0601 commented 2 years ago

test.py also uses a cropped single-person image. See here.

KamiCalcium commented 2 years ago

> test.py also uses a cropped single-person image. See here.

I think I found the problem. I tried

print(data['image_path'])

after https://github.com/mks0601/3DMPPE_POSENET_RELEASE/blob/3f92ebaef214a0eb1574b7265e836456fbf3508a/data/dataset.py#L58 and found that every time I run test.py the image names come out in a different order. Furthermore, I printed the index in https://github.com/mks0601/3DMPPE_POSENET_RELEASE/blob/3f92ebaef214a0eb1574b7265e836456fbf3508a/data/dataset.py#L32 and found that every time I run test.py the indices come in a random order. Is there a way to handle this?

I did notice that there is a shuffle option in https://github.com/mks0601/3DMPPE_POSENET_RELEASE/blob/3f92ebaef214a0eb1574b7265e836456fbf3508a/common/base.py#L150, but in the tester we clearly set it to False. I don't know why it is still shuffling the data.

mks0601 commented 2 years ago

That is because of the multiple workers (num_workers): the DataLoader spawns multiple processes, so per-item work can happen out of order. I'm actually not sure what your problem is, but the current testing code runs without problems.
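For example, this toy sketch (not from this repo) shows the behavior: the per-item prints from the workers can be interleaved, but the batches still come back in dataset order because shuffle=False uses a sequential sampler.

    # Toy example: with shuffle=False, batches are returned in dataset order
    # even with num_workers > 0; only the __getitem__ prints from the worker
    # processes may appear interleaved.
    from torch.utils.data import Dataset, DataLoader

    class ToyDataset(Dataset):
        def __len__(self):
            return 8

        def __getitem__(self, idx):
            print('worker fetched index', idx)  # may print out of order
            return idx

    if __name__ == '__main__':
        loader = DataLoader(ToyDataset(), batch_size=2, shuffle=False, num_workers=2)
        for batch in loader:
            print(batch)  # always tensor([0, 1]), tensor([2, 3]), tensor([4, 5]), tensor([6, 7])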

KamiCalcium commented 2 years ago

> That is because of the multiple workers (num_workers): the DataLoader spawns multiple processes, so per-item work can happen out of order. I'm actually not sure what your problem is, but the current testing code runs without problems.

Okay, let me clarify again. Yes, test.py works for the datasets you released with the code. Now I'm trying to test on my own dataset. When using demo.py, which loads the images individually, it works perfectly (see the first attached screenshot). However, when I run test.py with the same pre-trained model, the predicted pose is very wrong (see the second attached screenshot). Everything is the same except that demo.py loads the cropped images one by one, while test.py loads them in batches.

I created my own dataset class by imitating your Human36M.py, but its logic is not entirely clear to me, and I think that is where the problem is. In Human36M.py:

    def evaluate(self, preds, result_dir):

        print('Evaluation start...')
        gts = self.data
        assert len(gts) == len(preds)
        sample_num = len(gts)

        pred_save = []
        error = np.zeros((sample_num, self.joint_num-1)) # joint error
        error_action = [ [] for _ in range(len(self.action_name)) ] # error for each sequence
        for n in range(sample_num):
            gt = gts[n]

For the simplest case, say we have only 3 cropped people in a single image; then len(gts) and len(preds) will both be 3. However, the orders of gts and preds differ, as I said in https://github.com/mks0601/3DMPPE_POSENET_RELEASE/issues/97#issuecomment-894376904. In preds (coming from getitem via the DataLoader), the indices can be 2,1,0 or 2,0,1 or any random order due to the multi-worker setting. But in gts, since we have this:

        for n in range(sample_num):
            gt = gts[n]

the indices will always be 0, 1, 2. If the orders differ, how can it work?

Apparently it works for Human36M but not for my own dataset, so I first need to understand how this logic works before I can debug my dataset. Can you please explain?

Thanks a lot for your time!

mks0601 commented 2 years ago

Actually, printing the index in getitem is not a good way to debug because of the multi-threading. You'd better print the image path in the evaluation function: you can load the image using gt['image_path'] and compare it with the image passed to the model.
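For example, something like this inside the evaluation loop (a rough sketch: input_img is assumed to be the CHW tensor fed to the model for this sample, and the mean/std are just the usual ImageNet constants, so use whatever your transform actually applies):

    # dump the annotation's image and the de-normalized network input to disk
    import cv2
    import numpy as np

    gt_img = cv2.imread(gt['image_path'])  # image referenced by the annotation

    mean = np.array([0.485, 0.456, 0.406])  # assumed normalization constants
    std = np.array([0.229, 0.224, 0.225])
    net_img = input_img.cpu().numpy().transpose(1, 2, 0)  # CHW -> HWC
    net_img = np.clip((net_img * std + mean) * 255, 0, 255).astype(np.uint8)

    cv2.imwrite('gt_img.jpg', gt_img)
    cv2.imwrite('net_input.jpg', net_img)  # compare the two crops visually
    # note: channel order (BGR vs RGB) may differ depending on your pipeline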

KamiCalcium commented 2 years ago

> Actually, printing the index in getitem is not a good way to debug because of the multi-threading. You'd better print the image path in the evaluation function: you can load the image using gt['image_path'] and compare it with the image passed to the model.

OK, I tried

print(data['image_path'])

in getitem and compared it with gt['image_path'] in the evaluation function; clearly they are not in the same order. Any suggestion on how to fix that?

KamiCalcium commented 2 years ago

To figure out the logic, I ran the test for Human36M too. I added

print("image in getitem: ", data['image_path'])

in getitem, and added

print("image in evaluation: ", gt['image_path'])

after https://github.com/mks0601/3DMPPE_POSENET_RELEASE/blob/3f92ebaef214a0eb1574b7265e836456fbf3508a/data/Human36M/Human36M.py#L161. The outputs are shown in the two attached screenshots.

Across multiple runs, the first print always comes out in a different random order, while the second is always in sequential order. Yet the results seem to be correct every time. Now I'm confused. How does it work?

KamiCalcium commented 2 years ago

> To figure out the logic, I ran the test for Human36M too. I added
>
>     print("image in getitem: ", data['image_path'])
>
> in getitem, and added
>
>     print("image in evaluation: ", gt['image_path'])
>
> after https://github.com/mks0601/3DMPPE_POSENET_RELEASE/blob/3f92ebaef214a0eb1574b7265e836456fbf3508a/data/Human36M/Human36M.py#L161. The outputs are shown in the two attached screenshots. Across multiple runs, the first print always comes out in a different random order, while the second is always in sequential order. Yet the results seem to be correct every time. Now I'm confused. How does it work?

Okay, even though the order in getitem is different every time, the preds in https://github.com/mks0601/3DMPPE_POSENET_RELEASE/blob/3f92ebaef214a0eb1574b7265e836456fbf3508a/main/test.py#L78 are always the same, and for Human36M they are in the same, correct order as in the evaluation function. However, for my dataset they are in the wrong order, and I don't know where the order of the preds comes from. Is there a way to check that?

KamiCalcium commented 2 years ago

It is fixed. Multi-threading actually doesn't cause a problem: when the worker outputs are merged, they still come back in a fixed (dataset) order, which is why the preds are always the same. The real problem was the bounding box. In demo.py, we do this for every cropped image:

    bbox = process_bbox(np.array(bbox_list[n]), original_img_width, original_img_height)

In the evaluation function of Human36M.py we don't have this step (although load_data() does call the processing function, I'm not sure the corrected bbox is passed on to evaluate()). It still works there, probably because the boxes already have the right shape. But for my dataset, skipping process_bbox creates a problem. So in getitem I added this:

bbox = process_bbox(np.array(bbox), original_img_width, original_img_height)

I also added the same call in my evaluate(), and now the results look correct.
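Roughly, the relevant part of my getitem now looks like this (a sketch of my setup; the class name is mine and the import paths follow demo.py, so they may differ in your checkout):

    # Sketch of the fix: sanitize the raw bbox with process_bbox, exactly as
    # demo.py does, before cropping the patch.
    import cv2
    import numpy as np
    from torch.utils.data import Dataset
    from utils.pose_utils import process_bbox      # repo helper; path may differ
    from dataset import generate_patch_image       # repo helper; path may differ

    class MyDataset(Dataset):
        def __getitem__(self, idx):
            data = self.data[idx]
            img = cv2.imread(data['image_path'])
            img_height, img_width = img.shape[:2]

            # the step that was missing: fix the aspect ratio / padding of the raw box
            bbox = process_bbox(np.array(data['bbox']), img_width, img_height)
            patch, img2bb_trans = generate_patch_image(img, bbox, False, 1.0, 0.0, False)
            return self.transform(patch)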

mks0601 commented 2 years ago

What I meant was to print the image paths in the evaluation function only, not in the getitem function. Anyway, good that you fixed the problem. Let me close this issue.