bharat-b7 / MultiGarmentNetwork

Repo for "Multi-Garment Net: Learning to Dress 3D People from Images, ICCV'19"

Consultation on test_data.pkl and assets folder #8

Closed Huacuo closed 4 years ago

Huacuo commented 4 years ago

Hi,

I used test_data.pkl and the test pictures under the assets folder that you provided, and they generate good results. Thank you very much for your work. I have a few more questions for you:

1) If I want to test my own data, can I still use the files in the assets folder, or do I need to generate the same files from my own data?

2) I noticed that the test.pkl file contains several other entries besides the data generated by PGN and openpose, such as "rendered", "laplacian", "vertexlabel", etc. Can you tell me how to obtain these?

Look forward to your reply!

bharat-b7 commented 4 years ago

Hi, for a forward pass with MGN you need (a rough sketch of assembling these inputs follows below):

  1. Segmented images of a person (image_x in test_data.pkl)
  2. 2D joint detections from openpose (J_2d_x)
  3. vertexlabel (this tells MGN which SMPL vertex belongs to which garment; it can be generated using the garment template provided in assets/allTemplate_withBoundaries_symm.pkl)

To fine-tune MGN at test time and get more person-specific details, you also need:

  1. 2D silhouette for each input frame (rendered in test_data.pkl)
  2. laplacian regularization (this is just a zero vector to smooth the output)
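
A rough sketch of how these inputs could be bundled: the key names image_x / J_2d_x / vertexlabel and the 8-frame count are taken from this thread and test_data.pkl, everything else is illustrative and not the repo's actual loader.

import pickle as pkl
import numpy as np

NUM_FRAMES = 8  # test_data.pkl bundles 8 views per subject (image_0 ... image_7)

def build_forward_inputs(seg_images, joints_2d, vertexlabel):
    # seg_images: list of 8 segmented images (H, W, 3)
    # joints_2d: list of 8 arrays of 2D joint detections from openpose
    # vertexlabel: per-SMPL-vertex garment labels (see the vertexlabel sketch further down)
    dat = {}
    for i in range(NUM_FRAMES):
        dat['image_{}'.format(i)] = np.asarray(seg_images[i], dtype='float32')
        dat['J_2d_{}'.format(i)] = np.asarray(joints_2d[i], dtype='float32')
    dat['vertexlabel'] = np.asarray(vertexlabel)
    return dat

# e.g.: pkl.dump(build_forward_inputs(imgs, joints, vlabel), open('my_test_data.pkl', 'wb'))
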
wlhsiao commented 4 years ago

Thank you for your kind reply. I'm also trying to experiment with my own data, but I'm still not clear about how to get vertexlabel.

Given an input 2D image, I would run segmentation to get image_x, and run openpose to get J_2d_x. This is all 2D information, but vertexlabel assigns a garment label to each SMPL vertex. What I haven't figured out is how to get vertex information from 2D. Is it achieved through image_x and allTemplate_withBoundaries_symm.pkl? Could you share more details on how to use these files to obtain vertexlabel? Thanks in advance for any advice!

bharat-b7 commented 4 years ago

Hi, vertexlabel is derived from the garment classes present in your image_x. E.g., if the image contains a T-shirt and pants, just select the vertices corresponding to these classes from allTemplate_withBoundaries_symm.pkl.
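
For example, a minimal sketch of that selection, modelled on the ind = np.where(...) line from test_network.py quoted in the next comment; the garment key names, the label offset and the template layout are assumptions to verify against the repo:

import pickle as pkl
import numpy as np

# assumption: each garment entry in the template stores, at index [1], a per-SMPL-vertex
# mask marking which vertices belong to that garment
TEMPLATE = pkl.load(open('assets/allTemplate_withBoundaries_symm.pkl', 'rb'), encoding='latin-1')

def make_vertexlabel(garments_in_image, garment_keys, num_verts=6890):
    # garments_in_image: garment names present in image_x (must match entries of garment_keys)
    # garment_keys: the ordered list of garment names (config.garmentKeys in the repo)
    vertexlabel = np.zeros((num_verts,), dtype='int32')  # 0 = skin / no garment
    for gar, key in enumerate(garment_keys):
        if key not in garments_in_image:
            continue
        ind = np.where(TEMPLATE[key][1])[0]   # SMPL vertices belonging to this garment
        vertexlabel[ind] = gar + 1            # label offset is an assumption; check test_network.py
    return vertexlabel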

wlhsiao commented 4 years ago

Thanks for kindly explaining! Yes, I also found the line ind = np.where(TEMPLATE[config.garmentKeys[gar]][1])[0] in test_network.py that selects these classes. This is really helpful. Thank you!

neonb88 commented 4 years ago

Hi,

@bharat-b7 Thank you for all your hard work.

I noticed that each "image" within test_data.pkl contains an image of 2 different people. Why is this? Does test_network.py retarget the clothes from person 0 to person 1's body? Or does the code just run the same process on both people? Or do you ignore the 2nd person?

I will post the code to which I'm referring later:

  # TODO: nathan

Thanks, Nathan

bharat-b7 commented 4 years ago

test_network.py runs the same process (images + 2d joints --> body + 3d garments) for both the subjects.

andrewjong commented 4 years ago

@bharat-b7 I see "rendered" is the 2D silhouette for each input frame, but what is the last dimension of size 8 in the shape (2, 720, 720, 3, 8)?

Edit: Nevermind, I think I know this one. The 8 corresponds to the 8 image_x's, image_0-7. It's just bundled in one array instead of in separate keys.
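
If that layout is right, pulling the silhouette that matches a given image_x would just be a slice over the last axis; the axis order below is an assumption based on the (2, 720, 720, 3, 8) shape, adjust the path as needed:

import pickle as pkl

dat = pkl.load(open('assets/test_data.pkl', 'rb'), encoding='latin-1')
rendered = dat['rendered']        # (subjects, H, W, channels, frames) = (2, 720, 720, 3, 8)
sil = rendered[0, :, :, :, 3]     # silhouette of subject 0 matching image_3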

andrewjong commented 4 years ago

@bharat-b7 Also, why are there both J_2d_x and J_x in test_data.pkl? What is J_x?

bharat-b7 commented 4 years ago

J_x is used during training to supervise 3D joints. This is not used during inference.

Frank-Dz commented 4 years ago

Hi, for a forward pass with MGN you need:

  1. Segmented images of a person (image_x in test_data.pkl)
  2. 2D joint detections from openpose (J_2d_x)
  3. vertexlabel (this tells MGN which SMPL vertex belongs to which garment; it can be generated using the garment template provided in assets/allTemplate_withBoundaries_symm.pkl)

To fine-tune MGN at test time and get more person-specific details, you also need:

  1. 2D silhouette for each input frame (rendered in test_data.pkl)
  2. laplacian regularization (this is just a zero vector to smooth the output)

How did you get the 2D silhouette?

Thanks!

neonb88 commented 4 years ago

@Frank-Dz you should be able to get it from the CIHP-PGN segmentation (image_x in test_data.pkl). Just figure out which label is the background and set it to one value, then set all other labels as the foreground. You can find which values to use by inspecting rendered.
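
Something like this minimal sketch, assuming the background in the segmentation is black and that rendered expects a white foreground (check the actual values in test_data.pkl first):

import numpy as np

def silhouette_from_segmentation(seg, fg_value=255):
    # seg: (H, W, 3) segmented image (image_x); background assumed to be (0, 0, 0)
    fg_mask = np.any(seg != 0, axis=-1)   # foreground = any pixel that is not background
    sil = np.zeros_like(seg)
    sil[fg_mask] = fg_value               # paint the whole person as foreground
    return sil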

Frank-Dz commented 4 years ago

@Frank-Dz you should be able to get it from the CIHP-PGN segmentation (image_x in test_data.pkl). Just figure out which label is the background and set it to one value, then set all other labels as the foreground. You can find which values to use by inspecting rendered.

Hi~ @neonb88 Thanks!
(screenshot attached)

So we need to run PGN and then manually correct the result? I still do not understand it. Can you tell me how to turn a PGN result into the format used in test_data.pkl?

Frank-Dz commented 4 years ago

For example, in PGN's test_pgn.py

# evaluate processing
    parsing_dir = './output/cihp_parsing_maps'
    if not os.path.exists(parsing_dir):
        os.makedirs(parsing_dir)
    edge_dir = './output/cihp_edge_maps'
    if not os.path.exists(edge_dir):
        os.makedirs(edge_dir)
    # Iterate over training steps.
    for step in range(NUM_STEPS):
        parsing_, scores, edge_, _ = sess.run([pred_all, pred_scores, pred_edge, update_op])
        if step % 100 == 0:
            print('step {:d}'.format(step))
            print (image_list[step])
        img_split = image_list[step].split('/')
        img_id = img_split[-1][:-4]

        msk = decode_labels(parsing_, num_classes=N_CLASSES)
        parsing_im = Image.fromarray(msk[0])
        parsing_im.save('{}/{}_vis.png'.format(parsing_dir, img_id))
        cv2.imwrite('{}/{}.png'.format(parsing_dir, img_id), parsing_[0,:,:,0])
        sio.savemat('{}/{}.mat'.format(parsing_dir, img_id), {'data': scores[0,:,:]})

        cv2.imwrite('{}/{}.png'.format(edge_dir, img_id), edge_[0,:,:,0] * 255)

we write the results into '.png', '_vis.png', and '.mat' formats. But this repo doesn't show which of these outputs should be used. And the readme.md instruction 'Run semantic segmentation on images. We used PGN semantic segmentation and manual correction. Segment garments, Pants (65, 0, 65), Short-Pants (0, 65, 65), Shirt (145, 65, 0), T-Shirt (145, 0, 65) and Coat (0, 145, 65).' is a little confusing.

Frank-Dz commented 4 years ago

I ran PGN on my cropped image (screenshot attached) and got the segmentation results: 023622.mat, 023622.png, 023622_vis.png

The following is 023622_vis.png: (screenshot attached)

The following is 023622.mat: (screenshot attached)

Then how can we build the data in test_data.pkl for image_0 (suppose this is the first image)? I noticed that image_x in test_data.pkl looks like this: (screenshot attached)

The color is quite different.

Thanks again!

Best, Frank

LiuYuZzz commented 4 years ago

J_x is used during training to supervise 3D joints. This is not used during inference.

But when I run the fine_tune function in test_network.py with my own data, it cannot be executed without J_x.

bharat-b7 commented 4 years ago

J_x is used during training to supervise 3D joints. This is not used during inference.

But when I run the fine_tune function in test_network.py with my own data, it cannot be executed without J_x.

This is because the loss on joints (here) takes 3D joints and computes the 2D projection inside the function. You will have to modify this function to compute the projection only for the predicted 3D joints but use the GT 2D joints as-is. Also edit the line where the loss is computed.

Frank-Dz commented 4 years ago

J_x is used during training to supervise 3D joints. This is not used during inference.

But when I run the fine_tune function in test_network.py with my own data, it cannot be executed without J_x.

This is because the loss on joints (here) takes 3D joints and computes the 2D projection in the function. You will have to modify this function to compute projection only for the predicted 3D joints but use the GT 2D joints as is.

Hi~ How did you get J_x? Do you just use SMPL's function to get its joints? Thanks!

bharat-b7 commented 4 years ago

J_x is used during training to supervise 3D joints. This is not used during inference.

But when I run the fine_tune function in test_network.py with my own data, it cannot be executed without J_x.

This is because the loss on joints (here) takes 3D joints and computes the 2D projection in the function. You will have to modify this function to compute projection only for the predicted 3D joints but use the GT 2D joints as is.

Hi~ How did you get J_x? Do you just use SMPL's function to get its joints? Thanks!

Yes, J_x are the SMPL joints in the x-th frame.
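
For reference, one common way to get them, assuming a SMPL implementation that exposes posed vertices and a joint regressor (attribute names vary between SMPL ports, so treat this as a sketch):

import numpy as np

def smpl_joints_3d(smpl_model):
    # smpl_model.r: (6890, 3) posed vertices; smpl_model.J_regressor: (24, 6890) joint regressor
    verts = np.asarray(smpl_model.r)
    joints = np.asarray(smpl_model.J_regressor.dot(verts))  # (24, 3) 3D joints, i.e. J_x for one frame
    return joints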

LiuYuZzz commented 4 years ago

J_x is used during training to supervise 3D joints. This is not used during inference.

But when I run the fine_tune function in test_network.py with my own data, it cannot be executed without J_x.

This is because the loss on joints (here) takes 3D joints and computes the 2D projection in the function. You will have to modify this function to compute projection only for the predicted 3D joints but use the GT 2D joints as is.

Hi~ How did you get J_x? Do you just use SMPL's function to get its joints? Thanks!

SMPL function? Is it used to generate J_x, or to process the GT 2D joints?

LiuYuZzz commented 4 years ago

J_x is used during training to supervise 3D joints. This is not used during inference.

But when I run the fine_tune function in test_network.py with my own data, it cannot be executed without J_x.

This is because the loss on joints (here) takes 3D joints and computes the 2D projection inside the function. You will have to modify this function to compute the projection only for the predicted 3D joints but use the GT 2D joints as-is. Also edit the line where the loss is computed.

So J_x = J_2d_x? I ask because I found that ytrue is not processed in the reprojection function.

def reprojection(ytrue, ypred):
    # ytrue: GT 2D joints as (x, y, confidence), e.g. J_2d_x from openpose
    # ypred: predicted 3D joints
    b_size = tf.shape(ypred)[0]
    projection_matrix = perspective_projection(FOCAL_LENGTH, CAMERA_CENTER, IMG_SIZE, IMG_SIZE, .1, 10)
    projection_matrix = tf.tile(tf.expand_dims(projection_matrix, 0), (b_size, 1, 1))

    # project the predicted 3D joints to 2D: homogeneous coordinates, then divide out the last component
    ypred_h = tf.concat([ypred, tf.ones_like(ypred[:, :, -1:])], axis=2)
    ypred_proj = tf.matmul(ypred_h, projection_matrix)
    ypred_proj /= tf.expand_dims(ypred_proj[:, :, -1], -1)

    # squared error against the GT 2D joints, weighted by the detection confidence channel
    return K.mean(K.square((ytrue[:, :, :2] - ypred_proj[:, :, :2]) * tf.expand_dims(ytrue[:, :, 2], -1)))

bharat-b7 commented 4 years ago

@LiuYuZzz this should work. Can you try this out and report if this works? Thanks.

LiuYuZzz commented 4 years ago

@LiuYuZzz this should work. Can you try this out and report if this works? Thanks.

@bharat-b7 Yes, I tried it and got better results. Thank you for your help!

xiezhy6 commented 4 years ago

I ran PGN on my cropped image (screenshot attached) and got the segmentation results: 023622.mat, 023622.png, 023622_vis.png

The following is 023622_vis.png: (screenshot attached)

The following is 023622.mat: (screenshot attached)

Then how can we build the data in test_data.pkl for image_0 (suppose this is the first image)? I noticed that image_x in test_data.pkl looks like this: (screenshot attached)

The color is quite different.

Thanks again!

Best, Frank

Hi, have you solved this problem? The image_x seems like a SMPL-based segmentation. How can we obtain such a SMPL-based segmentation when using our own data? The head in our own person images is quite different from the head of the SMPL model, since real person images usually have hair.

Another question: is it necessary to set the value of the head, hands and feet in the segmentation to 255? I find that these regions in the sample segmentation are all set to 255. Maybe the author can give me more details about how to obtain the segmentation. @bharat-b7

Look forward to your reply. Thanks!

bharat-b7 commented 4 years ago

Please see the data processing steps in the readme. Once you get the segmentation labels from PGN, you just need to change the colours as follows: Pants (65, 0, 65), Short-Pants (0, 65, 65), Shirt (145, 65, 0), T-Shirt (145, 0, 65) and Coat (0, 145, 65); skin and hair are set to white. The colour choice was decided arbitrarily when training MGN; there is nothing technical about it.
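
A rough sketch of that recolouring step; the output colours are the ones from the readme, but the PGN class ids on the left are placeholders that must be checked against the PGN/CIHP label list (which is exactly the question asked further down):

import numpy as np

MGN_COLOURS = {
    'Pants': (65, 0, 65), 'Short-Pants': (0, 65, 65), 'Shirt': (145, 65, 0),
    'T-Shirt': (145, 0, 65), 'Coat': (0, 145, 65), 'Skin': (255, 255, 255),
}
# placeholder ids -- replace with the correct PGN class indices for your garments and skin/hair
PGN_TO_MGN = {9: 'Pants', 5: 'T-Shirt', 7: 'Coat', 2: 'Skin', 13: 'Skin', 14: 'Skin', 15: 'Skin'}

def recolour(pgn_labels):
    # pgn_labels: (H, W) integer class map from PGN; returns an (H, W, 3) MGN-style segmentation
    out = np.zeros(pgn_labels.shape + (3,), dtype=np.uint8)
    for pgn_id, name in PGN_TO_MGN.items():
        out[pgn_labels == pgn_id] = MGN_COLOURS[name]
    return out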

xiezhy6 commented 4 years ago

Please see the data processing steps in the readme. Once you get the segmentation labels from PGN, you just need to change the colours as follows: Pants (65, 0, 65), Short-Pants (0, 65, 65), Shirt (145, 65, 0), T-Shirt (145, 0, 65) and Coat (0, 145, 65); skin and hair are set to white. The colour choice was decided arbitrarily when training MGN; there is nothing technical about it.

Thanks for your reply! It helps me a lot.

Now I have another question about "J_2d_x" in test_data.pkl. As you mentioned above, "J_2d_x" can be obtained with openpose. However, after I ran the openpose test script on my own data, I got a list of length 75 for the 2D keypoints, which I reshaped into a (25, 3) array. Each tuple represents the x coordinate, y coordinate, and prediction confidence of an estimated keypoint. However, although the shape of "J_2d_x" is (2, 25, 3), the range of its values is quite different from that of the openpose predictions. It seems like "J_2d_x" no longer represents the x coordinate, y coordinate, and confidence of each keypoint. So, how can I transform the openpose predictions into the data format of "J_2d_x" in test_data.pkl?

Sorry to bother you again. Looking forward to your reply~ Thanks again!

imbinwang commented 4 years ago

Hi @xiezhy6

I had the same problem and solved it by setting the openpose flag --keypoint_scale 4. You can refer to the openpose code for details.
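
For anyone stuck on the same step, a hedged sketch of reading the OpenPose JSON output back into the (25, 3) layout; the flag and key names follow the OpenPose demo documentation, and the paths are placeholders:

import json
import numpy as np

# Example OpenPose run (paths are placeholders):
#   ./build/examples/openpose/openpose.bin --image_dir ./imgs --write_json ./keypoints --keypoint_scale 4

def load_j2d(json_path):
    with open(json_path) as f:
        data = json.load(f)
    kp = data['people'][0]['pose_keypoints_2d']           # flat list of length 75 for BODY_25
    return np.array(kp, dtype='float32').reshape(25, 3)   # (x, y, confidence) per joint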

lafith commented 3 years ago

Please see the data processing steps in the readme. Once you get the segmentation labels from PGN, you just need to change the colours as follows: Pants (65, 0, 65), Short-Pants (0, 65, 65), Shirt (145, 65, 0), T-Shirt (145, 0, 65) and Coat (0, 145, 65); skin and hair are set to white. The colour choice was decided arbitrarily when training MGN; there is nothing technical about it.

Hi @bharat-b7, the image below shows the labels as defined in the PGN repo. Which of these correspond to the garment labels mentioned above? (screenshot attached: Screenshot from 2020-10-09 16-53-37)