minar09 / ACGPN

"Towards Photo-Realistic Virtual Try-On by Adaptively Generating↔Preserving Image Content", CVPR 2020. (Modified from original with fixes for inference)
https://github.com/switchablenorms/DeepFashion_Try_On

Training problem #2

Open abc123yuanrui opened 4 years ago

abc123yuanrui commented 4 years ago

Thanks heaps for your good work replicating the ACGPN inference model. While training it on the VITON dataset (LIP segmentation inputs with 20 labels), I ran into two problems:

  1. The train.py forward pass inputs are different from the original model's. Is there a specific reason for this change (one input, data['image'], was removed)? [image]

  2. When I use the LIP dataset for training, I run into this error: [image]

Based on this report: https://github.com/NVlabs/SPADE/issues/57, it should be a channel mismatch problem. However, I believe all relevant channels have been mapped to LIP labels except channel 7 (the noise channel). The problem is that LIP has no counterpart label for this channel (I used the coat channel, but that apparently does not work). Any recommendation for fixing this issue?

minar09 commented 4 years ago

Hi @abc123yuanrui, thanks for your kind words. I didn't train this ACGPN end-to-end because my PC doesn't have that kind of memory, so I didn't make any changes to the training script. If there are any discrepancies between train.py and the model input, you can update it and run as you see fit (you are also welcome to submit a pull request).

Also, regarding training with a different segmentation, please see this issue: https://github.com/minar09/ACGPN/issues/1#issuecomment-695799359. The ACGPN authors used a 14-label segmentation, so you may have to update the input/output channels of the networks.

As for the noise channel, they mentioned that you can ignore it (https://github.com/switchablenorms/DeepFashion_Try_On/issues/15#issuecomment-662295710). So if it's giving an error, you can remove it and train. Hope that helps. Thank you.

abc123yuanrui commented 4 years ago

Thanks a lot for your instant response!!

I had changed both the number of channels (label_nc) and all the channel numbers in use (left and right arms, upper cloth, etc.) when I ran the training process. It took me a long time to figure out where this error comes from in pix2pix_model. The cause was that I used the CP-VTON-PLUS segmentation as input, so the real number of labels is 21, which does not match the 20 I had set.
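A mismatch like this can be caught before training by checking the label values against label_nc. The following is only a minimal sketch, assuming the segmentation maps are loaded as integer NumPy arrays; `check_label_range` and the toy array are hypothetical and not part of the ACGPN code.

```python
import numpy as np


def check_label_range(label_map, label_nc):
    """Return the label values in label_map that fall outside [0, label_nc)."""
    values = np.unique(label_map)
    return [int(v) for v in values if v < 0 or v >= label_nc]


# Toy segmentation map (hypothetical data) that contains the value 20,
# as a segmentation with 21 classes (0..20) could.
toy_map = np.array([[0, 5, 20], [13, 14, 15]], dtype=np.int64)

# Any value listed here cannot be represented with label_nc=20 channels.
out_of_range = check_label_range(toy_map, label_nc=20)
print(out_of_range)
```

Running a check like this over the whole dataset once would show whether label_nc has to be raised or the labels have to be remapped first.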

Thank you for the clarification on training the model and on the differences between segmentation inputs.

minar09 commented 4 years ago

@abc123yuanrui, you are welcome. Good to know that you solved the issue. Good luck.

abc123yuanrui commented 4 years ago

Another thing I should mention is that their network definitions (pix2pixHD.py and networks.py) are different for training and inference. That's why the original repo keeps the networks in separate folders even though the files share the same names, which also causes confusion, especially since both folders had a train.py and a test.py in the old version. As a result, the train.py here won't work, because it needs the "training version" of the models plus the VGG19 checkpoints.

minar09 commented 4 years ago

@abc123yuanrui thanks for the information. It will surely be helpful.

daxjain789 commented 4 years ago

I think this code will help you re-arrange the labels according to the paper/repo:

    import numpy as np

    # img2: 2-D array of LIP labels (0-19); test: the re-arranged label map
    test = np.zeros(shape=img2.shape)
    for i in range(img2.shape[0]):
        for j in range(img2.shape[1]):
            if img2[i][j] == 1 or img2[i][j] == 2:  # pixel 1 = hair, hat
                test[i][j] = 1
            elif img2[i][j] == 5 or img2[i][j] == 10 or img2[i][j] == 6 or img2[i][j] == 7:  # pixel 4 = upper cloth, dress, coat, jumpsuits
                test[i][j] = 4
            elif img2[i][j] == 18:  # pixel 5 = left-shoe
                test[i][j] = 5
            elif img2[i][j] == 19:  # pixel 6 = right-shoe
                test[i][j] = 6
            # pixel 7 = noise, left undefined
            elif img2[i][j] == 9 or img2[i][j] == 12 or img2[i][j] == 8:  # pixel 8 = pants, socks, skirt
                test[i][j] = 8
            elif img2[i][j] == 16:  # pixel 9 = left-leg
                test[i][j] = 9
            elif img2[i][j] == 17:  # pixel 10 = right-leg
                test[i][j] = 10
            elif img2[i][j] == 14:  # pixel 11 = left-arm
                test[i][j] = 11
            elif img2[i][j] == 13 or img2[i][j] == 4:  # pixel 12 = face, sunglasses
                test[i][j] = 12
            elif img2[i][j] == 15:  # pixel 13 = right-arm
                test[i][j] = 13
            else:  # pixel 0 = background, glove, scarf
                test[i][j] = 0

minar09 commented 4 years ago

@daxjain789, thanks for the code. Yes, it may work. You can also use pillow/numpy to make this re-arranging faster (https://github.com/minar09/ACGPN/issues/1#issuecomment-695875353).
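One way to do the faster NumPy re-arranging is a lookup table indexed by the LIP label. This is only a sketch under assumptions: it mirrors the if/elif mapping posted above, expects an integer label map with values 0 to 19, and the names `LIP_TO_ACGPN` and `remap_labels` are made up for illustration. It is not necessarily the approach in the linked comment.

```python
import numpy as np

# Lookup table: index = LIP label (0..19), value = re-arranged label.
# Labels not listed below (background, glove, scarf) stay 0.
LIP_TO_ACGPN = np.zeros(20, dtype=np.uint8)
LIP_TO_ACGPN[[1, 2]] = 1          # hat, hair
LIP_TO_ACGPN[[5, 6, 7, 10]] = 4   # upper cloth, dress, coat, jumpsuits
LIP_TO_ACGPN[18] = 5              # left-shoe
LIP_TO_ACGPN[19] = 6              # right-shoe
LIP_TO_ACGPN[[8, 9, 12]] = 8      # socks, pants, skirt
LIP_TO_ACGPN[16] = 9              # left-leg
LIP_TO_ACGPN[17] = 10             # right-leg
LIP_TO_ACGPN[14] = 11             # left-arm
LIP_TO_ACGPN[[4, 13]] = 12        # sunglasses, face
LIP_TO_ACGPN[15] = 13             # right-arm


def remap_labels(label_map):
    """Remap a whole integer label map at once via fancy indexing."""
    return LIP_TO_ACGPN[label_map]


# Toy 2x3 label map standing in for a loaded segmentation image.
img2 = np.array([[0, 2, 5], [14, 15, 11]], dtype=np.uint8)
test = remap_labels(img2)
print(test)
```

The table lookup replaces the per-pixel Python loop with a single array indexing operation, which is where the speed-up comes from. A label map read with Pillow would first need converting to an integer array, for example with np.array.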