wangguanan / AlignGAN

[ICCV2019] RGB-Infrared Cross-Modality Person Re-Identification via Joint Pixel and Feature Alignment

Question of dataset process #5

Closed: L-CODERS closed this issue 4 years ago

L-CODERS commented 4 years ago

Thanks for your work. I want to ask why all images, including the IR images, are converted to RGB when they are loaded. I found this in the following line of code: 'Image.open(img_path).convert('RGB')'.

wangguanan commented 4 years ago

This is because the pre-trained ResNet50 takes 3-channel images as input, so we have to pre-process all input images to 3 channels. When an IR image is loaded with convert('RGB'), it has 3 channels.
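
To illustrate the point about channels, here is a minimal sketch, assuming Pillow is installed. It builds a tiny single-channel image in memory (the pixel values are made up for illustration and are not taken from SYSU-MM01) and shows that convert('RGB') copies the one channel into R, G, and B rather than adding new information.

```python
from PIL import Image

# Hypothetical 4x2 single-channel ("L" mode) image standing in for an IR frame.
gray = Image.frombytes("L", (4, 2), bytes([0, 10, 20, 30, 40, 50, 60, 70]))

# Same call as in the data loader discussed above.
rgb = gray.convert("RGB")

print(gray.getbands())  # band names before conversion
print(rgb.getbands())   # band names after conversion

# Each RGB band is a byte-for-byte copy of the original single channel.
for band in rgb.split():
    assert band.tobytes() == gray.tobytes()
```

The resulting image has the 3-channel shape a pre-trained ResNet50 expects, while the pixel content is unchanged.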

L-CODERS commented 4 years ago

Thanks for your reply. I found that the original IR images already have 3 channels even without 'convert('RGB')', but the SYSU-MM01 dataset paper says the IR images have only 1 channel. Did you notice this as well?

wangguanan commented 4 years ago

Yes, but I don't think it is a problem. Even though the IR images are stored in an RGB file format, their content is still IR. What do you think?

L-CODERS commented 4 years ago

I also don't think it's a problem. I was just curious about the difference in channel count between the actual dataset and the description in the paper. Maybe the way the images were saved turned 1 channel into 3.
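
One way to test the guess about how the images were saved is to check whether the three stored channels are identical. The sketch below assumes Pillow is installed. The helper name channels_are_identical and the in-memory test images are hypothetical, and nothing here has been run on the actual SYSU-MM01 files.

```python
from PIL import Image

def channels_are_identical(img):
    """Return True if the R, G, and B bands are equal after convert('RGB')."""
    r, g, b = img.convert("RGB").split()
    return r.tobytes() == g.tobytes() == b.tobytes()

# In-memory stand-ins (hypothetical data). On real files, use Image.open(path).
band = Image.frombytes("L", (2, 2), bytes([5, 100, 200, 255]))
stored_as_rgb = Image.merge("RGB", (band, band, band))  # gray content in 3 channels
true_color = Image.merge("RGB", (band, band.point(lambda v: 255 - v), band))

print(channels_are_identical(stored_as_rgb))  # True
print(channels_are_identical(true_color))     # False
```

If a check like this returned True for the IR files, it would be consistent with single-channel content that was simply written out in a 3-channel format.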

L-CODERS commented 4 years ago

Hi, it seems to me that tri_loss_1 and tri_loss_2 compute the same value:

tri_loss_1 = base.compute_triplet_loss(fake_ir_embedding_list, real_ir_embedding_list, real_ir_embedding_list, rgb_pids, ir_pids, ir_pids)
tri_loss_2 = base.compute_triplet_loss(real_ir_embedding_list, fake_ir_embedding_list, fake_ir_embedding_list, rgb_pids, ir_pids, ir_pids)

Could you tell me what the difference is?

wangguanan commented 4 years ago

The anchors are different.
tri_loss_1 uses RGB (translated to fake_ir) as the anchor, and IR as the positive and negative.
tri_loss_2 uses IR as the anchor, and RGB (translated to fake_ir) as the positive and negative.
This is inspired by Eq. (2) and Eq. (3) of 'Cross-Modality Person Re-Identification with Generative Adversarial Training' (IJCAI 2018).
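
To see why swapping the anchor set changes the result, here is a minimal pure-Python sketch. The function batch_hard_triplet_loss, its hard-mining rule, the margin, and the toy embeddings are all assumptions made for illustration. This is not the repository's compute_triplet_loss, whose internals are not shown in this thread.

```python
import math

def batch_hard_triplet_loss(anchor_emb, pool_emb, anchor_pids, pool_pids, margin=1.0):
    """Hypothetical batch-hard triplet loss.

    For each anchor, positives and negatives are both drawn from pool_emb:
    the farthest same-identity sample and the closest other-identity sample.
    """
    losses = []
    for a, a_pid in zip(anchor_emb, anchor_pids):
        pos = [math.dist(a, p) for p, pid in zip(pool_emb, pool_pids) if pid == a_pid]
        neg = [math.dist(a, p) for p, pid in zip(pool_emb, pool_pids) if pid != a_pid]
        losses.append(max(0.0, margin + max(pos) - min(neg)))
    return sum(losses) / len(losses)

# Toy 2-D embeddings (made up): two translated-RGB samples, three real IR samples.
fake_ir = [[0.0, 0.0], [3.0, 0.0]]
fake_ir_pids = [0, 1]
real_ir = [[1.0, 0.0], [2.0, 0.0], [4.0, 0.0]]
real_ir_pids = [0, 0, 1]

# Anchor = fake IR, positives and negatives = real IR (the role of tri_loss_1).
loss_1 = batch_hard_triplet_loss(fake_ir, real_ir, fake_ir_pids, real_ir_pids)
# Anchor = real IR, positives and negatives = fake IR (the role of tri_loss_2).
loss_2 = batch_hard_triplet_loss(real_ir, fake_ir, real_ir_pids, fake_ir_pids)

print(loss_1, loss_2)
```

In this toy case the two values differ (0.5 versus roughly 0.667). The hardest positive and negative are chosen per anchor, so a different anchor set means different triplets are mined and a different number of terms is averaged.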

wangguanan commented 4 years ago

As for the channels, maybe you can ask the authors of the SYSU-MM01 dataset for more details.

L-CODERS commented 4 years ago

Thanks for your patient answers. I think my question is resolved.

wangguanan commented 4 years ago

Thanks for your attention to our work.