NVlabs / DG-Net

:couple: Joint Discriminative and Generative Learning for Person Re-identification. CVPR'19 (Oral) :couple:
https://www.zdzheng.xyz/publication/Joint-di2019

Some questions about the appearance space and the structure space #37

Open SY-Xuan opened 4 years ago

SY-Xuan commented 4 years ago

Hello, in your paper, I think that if we use the appearance code of ID I and the structure code of ID J to generate an image, the ID of the generated image should be J. So I think the structure space encodes the ID information. Yet this loss uses the ID of the appearance space to determine the ID of the generated image: [equation screenshot]

But this loss uses the ID of the structure space to determine the ID of the generated image: [equation screenshot]

And you also mentioned in the paper that

When training on images organized in this manner, the discriminative module is forced to learn the fine-grained id-related attributes (such as hair, hat, bag, body size, and so on) that are independent of clothing.

And obviously the structure space encodes the hair, hat, bag, and body size. Therefore, the structure space encodes the ID.

So it is confusing that the appearance space is used to discriminate the ID. Could you please explain this?
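
For concreteness, here is a minimal sketch of the two supervision choices being compared; the classifier, images, and labels below are toy placeholders, not the repo's actual variables:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy stand-ins (not the repo's code): an ID classifier and a batch of generated images.
num_ids = 751                              # e.g. the Market-1501 identity count
id_classifier = nn.Sequential(nn.Flatten(), nn.Linear(3 * 256 * 128, num_ids))
gen_img = torch.randn(4, 3, 256, 128)      # images generated from appearance(i) + structure(j)
y_i = torch.randint(0, num_ids, (4,))      # IDs of the appearance-code providers
y_j = torch.randint(0, num_ids, (4,))      # IDs of the structure-code providers

logits = id_classifier(gen_img)

# First loss in the question: supervise the generated image with the appearance provider's ID.
loss_app = F.cross_entropy(logits, y_i)

# Second loss in the question: supervise it with the structure provider's ID,
# which is what pushes the network toward clothing-independent, fine-grained cues.
loss_struct = F.cross_entropy(logits, y_j)
```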

Zonsor commented 4 years ago

@BossBobxuan I also have the same question. My hypothesis is that we need to make the generated image have a high predicted probability for the IDs from both spaces, since both carry id-related information.

layumi commented 4 years ago

Hi @BossBobxuan, @Zonsor, sorry for the late response. Yes, it is possible.

However, we did not do it. The main reason is that the structure space is relatively low-level and is used to reconstruct the image. Thus, if we wanted to extract a high-level feature from it, we would need one more ResNet, which would introduce extra parameters.

So we made a trade-off: we also learn the structure-related info, e.g., hair, hat, bag, and body size, on the appearance embedding.

RonakDedhiya commented 4 years ago

Hello,

I couldn't find in the code how the fine-grained classification is done, or how the structure info is learned in the appearance embedding.

layumi commented 4 years ago

Hello,

I couldn't find in the code how the fine-grained classification is done, or how the structure info is learned in the appearance embedding.

We just use two classifiers, which do not share weights. https://github.com/NVlabs/DG-Net/blob/master/reIDmodel.py#L145-L146
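
A minimal sketch of that design, assuming only what's in the linked lines: one shared backbone feature feeding two ID heads that do not share weights (the real ft_netAB uses ClassBlock heads; the layer names below are illustrative):

```python
import torch
import torch.nn as nn

class TwoHeadReID(nn.Module):
    """Simplified sketch: a shared appearance embedding feeding two ID
    classifiers that do not share weights (cf. classifier1/classifier2 in ft_netAB)."""
    def __init__(self, feat_dim=2048, num_ids=751):
        super().__init__()
        self.backbone = nn.Sequential(                   # stand-in for the ResNet trunk
            nn.Conv2d(3, feat_dim, kernel_size=1),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        self.classifier1 = nn.Linear(feat_dim, num_ids)  # primary ID head
        self.classifier2 = nn.Linear(feat_dim, num_ids)  # fine-grained ID head

    def forward(self, x):
        f = self.backbone(x)                             # shared appearance embedding
        return self.classifier1(f), self.classifier2(f)

model = TwoHeadReID()
p1, p2 = model(torch.randn(2, 3, 256, 128))
```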

RonakDedhiya commented 4 years ago

So ft_netAB has shared base parameters for two purposes:

  1. learning f - as appearance information
  2. learning p - which does the reID learning.

I have a naive doubt: won't the shared base parameters be a trade-off between having good appearance information and having discriminative reID information? Can we use a different model for learning p?

layumi commented 4 years ago

Yes, the generation somehow affects the reID, so I applied a detach at https://github.com/NVlabs/DG-Net/blob/master/reIDmodel.py#L142
Sure, you could use a different model to learn p.
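
A hedged sketch of what that detach achieves, assuming (as the linked line suggests) that the feature handed to the generative path is detached so its gradients do not flow back into the shared backbone; the names below are illustrative, not the repo's:

```python
import torch
import torch.nn as nn

backbone = nn.Sequential(nn.Conv2d(3, 8, 1), nn.AdaptiveAvgPool2d(1), nn.Flatten())
id_head = nn.Linear(8, 10)          # reID classifier head
x = torch.randn(2, 3, 64, 32)

feat = backbone(x)
f_for_generation = feat.detach()    # cut the graph: any loss computed on this copy
                                    # (e.g. reconstruction in the generative module)
                                    # cannot push gradients into `backbone`
id_logits = id_head(feat)           # the reID loss still trains the backbone normally
```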

RonakDedhiya commented 4 years ago

Thanks for your quick response. Your work is quite inspiring and I am learning a lot from your paper.

SY-Xuan commented 4 years ago

Hi @BossBobxuan, @Zonsor, sorry for the late response. Yes, it is possible.

However, we did not do it. The main reason is that the structure space is relatively low-level and is used to reconstruct the image. Thus, if we wanted to extract a high-level feature from it, we would need one more ResNet, which would introduce extra parameters.

So we made a trade-off: we also learn the structure-related info, e.g., hair, hat, bag, and body size, on the appearance embedding.

Thanks for your response.