SiyuanQi-zz / gpnn

Code for ECCV2018 - Learning Human-Object Interactions by Graph Parsing Neural Networks.

MSE Loss #18

Closed karenyun closed 5 years ago

karenyun commented 5 years ago

Hi @SiyuanQi, thanks for the excellent work. While reading the function 'loss_fn' in 'hico.py', I found that although 'mse_loss' is defined, it is never actually computed in this function?

Also, in the function 'train()', I think 'hidden_node_states' needs to be updated, since according to the paper the messages are learned from the link function?

Am I misunderstanding something? Could you give me some advice? Thanks very much!

SiyuanQi-zz commented 5 years ago

For the loss_fn in hico.py, we initially experimented with the given mse_loss. In the end we did not use that. Instead, we wrote our own loss.

For the second problem, that was a mistake in this version. The line hidden_node_states[passing_round+1][:, :, i_node] = h_v needs to be added to the model. We found that adding this line actually improves performance: for detection on CAD-120, the F1 score goes from 88.9 to 91.3 for subactivity and from 88.8 to 91.6 for affordance.
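To illustrate why the missing line matters, here is a minimal NumPy sketch of a message passing loop. It is not the repository's code: the function name, the tanh update, the shared (nodes, nodes) adjacency matrix, and the write_back flag are all assumptions made for illustration. Only the indexed assignment mirrors the line quoted above. Without the write-back, every round reads the unchanged initial states, so propagation has no effect.

```python
import numpy as np

def message_passing(node_features, adj, rounds=3, write_back=True):
    """Toy propagation; each state tensor has shape (batch, feat, nodes)."""
    num_nodes = node_features.shape[2]
    # One state tensor per round, plus the initial state at index 0.
    hidden_node_states = [node_features.copy() for _ in range(rounds + 1)]
    for passing_round in range(rounds):
        h_prev = hidden_node_states[passing_round]
        for i_node in range(num_nodes):
            # Message: adjacency-weighted sum of all nodes' previous states.
            m_v = (h_prev * adj[:, i_node][None, None, :]).sum(axis=2)
            # Stand-in for a learned update function (assumption).
            h_v = np.tanh(h_prev[:, :, i_node] + m_v)
            if write_back:
                # The assignment discussed above: store the new state so the
                # next round (and the readout) can see it.
                hidden_node_states[passing_round + 1][:, :, i_node] = h_v
    return hidden_node_states[-1]

if __name__ == "__main__":
    x = np.ones((2, 4, 3))
    adj = np.ones((3, 3)) - np.eye(3)
    # Without the write-back the output is just the input again.
    print(np.allclose(message_passing(x, adj, write_back=False), x))
```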

karenyun commented 5 years ago

Hi, thanks for the quick reply. Regarding the first problem, I couldn't find where you compute the L1 loss for the 'adj_matrix' in 'hico.py'?

Gaoyiminggithub commented 5 years ago

Hi @karenyun, regarding the first problem, I still don't understand how the human-object pairs can be detected without an L1 loss between 'pred_adj_matrix' and 'adj_matrix'. Am I misunderstanding something? Could you give me some advice? Thank you very much!

karenyun commented 5 years ago

Hi @Gaoyiminggithub, sorry, I am also confused about that. The paper states that there is an L1 loss for the adj_matrix, but there is nothing corresponding to it in the code.

Gaoyiminggithub commented 5 years ago

@SiyuanQi Hi, could you give us some advice about the L1 loss for the adj_matrix? Thank you very much!

SiyuanQi-zz commented 5 years ago

@karenyun @Gaoyiminggithub The adjacency matrix is pretrained in another file: https://github.com/SiyuanQi/gpnn/blob/master/src/python/hico_graph.py#L128-L130

Gaoyiminggithub commented 5 years ago

@SiyuanQi Thanks for the quick reply. Does it mean that we need to run python hico_graph.py to train the adjacency matrix first, and then run python hico.py? I think I am misunderstanding something. If you could give some advice, that would be great. Thank you very much!

SiyuanQi-zz commented 5 years ago

@Gaoyiminggithub Yes, you would need to run python hico_graph.py first. Then python hico.py will load the pre-trained model.
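The two-stage order described here can be sketched as follows. This is only an illustration under stated assumptions: the function names, the pickle format, and the file name link_function.pkl are hypothetical and do not reflect how the repository stores its checkpoints. The point of the sketch is that the second stage should fail loudly when the pre-trained link function is missing, rather than silently training from scratch.

```python
import os
import pickle
import tempfile

def pretrain_link_function(checkpoint_path):
    """Stage 1 stand-in: 'train' a link function and save its parameters."""
    params = {"link_weights": [0.1, 0.2, 0.3]}  # placeholder values, not real weights
    with open(checkpoint_path, "wb") as f:
        pickle.dump(params, f)
    return params

def load_pretrained_link_function(checkpoint_path):
    """Stage 2 stand-in: load the saved parameters, failing loudly if absent."""
    if not os.path.isfile(checkpoint_path):
        raise FileNotFoundError(
            "No pre-trained link function at %s; run the pre-training stage first"
            % checkpoint_path
        )
    with open(checkpoint_path, "rb") as f:
        return pickle.load(f)

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "link_function.pkl")  # hypothetical file name
        pretrain_link_function(path)
        print(load_pretrained_link_function(path))
```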

Gaoyiminggithub commented 5 years ago

@SiyuanQi Thank you for the quick reply!

karenyun commented 5 years ago

@SiyuanQi Thanks very much! Could you add more detailed instructions to the README when you are available?

Gaoyiminggithub commented 5 years ago

@SiyuanQi Thanks very much! I still have some questions. I tried running python hico_graph.py first and then python hico.py. Compared with running only python hico.py, the results are nearly identical (mAP 11.5 vs 11.6 on the HICO-DET test set). This suggests that pre-training the adjacency matrix does not help the final results. Could you give more advice about that? Thank you very much! (PS: I ran the code with validation_ratio = 0.25)

SiyuanQi-zz commented 5 years ago

@Gaoyiminggithub That seems lower than expected. Try adding the graph loss to hico.py as well; that should be very helpful. That is how it was originally implemented, but it seems I removed it for an ablation and forgot to add it back. Make sure you have a good link function and make sure the pre-trained link function is correctly loaded here.
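One possible way to add a graph loss to the joint objective is sketched below in NumPy. This is not the author's original implementation: the function names, the mean-reduced L1 form, and the graph_weight parameter are assumptions. The names pred_adj_matrix and adj_matrix follow the identifiers used earlier in this thread.

```python
import numpy as np

def l1_graph_loss(pred_adj_matrix, adj_matrix):
    """Mean absolute difference between predicted and ground-truth adjacency."""
    return np.abs(pred_adj_matrix - adj_matrix).mean()

def joint_loss(label_loss, pred_adj_matrix, adj_matrix, graph_weight=1.0):
    """Label loss plus a weighted graph loss (the weighting is an assumption)."""
    return label_loss + graph_weight * l1_graph_loss(pred_adj_matrix, adj_matrix)

if __name__ == "__main__":
    pred = np.array([[0.5, 0.0], [1.0, 1.0]])
    target = np.array([[0.0, 0.0], [1.0, 0.0]])
    print(joint_loss(1.0, pred, target))
```

In a PyTorch training loop the same idea would apply to tensors so that gradients flow back into the link function.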

karenyun commented 5 years ago

Hi, @SiyuanQi , sorry I still have some questions.

  1. By 'graph loss', do you mean the 'L1 loss' for the 'adj_matrix'? I could not find where this loss is computed in either 'hico.py' or 'hico_graph.py'.

  2. Why do we have to run 'hico_graph.py' first to obtain the 'pretrained_adj_matrix' rather than training from scratch? In other words, the 'adj_matrix' learned by the model already represents the probability that two nodes are adjacent, so what is the purpose of obtaining the 'pretrained_adj_matrix'?

Sorry to bother you again!

SiyuanQi-zz commented 5 years ago

@karenyun 1. You can use whatever loss you feel appropriate to train the link function. 2. Pretraining the link function makes the joint training easier.

karenyun commented 5 years ago

@SiyuanQi Thanks! 😂 I was confused about something, but now I get it. Thanks very much!

karenyun commented 5 years ago

Hi @SiyuanQi ,

I still have some problems understanding the code. Could you give me some suggestions when you are available? Thanks very much!

  1. What is the difference between 'det_features' and 'bbox_features'? 1) Are the 'det_features' the outputs of RoIPooling? 2) And are the 'bbox_features' generated from 'resnet' or 'vgg'?

  2. Why do you place the 'object features' and the 'human features' at separate positions in 'node_features'?

  3. The 'edge_features' do not seem to contain features between human nodes, although 'person-person' interaction classes exist?

  4. In the function 'parse_classes' of the file 'vcoco/parse_features.py', how do you make sure that 'edge_num = det_classes.shape[0] - node_num'?

  5. According to 'parse_features.py', the 1st and 2nd dimensions of 'edge_features' should match the size of 'adj_mat', but I found one sample where they differ. I am very confused 😂 [image]

LouisChen0104 commented 5 years ago

Your adj_mat is from 59691 while your edge_features are from 59692; I think that is why they don't match.
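A mismatch like this is easier to catch with an explicit check when a sample is loaded. The helper below is a hypothetical sketch, not part of the repository. It assumes 'edge_features' has shape (nodes, nodes, feature_dim) and 'adj_mat' has shape (nodes, nodes), as described in question 5 above.

```python
import numpy as np

def check_graph_shapes(edge_features, adj_mat):
    """Raise if the first two dims of edge_features differ from adj_mat's shape."""
    if edge_features.shape[:2] != adj_mat.shape:
        raise ValueError(
            "edge_features %s does not match adj_mat %s; "
            "were they loaded from different samples?"
            % (edge_features.shape, adj_mat.shape)
        )

if __name__ == "__main__":
    # Matching shapes pass silently.
    check_graph_shapes(np.zeros((3, 3, 8)), np.zeros((3, 3)))
    # Mismatched shapes, e.g. files from two different samples, raise an error.
    try:
        check_graph_shapes(np.zeros((3, 3, 8)), np.zeros((4, 4)))
    except ValueError as err:
        print(err)
```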