Closed: karenyun closed this issue 5 years ago
For the `loss_fn` in `hico.py`, we initially experimented with the given `mse_loss`. In the end we did not use it; instead, we wrote our own loss.
For the second problem, that was a mistake in this version. The line `hidden_node_states[passing_round+1][:, :, i_node] = h_v` needs to be added to the model. We found that adding this line actually improves performance: for detection on CAD-120, the F1 score goes from 88.9 to 91.3 for subactivity and from 88.8 to 91.6 for affordance.
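A minimal sketch of why that write-back matters, using a toy scalar update in plain Python rather than the repository's actual model. The function name `propagate`, the averaging update rule, and the assumption that every round's buffer starts as a copy of the initial states are all illustrative and not taken from the code base.

```python
def propagate(initial_states, adjacency, rounds, write_back=True):
    """Toy message passing on scalar node states.

    hidden[r][i] holds the state of node i after r rounds.
    """
    num_nodes = len(initial_states)
    # Assumption: every round's buffer is pre-filled with the initial states.
    hidden = [list(initial_states) for _ in range(rounds + 1)]
    for passing_round in range(rounds):
        for i_node in range(num_nodes):
            # Sum of incoming messages, weighted by the adjacency matrix.
            message = sum(
                adjacency[j][i_node] * hidden[passing_round][j]
                for j in range(num_nodes)
            )
            # Stand-in for the learned update function (hypothetical rule).
            h_v = 0.5 * hidden[passing_round][i_node] + 0.5 * message
            if write_back:
                # The counterpart of the line discussed above: store the new state.
                hidden[passing_round + 1][i_node] = h_v
    return hidden


adjacency = [[0, 1], [1, 0]]
with_update = propagate([1.0, 0.0], adjacency, rounds=2)
without_update = propagate([1.0, 0.0], adjacency, rounds=2, write_back=False)
```

Without the write-back, the final round still holds the initial states, so the readout never sees any propagated information.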
Hi, thanks for the quick reply. For the first problem, I could not find where the L1 loss for the 'adj_matrix' is computed in 'hico.py'.
Hi, @karenyun , for the first problem, I still don't understand how to detect the human-object pairs without an L1 loss between the 'pred_adj_matrix' and the 'adj_matrix'. Am I misunderstanding something? Could you give me some advice? Thank you very much!
Hi, @Gaoyiminggithub , sorry, I am also confused about that. The paper states that there is an L1 loss for the adj_matrix, but there is nothing in the code.
@SiyuanQi Hi, could you give us some advice about the L1 loss for the adj_matrix? Thank you very much!
@karenyun @Gaoyiminggithub The adjacency matrix is pretrained in another file: https://github.com/SiyuanQi/gpnn/blob/master/src/python/hico_graph.py#L128-L130
@SiyuanQi Thanks for the quick reply.
Does it mean that we need to run `python hico_graph.py` to train the adjacency matrix, and then run `python hico.py`?
I think I am misunderstanding something. If you could give some advice, that would be great. Thank you very much!
@Gaoyiminggithub Yes, you would need to run `python hico_graph.py` first. Then `python hico.py` will load the pre-trained model.
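The two-stage order can be scripted so that the second stage never runs on a failed pretraining. This is only a sketch: it assumes both scripts are launched from the directory that contains them and need no extra command-line arguments, and the helper names `build_commands` and `run_pipeline` are hypothetical.

```python
import subprocess
import sys

# Order matters: the link function is pretrained first, then hico.py loads it.
STAGES = ["hico_graph.py", "hico.py"]


def build_commands(python=sys.executable, stages=STAGES):
    """Return one argument list per stage, in execution order."""
    return [[python, script] for script in stages]


def run_pipeline(commands, runner=subprocess.run):
    """Run each stage in turn; check=True aborts if a stage exits non-zero."""
    for command in commands:
        runner(command, check=True)


# Usage (not executed here): run_pipeline(build_commands())
```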
@SiyuanQi Thank you for the quick reply!
@SiyuanQi Thanks very much! Could you add more detailed instructions to the README when you are available?
@SiyuanQi Thanks very much! I still have some questions.
I have tried running `python hico_graph.py` first and then `python hico.py`. Compared with just running `python hico.py`, the result is comparable (mAP 11.5 vs 11.6 on the HICO-DET test set). This suggests that pretraining the adjacency matrix doesn't help the final results.
Could you give more advice about that? Thank you very much!
(PS: I ran the code with validation_ratio = 0.25)
@Gaoyiminggithub That seems lower than expected. Try adding the graph loss to `hico.py` as well; that should be very helpful. That's the way it was originally implemented, but it seems I removed it for an ablation and forgot to add it back.
Make sure you have a good link function and make sure the pre-trained link function is correctly loaded here.
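A minimal sketch of what adding a graph loss to the joint objective could look like, in plain Python on nested lists. It uses the L1 form that the paper is said to describe earlier in this thread; the function names, the mean reduction, and the `graph_weight` parameter are assumptions for illustration, not the repository's implementation.

```python
def l1_loss(pred_adj_matrix, adj_matrix):
    """Mean absolute difference between predicted and ground-truth adjacency."""
    total = 0.0
    count = 0
    for pred_row, target_row in zip(pred_adj_matrix, adj_matrix):
        for pred, target in zip(pred_row, target_row):
            total += abs(pred - target)
            count += 1
    return total / count


def joint_loss(label_loss, pred_adj_matrix, adj_matrix, graph_weight=1.0):
    """Label loss plus a weighted graph loss (graph_weight is hypothetical)."""
    return label_loss + graph_weight * l1_loss(pred_adj_matrix, adj_matrix)


pred = [[0.0, 0.5], [0.5, 0.0]]
target = [[0, 1], [1, 0]]
graph_term = l1_loss(pred, target)
total = joint_loss(1.0, pred, target, graph_weight=2.0)
```

In a real training loop the same idea would be expressed with the framework's tensor losses so that gradients flow into the link function.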
Hi, @SiyuanQi , sorry, I still have some questions.
By 'graph loss', do you mean the 'L1 loss' for the 'adj_matrix'? I could not find where this loss is computed in either 'hico.py' or 'hico_graph.py'.
And why do we have to run 'hico_graph.py' first to get the 'pretrained_adj_matrix' rather than training from scratch? In other words, the 'adj_matrix' learned by the model already represents the probability that two nodes are adjacent, so what is the purpose of obtaining the 'pretrained_adj_matrix'?
Sorry to bother you again!
@karenyun 1. You can use whatever loss you feel appropriate to train the link function. 2. Pretraining the link function makes the joint training easier.
@SiyuanQi Thanks! 😂 I was confused about something, but now I get it. Thanks very much!
Hi @SiyuanQi ,
I still have some problems when reading the code. Could you give me some suggestions when you are available? Thanks very much!
What is the difference between the 'det_features' and the 'bbox_features'?
- Are the 'det_features' the outputs of RoI pooling?
- And are the 'bbox_features' generated by 'resnet' or 'vgg'?
- Why do you use separate positions for the 'object features' and the 'human features' in the 'node_features'?
- The 'edge_features' do not seem to contain features between human nodes, even though there are 'person-person' interaction classes?
- In the function 'parse_classes' of the file 'vcoco/parse_features.py', how do you make sure that 'edge_num = det_classes.shape[0] - node_num'?
- According to 'parse_features.py', the 1st and 2nd dimensions of 'edge_features' should match the size of 'adj_mat', but I found one sample where they do not. I am very confused 😂
Your adj_mat is from 59691 while your edge_feature is from 59692; I think that's why they don't match.
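A small sanity check along these lines can catch this kind of mix-up before training. It is only a sketch: it assumes the two numbers above are sample identifiers, that 'edge_features' has shape (N, N, D) and 'adj_mat' has shape (N, N), and the helper name `find_mismatches` is hypothetical.

```python
def find_mismatches(adj_id, adj_shape, edge_id, edge_shape):
    """Return a list of problems; an empty list means the pair is consistent."""
    problems = []
    if adj_id != edge_id:
        problems.append(
            f"sample ids differ: adj_mat from {adj_id}, edge_features from {edge_id}"
        )
    # Only the first two dimensions of edge_features must match adj_mat.
    if tuple(edge_shape[:2]) != tuple(adj_shape):
        problems.append(
            f"shape mismatch: edge_features {tuple(edge_shape[:2])} "
            f"vs adj_mat {tuple(adj_shape)}"
        )
    return problems


ok = find_mismatches("59691", (3, 3), "59691", (3, 3, 16))
bad = find_mismatches("59691", (3, 3), "59692", (4, 4, 16))
```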
Hi, @SiyuanQi , thanks for the excellent work. When I read the function "loss_fn" in 'hico.py', I noticed that although 'mse_loss' is passed in, it is never actually computed in this function.
And in the function 'train()', I think 'hidden_node_states' needs to be updated with the messages learned from the link function, according to the paper?
Am I misunderstanding something? Could you give me some advice? Thanks very much!