Closed alice-cool closed 3 years ago
Dear scholar, may I ask the meaning of the parameter "nongt_dim" in your code? Sometimes its value is 20 and sometimes 36.
Some wording is vague to me. In the paper, the spatial relation encoder has a "no-relation" type, but the code does not seem to include that type.
Dear scholar, does your code include the inference code for all three relation types? I found that in your code the three relation models are trained independently, and only at inference time do you combine the three models' results to get the probability of a predicted answer.
Where is the explicit relation type encoded into the model? I found nothing. If the label bias comes from adj_list, it would not help, because you create the matrix only as transpose(num_roi, num_roi) and the number of labels does not vary, so it only represents whether an edge exists, not the diverse relation types. So I think the code is incomplete regarding the relation labels for the semantic and spatial types.
Dear scholar, I want to ask whether the dimension of W^dir(i,j) is d_h × (d_q + d_v) and whether the bias b^lab(i,j) is a one-hot vector? I am also unsure about the meaning of W^dir(i,j).
The bias b^lab(i,j) is just a scalar. For W^dir(i,j), we neglected multi-head attention in Eq. (8), so it might be a little confusing. W^dir(i,j) corresponds to the linear_out layer here: https://github.com/linjieli222/VQA_ReGAT/blob/9f6fe5bcda169c268eb1c92ef00df9f61d540081/model/graph_att_layer.py#L51-L53
We followed the implementation of Relation Networks for Object Detection to implement multi-head attention for better efficiency.
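As a hedged sketch of the Relation Networks trick mentioned above (the sizes and any weight-norm wrapping in the repository may differ; all dimensions here are illustrative): the per-head output projections that W^dir(i,j) plays the role of in Eq. (8) can be fused into a single grouped 1x1 convolution, so that each group mixes only its own head's values. This is cheaper than running num_heads separate Linear layers.

```python
import torch
import torch.nn as nn

num_heads, d_v, d_out = 16, 64, 1024  # illustrative sizes, not from the repo

# Grouped 1x1 conv: group g sees only head g's d_v value channels,
# so one conv implements all per-head output projections at once.
linear_out = nn.Conv2d(num_heads * d_v, d_out, kernel_size=1, groups=num_heads)

# Attended values for 10 regions, heads stacked along the channel axis.
attended = torch.randn(1, num_heads * d_v, 10, 1)
out = linear_out(attended)  # shape (1, d_out, 10, 1)
```

The `groups=num_heads` argument is what makes this equivalent to independent per-head projections rather than one dense projection over all heads.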
Dear scholar, may I ask the meaning of the parameter "nongt_dim" in your code? Sometimes its value is 20 and sometimes 36. https://github.com/linjieli222/VQA_ReGAT/blob/9f6fe5bcda169c268eb1c92ef00df9f61d540081/main.py#L101-L102
Some wording is vague to me. In the paper, the spatial relation encoder has a "no-relation" type, but the code does not seem to include that type.
In build_graph(), if the relative position between box i and box j does not fall into any of the if/else branches, then adj_matrix(i,j) does not receive a label; hence the "no-relation" type. Note that "no-relation" is only used when constructing the spatial graph; the spatial graph attention does not consider this type, as we do not assume a relation between two objects that are too far away from each other.
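To make the "implicit no-relation" control flow above concrete, here is a toy sketch. The real build_graph() tests a longer series of geometric conditions (inside, cover, overlap, relative angle, and so on); this sketch keeps only two hypothetical branches to show that pairs matching no branch simply get no label.

```python
def spatial_label(box_i, box_j):
    """Toy sketch: assign a spatial relation class to a box pair,
    or None when no branch matches (the "no-relation" case).

    Boxes are (x1, y1, x2, y2). The two branches below are
    illustrative stand-ins for the repository's full condition list.
    """
    xi1, yi1, xi2, yi2 = box_i
    xj1, yj1, xj2, yj2 = box_j
    if xj1 >= xi1 and yj1 >= yi1 and xj2 <= xi2 and yj2 <= yi2:
        return 1  # box j lies inside box i
    if xi1 >= xj1 and yi1 >= yj1 and xi2 <= xj2 and yi2 <= yj2:
        return 2  # box j covers box i
    # Falls through every branch: adj_matrix(i, j) receives no label.
    return None
```

A distant pair such as `spatial_label((0, 0, 1, 1), (50, 50, 60, 60))` returns None, which is exactly the unlabeled "no-relation" entry the answer above describes.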
Dear scholar, does your code include the inference code for all three relation types? I found that in your code the three relation models are trained independently, and only at inference time do you combine the three models' results to get the probability of a predicted answer.
We did not include the code to aggregate the three models' results. The aggregation is very straightforward: we simply take a weighted sum of the logits from each model. I believe we used alpha = 0.3 and beta = 0.3.
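A minimal sketch of that weighted sum, using the alpha and beta values mentioned above. The function name, argument order, and tensor sizes are illustrative assumptions, not code from the repository.

```python
import torch

def aggregate_logits(logits_sem, logits_spa, logits_imp, alpha=0.3, beta=0.3):
    """Weighted sum of per-model answer logits.

    alpha weights one relation model and beta a second; the remaining
    (1 - alpha - beta) weights the third. Which model gets which weight
    is an assumption here, the thread does not spell it out.
    """
    return alpha * logits_sem + beta * logits_spa + (1 - alpha - beta) * logits_imp

# Example: fuse three models' logits (batch of 2, hypothetical answer vocab of 3129)
fused = aggregate_logits(torch.randn(2, 3129), torch.randn(2, 3129), torch.randn(2, 3129))
probs = torch.sigmoid(fused)  # per-answer scores from the fused logits
```

Since the models are trained independently, this fusion needs no extra training; it is applied only at inference time.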
Where is the explicit relation type encoded into the model? I found nothing. If the label bias comes from adj_list, it would not help, because you create the matrix only as transpose(num_roi, num_roi) and the number of labels does not vary, so it only represents whether an edge exists, not the diverse relation types. So I think the code is incomplete regarding the relation labels for the semantic and spatial types.
As stated in the comments below, adj_matrix has shape [batch_size, num_rois, num_rois, num_labels]. Therefore, it is a one-hot embedding of the relation labels. https://github.com/linjieli222/VQA_ReGAT/blob/9f6fe5bcda169c268eb1c92ef00df9f61d540081/model/graph_att.py#L57
The bias layer operates on the last dimension of adj_matrix to learn a bias term for each relation type. https://github.com/linjieli222/VQA_ReGAT/blob/9f6fe5bcda169c268eb1c92ef00df9f61d540081/model/graph_att.py#L40 https://github.com/linjieli222/VQA_ReGAT/blob/9f6fe5bcda169c268eb1c92ef00df9f61d540081/model/graph_att.py#L90
Thanks for your timely replies. Thank you, this helps me a lot.