HaoYang0123 / Position-Focused-Attention-Network

Position Focused Attention Network for Image-Text Matching

How do you get the position information for images? #1

Open kywen1119 opened 5 years ago

kywen1119 commented 5 years ago

Hi! Great work. Could you tell me how you obtained the pre-trained position information for the images? Thanks a lot!

HaoYang0123 commented 5 years ago

The Faster R-CNN model automatically produces both the visual features (a 2048-dim vector) and the position information. The position information for a region consists of four values: the (x, y) coordinates of its top-left corner and the region's width and height.
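
As a rough illustration of that four-value layout, here is a minimal sketch. It assumes the detector returns boxes in corner format (x1, y1, x2, y2), which the thread does not confirm; the helper names `corners_to_xywh` and `normalize_xywh` are hypothetical and are not taken from this repository.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]


def corners_to_xywh(box: Box) -> Box:
    """Convert an assumed (x1, y1, x2, y2) box to (x, y, w, h).

    (x, y) is the top-left corner; w and h are the region's width and height.
    """
    x1, y1, x2, y2 = box
    return (x1, y1, x2 - x1, y2 - y1)


def normalize_xywh(box: Box, image_w: float, image_h: float) -> Box:
    """Scale an (x, y, w, h) box into [0, 1] relative to the image size.

    Normalization is an optional extra step, not something stated in the thread.
    """
    x, y, w, h = box
    return (x / image_w, y / image_h, w / image_w, h / image_h)


if __name__ == "__main__":
    region = corners_to_xywh((10.0, 20.0, 110.0, 70.0))
    print(region)
    print(normalize_xywh(region, image_w=200.0, image_h=100.0))
```

If the detector already returns (x, y, w, h), the first helper is unnecessary.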

LgQu commented 4 years ago

Hi HaoYang, I wonder how you got the raw region information (x, y, w, h). Did you run Faster R-CNN yourself, or did you get it from another source? And how do you align the region information with the precomputed features from SCAN? Thank you very much.

weiyunfei commented 4 years ago

Hello, Hao Yang. Thanks for your excellent work and code. I have some questions about the paper and the code. First, I wonder how you transformed the coordinates into 15 dimensions, as described in your comments in model_attention.py; as far as I know, the coordinates of a box should be 4-dimensional. Second, your paper says that the whole image 𝐼 is split equally into 𝐾×𝐾 blocks 𝐵, but I could not find this part in your released code.
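
For readers unfamiliar with the 𝐾×𝐾 split mentioned here, the sketch below shows one generic way to relate an (x, y, w, h) box to an equal grid of blocks by overlap area. It is only an assumption about the general idea. The thread does not confirm how the released code does this or how the 15 dimensions arise, and the names `block_overlaps`, `top_blocks`, and `top_n` are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x, y, w, h), top-left corner plus size


def block_overlaps(box: Box, image_w: float, image_h: float, k: int) -> List[float]:
    """Overlap area between a box and each block of an equal k x k grid.

    Blocks are indexed row by row: index = row * k + col.
    """
    x, y, w, h = box
    block_w, block_h = image_w / k, image_h / k
    overlaps = []
    for row in range(k):
        for col in range(k):
            bx, by = col * block_w, row * block_h
            # Width and height of the intersection, clamped at zero.
            ix = max(0.0, min(x + w, bx + block_w) - max(x, bx))
            iy = max(0.0, min(y + h, by + block_h) - max(y, by))
            overlaps.append(ix * iy)
    return overlaps


def top_blocks(box: Box, image_w: float, image_h: float, k: int,
               top_n: int) -> List[Tuple[int, float]]:
    """Return the top_n (block index, overlap area) pairs, largest overlap first.

    Ties are broken by the smaller block index.
    """
    overlaps = block_overlaps(box, image_w, image_h, k)
    ranked = sorted(range(len(overlaps)), key=lambda i: (-overlaps[i], i))
    return [(i, overlaps[i]) for i in ranked[:top_n]]


if __name__ == "__main__":
    # A box centered in a 100 x 100 image with a 2 x 2 grid touches all four blocks.
    print(top_blocks((25.0, 25.0, 50.0, 50.0), 100.0, 100.0, k=2, top_n=4))
```

Selecting a fixed number of best-overlapping blocks turns a 4-value box into a fixed-length list of block indices, which is one way a position vector longer than 4 could come about; whether that matches model_attention.py is for the author to confirm.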

Liquor520 commented 2 years ago

@LgQu Hello, I have the same questions. Have you resolved them? If so, could you share a way to contact you so we can discuss them?

Looking forward to your reply.

Liquor520 commented 2 years ago

@weiyunfei Hello, I have the same questions. Have you resolved them? If so, could you share a way to contact you so we can discuss them?

Looking forward to your reply!