question about code ‘sentence = klines_pos + enc_output[:,:,0,:].transpose(1,2)’

yosungho / LineTR

Line as a Visual Sentence: Context-aware Line Descriptor for Visual Localization (Line Transformer)

question about code ‘sentence = klines_pos + enc_output[:,:,0,:].transpose(1,2)’ #16

Open atomishcv opened 1 year ago

atomishcv commented 1 year ago

Great work! I have a question about the line ‘sentence = klines_pos + enc_output[:,:,0,:].transpose(1,2)’ in the keyline encoder in line_transformer.py. After the transformer encodes the 'des', the output has shape (1, 642, 22, 256), and it is then added to klines_pos, which has shape (1, 642, 256). Why do you take only index 0 along the third dimension instead of aggregating over that dimension, for example with max pooling? I understand that the 22 (21 + 1) entries represent the points on the line.
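To make the question concrete, here is a minimal sketch of the two options I am comparing. It uses NumPy as a stand-in for the PyTorch tensors, fills them with random values, and uses `np.swapaxes(x, 1, 2)` in place of `.transpose(1,2)`. The names `first_token`, `pooled`, `sentence_first` and `sentence_pooled` are mine, not from the repository. I also assume here that klines_pos is laid out as (1, 256, 642), since that is the shape the transposed slice would need for the addition to work.

```python
import numpy as np

rng = np.random.default_rng(0)

# Shapes quoted in the question: (batch, lines, tokens per line, channels).
B, N, T, C = 1, 642, 22, 256
enc_output = rng.standard_normal((B, N, T, C)).astype(np.float32)

# Assumption: klines_pos is (batch, channels, lines) so that it matches
# the transposed slice below.
klines_pos = rng.standard_normal((B, C, N)).astype(np.float32)

# Option 1 (what the quoted line does): keep only token 0 of each line.
first_token = enc_output[:, :, 0, :]                          # (B, N, C)
sentence_first = klines_pos + np.swapaxes(first_token, 1, 2)  # (B, C, N)

# Option 2 (the alternative I am asking about): max over all T tokens.
pooled = enc_output.max(axis=2)                               # (B, N, C)
sentence_pooled = klines_pos + np.swapaxes(pooled, 1, 2)      # (B, C, N)

print(sentence_first.shape, sentence_pooled.shape)
```

Both options give the same output shape, so the choice only changes which of the 22 entries contribute to each line's descriptor.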