Closed xieyuhaoli closed 3 years ago
Yes, as stated in the paper, we select 7 heat maps of size 56 by 56 corresponding to the 7 most prominent key-points given the orientation of the vehicle, apply a Gaussian kernel to dilate the heat maps, concatenate them with the 56 by 56 intermediate feature maps of the ResNet model, and pass the result to the remaining ResNet blocks.
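For readers trying to reproduce this step, here is a minimal NumPy sketch of the select, dilate, and concatenate sequence described above. It is not the repository's code. The 20 key-point channels, the 256 feature channels, the kernel size and sigma, and the `HYPOTHETICAL_VISIBLE` lookup table are all assumptions made for illustration; the actual orientation-to-key-point mapping is defined in the paper.

```python
import numpy as np

# Hypothetical lookup: orientation class -> indices of the 7 heat maps to keep.
# The real table comes from the paper; these indices are placeholders.
HYPOTHETICAL_VISIBLE = {0: [0, 1, 2, 3, 4, 5, 6]}


def gaussian_kernel(size=7, sigma=2.0):
    """Return a normalized 2D Gaussian kernel of shape (size, size); size must be odd."""
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    kernel = np.outer(g, g)
    return kernel / kernel.sum()


def dilate_heatmaps(heatmaps, kernel):
    """Smooth each (H, W) map in a (C, H, W) stack with zero padding, keeping H and W."""
    k = kernel.shape[0]
    pad = k // 2
    _, h, w = heatmaps.shape
    padded = np.pad(heatmaps, ((0, 0), (pad, pad), (pad, pad)))
    out = np.zeros(heatmaps.shape, dtype=np.float64)
    # Sum of shifted copies weighted by the (symmetric) kernel.
    for dy in range(k):
        for dx in range(k):
            out += kernel[dy, dx] * padded[:, dy:dy + h, dx:dx + w]
    return out


def fuse(features, heatmaps, selected, kernel):
    """Concatenate the dilated, selected heat maps with the features along the channel axis."""
    dilated = dilate_heatmaps(heatmaps[selected], kernel)
    return np.concatenate([features, dilated], axis=0)


rng = np.random.default_rng(0)
features = rng.standard_normal((256, 56, 56))  # assumed intermediate feature maps
heatmaps = rng.random((20, 56, 56))            # assumed key-point heat maps
orientation = 0                                # assumed predicted orientation class
kernel = gaussian_kernel()
fused = fuse(features, heatmaps, HYPOTHETICAL_VISIBLE[orientation], kernel)
print(fused.shape)  # (263, 56, 56)
```

Under these assumptions, the block that receives `fused` needs a first convolution that accepts 263 input channels rather than 256.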
Yes, I selected seven channels for feature concatenation according to the predicted orientation. I would like to know the details of your training. The baseline was trained for 20 epochs; how many epochs did you train once the key-points were added?
------------------ Original message ------------------ From: "Pirazh/Vehicle_Key_Point_Orientation_Estimation"; Sent: Friday, March 12, 2021, 12:05 AM; Subject: Re: [Pirazh/Vehicle_Key_Point_Orientation_Estimation] About the ReID code (#10)
Note that the key-point and orientation estimation section of AAVER is trained separately. Once you have trained the two stages of the key-point and orientation estimation module, you freeze its weights. This module runs in inference mode while the rest of AAVER is trained. Next, the baseline, which is the first path of AAVER, is pretrained on the CompCars dataset and then fine-tuned on the target dataset (either VeRi or VehicleID) for 20 epochs. After all this, you put the baseline and key-point estimation modules in the AAVER pipeline and train the orientation-conditioned feature extraction branch.
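The freezing step can be sketched in PyTorch as follows. This is an illustration under stated assumptions, not the repository's training script: `keypoint_net` and `branch` are tiny hypothetical stand-ins for the key-point module and the orientation-conditioned branch, and the loss, optimizer, and learning rate are placeholders. Whether the baseline path is also frozen at this stage is not stated in this thread, so it is left out of the sketch.

```python
import torch
from torch import nn


def freeze(module: nn.Module) -> nn.Module:
    """Disable gradients for every parameter and switch the module to eval mode."""
    for p in module.parameters():
        p.requires_grad = False
    module.eval()  # BatchNorm and Dropout now behave as in inference
    return module


# Hypothetical stand-in for the separately trained key-point module.
keypoint_net = freeze(
    nn.Sequential(nn.Conv2d(3, 20, kernel_size=3, padding=1), nn.BatchNorm2d(20))
)

# Hypothetical stand-in for the branch that is trained in the last stage.
branch = nn.Sequential(
    nn.Conv2d(20, 8, kernel_size=3, padding=1),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
)

# Only the trainable branch is handed to the optimizer.
optimizer = torch.optim.SGD(branch.parameters(), lr=0.01)

images = torch.randn(2, 3, 56, 56)
with torch.no_grad():  # the frozen module only produces inputs for the branch
    heatmaps = keypoint_net(images)

loss = branch(heatmaps).pow(2).mean()  # placeholder loss
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

One thing to watch when reproducing: calling `.train()` on a parent module that contains the frozen module switches it back to training mode, so its BatchNorm statistics would start updating again unless `.eval()` is re-applied afterwards.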
You can find all the training details in the implementation details section.
Since there is no further discussion here, I am closing this issue.
Excuse me, are you sure that the lower branch only concatenates the seven selected key-point channels? In my reproduction, the accuracy is lower than the baseline. Do you know why?