dtc111111 / Multi-Modal-UAV

We won 1st place in UG2+, a task in the CVPR 2024 UAV Tracking and Pose-Estimation Challenge.
MIT License

Question #2

Open zxk1212 opened 1 month ago

zxk1212 commented 1 month ago

Great work! We'd like to cite your paper, but we're facing some issues while trying to reproduce the UAV Type classification process. First, could you explain how the input data preprocessing is handled in YOLOv9? Second, could you please indicate where the code for the UAV Type Classification module is located? I couldn't find the relevant code for input and output around the EfficientNet module. I would greatly appreciate it if you could continue to update the related README documentation.

ShuhongLL commented 1 month ago

We have updated the README to describe the data preprocessing with YOLOv9. The workflow is: first, use YOLOv9 to detect potential UAV targets along the trajectory; then rank the detections by confidence score and select the top-k keyframes under an interval constraint; finally, crop and save the UAV images for training. YOLOv9 is used only for UAV detection; the classification training was done with EfficientNet. We did not fine-tune YOLOv9 because the training dataset is small: it comprises four drone categories that show only small visible differences through the fish-eye camera. To train EfficientNet, you can run our preprocessing code on the target trajectories and train the model in the usual way. The training source code is not released yet because we plan to extend the current solution; updated code will be released afterwards. Thank you for your patience.
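
For readers trying to reproduce the keyframe step, here is a minimal sketch of one way "top-k by confidence under an interval constraint" could work. It is not the repository's code: the function name `select_keyframes`, the `(frame_index, confidence)` input format, and the `min_gap` parameter are all assumptions made for illustration, and the greedy strategy is one plausible reading of the description above.

```python
from typing import List, Tuple


def select_keyframes(
    detections: List[Tuple[int, float]], k: int, min_gap: int
) -> List[int]:
    """Greedily pick up to k frames by descending confidence.

    detections: hypothetical (frame_index, confidence) pairs, one per frame
                in which the detector reported a UAV.
    min_gap:    assumed interval constraint; any two selected frames must be
                at least this many frames apart.
    """
    # Highest-confidence detections are considered first.
    ranked = sorted(detections, key=lambda d: d[1], reverse=True)
    chosen: List[int] = []
    for frame, _score in ranked:
        if len(chosen) == k:
            break
        # Keep the frame only if it is far enough from every frame kept so far.
        if all(abs(frame - c) >= min_gap for c in chosen):
            chosen.append(frame)
    # Return the selected frames in temporal order.
    return sorted(chosen)


# Example with made-up detections for a single trajectory.
example = [(0, 0.90), (1, 0.95), (2, 0.50), (10, 0.70), (11, 0.60), (20, 0.30)]
print(select_keyframes(example, k=3, min_gap=5))  # prints [1, 10, 20]
```

In this sketch frame 0 is skipped despite its high score because it is too close to frame 1; the spacing keeps the cropped training images from being near-duplicates. The selected frames would then be cropped to the detected bounding box and saved, as described in the reply above.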