Event-AHU / EventVOT_Benchmark

[CVPR-2024] The First High Definition (HD) Event based Visual Object Tracking Benchmark Dataset

About the training of the teacher network #9

Closed: yuyangpoi closed this issue 2 months ago

yuyangpoi commented 2 months ago

Training the teacher network requires image frame data, but the EventVOT dataset does not include images. How should the teacher network be trained in this case?

wsasdsda commented 2 months ago

Hello, thank you for your question. When training on the EventVOT dataset, we used multi-view data as input. Specifically, we converted the original CSV data of EventVOT into event images and voxels to build the two-branch input of the teacher model.
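For readers who want to see what such a conversion might look like, here is a minimal sketch, not the authors' actual preprocessing code. It assumes each event has already been loaded from CSV into integer arrays `x`, `y`, a timestamp array `t`, and a polarity array `p` (the column layout of the EventVOT CSV files is not stated in this thread). The function names `events_to_image` and `events_to_voxel`, the two-channel count image, and the default of 5 temporal bins are all illustrative assumptions.

```python
import numpy as np


def events_to_image(x, y, p, height, width):
    """Accumulate events into a 2-channel count image.

    Channel 0 counts negative events, channel 1 counts positive events.
    This is one common event-image encoding, assumed here for illustration.
    """
    img = np.zeros((2, height, width), dtype=np.float32)
    pol = (np.asarray(p) > 0).astype(np.int64)
    # np.add.at handles repeated (pol, y, x) indices correctly.
    np.add.at(img, (pol, np.asarray(y), np.asarray(x)), 1.0)
    return img


def events_to_voxel(x, y, t, p, height, width, num_bins=5):
    """Bin events by timestamp into a (num_bins, H, W) voxel grid.

    Each cell holds the signed sum of polarities (+1 / -1) of the events
    that fall into that temporal bin and pixel.
    """
    voxel = np.zeros((num_bins, height, width), dtype=np.float32)
    t = np.asarray(t, dtype=np.float64)
    if t.size == 0:
        return voxel
    span = max(t.max() - t.min(), 1e-9)  # avoid division by zero
    bins = ((t - t.min()) / span * num_bins).astype(np.int64)
    bins = np.minimum(bins, num_bins - 1)  # last timestamp goes in last bin
    sign = np.where(np.asarray(p) > 0, 1.0, -1.0).astype(np.float32)
    np.add.at(voxel, (bins, np.asarray(y), np.asarray(x)), sign)
    return voxel


# Tiny synthetic example: four events on a 4x4 sensor.
x = np.array([0, 1, 1, 3])
y = np.array([0, 0, 0, 2])
t = np.array([0, 10, 20, 40])
p = np.array([1, 0, 1, 1])

event_image = events_to_image(x, y, p, height=4, width=4)
voxel_grid = events_to_voxel(x, y, t, p, height=4, width=4, num_bins=4)
```

In a two-branch setup like the one described, `event_image` would feed one branch and `voxel_grid` the other; the exact encodings and bin counts used by the authors may differ.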

yuyangpoi commented 2 months ago

Thank you for your response. So, when training the teacher network on the EventVOT dataset, the network uses two representations of the events as inputs instead of color image data plus event data. When training on other datasets that contain image frames, such as VisEvent, the network uses color image data and events as inputs, right?

wsasdsda commented 2 months ago

Yes, that's right!

yuyangpoi commented 2 months ago

Thanks!