yuyangpoi closed this issue 2 months ago
Training the teacher network requires image frame data, but the EventVOT dataset does not include images. How should the teacher network be trained in this case?
Hello, thank you for your question. When training on the EventVOT dataset, we used multi-view data as input. Specifically, we converted the original EventVOT CSV data into event images and voxels to build the two-branch inputs of the teacher model.
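For readers who want a concrete picture of the two event representations, here is a minimal sketch and not the repository's actual preprocessing code. It assumes each CSV row has already been loaded as `(x, y, polarity, timestamp)`, which may not match the real EventVOT column order. The function names `events_to_image` and `events_to_voxel`, the `num_bins` parameter, and the hard (non-interpolated) time binning are all illustrative choices.

```python
import numpy as np


def events_to_image(events, height, width):
    """Accumulate events into a 2-channel count image.

    Assumed layout (hypothetical): events[:, 0]=x, [:, 1]=y,
    [:, 2]=polarity, [:, 3]=timestamp.
    Channel 0 counts negative events, channel 1 counts positive events.
    """
    x = events[:, 0].astype(np.int64)
    y = events[:, 1].astype(np.int64)
    p = (events[:, 2] > 0).astype(np.int64)
    img = np.zeros((2, height, width), dtype=np.float32)
    # np.add.at handles repeated (channel, y, x) indices correctly.
    np.add.at(img, (p, y, x), 1.0)
    return img


def events_to_voxel(events, height, width, num_bins=5):
    """Build a voxel grid by splitting the time span into num_bins slices.

    Each event adds +1 (positive) or -1 (negative) to its time slice.
    """
    voxel = np.zeros((num_bins, height, width), dtype=np.float32)
    if len(events) == 0:
        return voxel
    x = events[:, 0].astype(np.int64)
    y = events[:, 1].astype(np.int64)
    pol = np.where(events[:, 2] > 0, 1.0, -1.0)
    t = events[:, 3].astype(np.float64)
    span = max(t.max() - t.min(), 1e-9)  # avoid division by zero
    bins = ((t - t.min()) / span * num_bins).astype(np.int64)
    bins = np.minimum(bins, num_bins - 1)  # last timestamp goes in last bin
    np.add.at(voxel, (bins, y, x), pol)
    return voxel


# Synthetic events standing in for rows loaded from a CSV file,
# e.g. with np.loadtxt(path, delimiter=",") if the file has no header.
demo_events = np.array([
    [0, 0, 1, 0.0],
    [1, 0, 0, 0.5],
    [1, 1, 1, 1.0],
    [1, 1, 1, 1.0],
])
demo_image = events_to_image(demo_events, height=2, width=2)
demo_voxel = events_to_voxel(demo_events, height=2, width=2, num_bins=2)
```

The two arrays would then feed the two branches of the teacher model in place of the RGB frame and event input used for datasets that include images.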
Thank you for your response. So, when training the teacher network on the EventVOT dataset, the network takes two representations of the events as inputs instead of color image data plus event data. When training on other datasets that contain image frames, such as VisEvent, the network takes color image data and events as inputs. Is that right?
Yes, that's right!
Thanks!