Closed: mujavidb closed this 9 months ago
Thank you for your interest in our work.
The AVA dataset has provided entity information for each face.
As for the Columbia dataset, you can refer to the 'track_shot' function at line 125 of the Columbia_test.py file.
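For reference, shot-level face trackers of this kind typically link per-frame detections into entity tracks by bounding-box overlap (IOU). Below is a minimal, illustrative sketch of that idea; the function names, threshold, and greedy matching here are assumptions for demonstration, not the repository's actual `track_shot` code:

```python
# Illustrative IOU-based face tracking: group per-frame face boxes into
# per-entity tracks. This is a sketch, not the repo's track_shot function.

def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / float(area_a + area_b - inter)

def track_faces(detections, iou_thresh=0.5):
    """Link detections across frames into tracks (one track per entity).

    detections: list over frames, each a list of (x1, y1, x2, y2) boxes.
    Returns a list of tracks; each track is a list of (frame_idx, box).
    """
    tracks = []   # all tracks created so far
    active = []   # indices into `tracks` that matched in the previous frame
    for f, boxes in enumerate(detections):
        matched = set()
        next_active = []
        for t in active:
            _, last_box = tracks[t][-1]
            # Greedily extend the track with the best-overlapping box.
            best, best_iou = None, iou_thresh
            for i, box in enumerate(boxes):
                if i in matched:
                    continue
                o = iou(last_box, box)
                if o > best_iou:
                    best, best_iou = i, o
            if best is not None:
                tracks[t].append((f, boxes[best]))
                matched.add(best)
                next_active.append(t)
        # Unmatched boxes start new tracks (a new entity entering the shot).
        for i, box in enumerate(boxes):
            if i not in matched:
                tracks.append([(f, box)])
                next_active.append(len(tracks) - 1)
        active = next_active
    return tracks
```

With two faces drifting slowly across frames, each face's boxes overlap heavily frame-to-frame, so the tracker emits two tracks, i.e. two entities.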
Very helpful, thanks.
I can see in your evaluation code that, for each detected entity in a video, you capture the face data for each frame and then pass these into the ASD detector on an entity-by-entity basis.
How are you categorising faces by entity? I can see that face detection is done with S3FD, but by the time `load_visual` is called the faces are already segmented by entity. So, at inference, each iteration of the data loader `val_loader` operates on a face-by-face level. It is unclear how the categorisation by entity is done in this example code.