Closed CHNxindong closed 7 months ago
Thank you for your attention. Our approach adopts frame-level training and a training-free strategy for video-level inference. The training batch size is therefore 4, i.e. 4 independent frames per training step, and frame-to-frame relationships are not considered during training. This is what makes training feasible on a single A100 40G GPU.
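For clarity, a minimal sketch of what frame-level training with a batch size of 4 could look like in PyTorch is shown below; `FrameDataset`, the stand-in model, and the L1 loss are illustrative assumptions and not the repository's actual classes.

```python
# Minimal sketch of frame-level training: every step draws a batch of 4
# independent frames, so no temporal modeling across frames is required.
# Dataset, model, and loss choices here are placeholders, not the paper's code.
import torch
from torch import nn
from torch.utils.data import Dataset, DataLoader


class FrameDataset(Dataset):
    """Treats each extracted video frame as an independent training sample."""

    def __init__(self, num_frames: int = 1000, image_size: int = 256):
        self.num_frames = num_frames
        self.image_size = image_size

    def __len__(self) -> int:
        return self.num_frames

    def __getitem__(self, idx: int):
        # Placeholder: a real loader would read frame `idx` and its target from disk.
        frame = torch.randn(3, self.image_size, self.image_size)
        target = torch.randn(3, self.image_size, self.image_size)
        return frame, target


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = nn.Sequential(  # stand-in for the actual per-frame network
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.Conv2d(16, 3, 3, padding=1)
).to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loader = DataLoader(FrameDataset(), batch_size=4, shuffle=True)  # 4 frames per step

for frames, targets in loader:
    frames, targets = frames.to(device), targets.to(device)
    loss = nn.functional.l1_loss(model(frames), targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```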
I see. Thanks for your kind response!
Hi authors, thanks for your great work. I have a question about the training requirements and duration:
The 'Implementation Details' section mentions that all experiments were conducted on a single A100 40G GPU. Could you confirm whether the training process, when applied to a dataset comprising 27 hours of data collected from four identities, requires only one A100 40G GPU, albeit with an extended duration of seven days?