whwu95 / Text4Vis

【AAAI'2023 & IJCV】Transferring Vision-Language Models for Visual Recognition: A Classifier Perspective
MIT License

Data prep and training time #2

Closed bashimr closed 1 year ago

bashimr commented 1 year ago

@whwu95

I have set up an 8-GPU instance on AWS and am trying to download the Kinetics400 dataset. I have a couple of questions:

1) Approximately how long does it take to prepare the dataset for training, i.e. extracting and resizing the frames?
2) Approximately how long will it take to train the model on the Kinetics400 dataset? The machine specs are given below.

| Compute | Value |
| --- | --- |
| vCPUs | 96 |
| Memory (GiB) | 384.0 |
| Memory per vCPU (GiB) | 4.0 |
| Physical Processor | Intel Xeon Family |
| Clock Speed (GHz) | 2.5 |
| CPU Architecture | x86_64 |
| GPU | 8 |
| GPU Architecture | NVIDIA T4 Tensor Core |
| Video Memory (GiB) | 128 |
| GPU Compute Capability | 7.5 |

Thanks,

Imran

whwu95 commented 1 year ago

Thank you for reaching out. Here is some information on both questions that might be helpful:

  1. On our machine with an Intel(R) Xeon(R) Gold 6271C CPU @ 2.60GHz, processing all K400 videos (training and validation sets totaling around 260k videos) and resizing them takes approximately 2-3 hours.

  2. On our 8-card A100 machine, training the ViT-B/16 8-frame model for 30 epochs takes about 4 hours.
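To make the two figures above easier to compare against other hardware, they can be turned into rough rates. The snippet below is only back-of-the-envelope arithmetic on the numbers quoted in this reply (about 260k videos in 2-3 hours; 30 epochs in about 4 hours). The variable and function names are illustrative, and the results describe the machine mentioned above, not the T4 instance from the question, which would likely be slower.

```python
# Rough rates derived from the figures quoted in the reply above.
NUM_VIDEOS = 260_000          # "around 260k videos" (train + val)
PREP_HOURS = (2, 3)           # "approximately 2-3 hours"
TRAIN_HOURS = 4               # "about 4 hours" on 8x A100
EPOCHS = 30


def videos_per_second(num_videos: int, hours: float) -> float:
    """Average preprocessing throughput for a given wall-clock time."""
    return num_videos / (hours * 3600)


fast = videos_per_second(NUM_VIDEOS, PREP_HOURS[0])   # 2-hour case
slow = videos_per_second(NUM_VIDEOS, PREP_HOURS[1])   # 3-hour case
minutes_per_epoch = TRAIN_HOURS * 60 / EPOCHS

print(f"preprocessing: {slow:.1f} to {fast:.1f} videos/s")
print(f"training: {minutes_per_epoch:.1f} min/epoch")
```

Timing one epoch, or preprocessing a small sample of videos, on your own instance and comparing against these rates should give a reasonable scaling factor for the full run.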

We hope this information helps.
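For readers preparing the frames themselves, here is a minimal sketch of parallel frame extraction and resizing with `ffmpeg`. It is not taken from the Text4Vis repository: the function names, the `img_%05d.jpg` naming pattern, the 256-pixel target height, and the worker count are all assumptions for illustration, and `ffmpeg` is assumed to be installed and on the `PATH`.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def build_ffmpeg_cmd(video: Path, out_dir: Path, height: int = 256) -> list[str]:
    """Build an ffmpeg command that dumps resized JPEG frames (hypothetical layout)."""
    return [
        "ffmpeg",
        "-loglevel", "error",
        "-i", str(video),
        # Resize to the given height; -2 keeps the aspect ratio with an even width.
        "-vf", f"scale=-2:{height}",
        "-q:v", "2",                       # high JPEG quality
        str(out_dir / "img_%05d.jpg"),     # assumed frame naming pattern
    ]


def extract_frames(video: Path, out_root: Path, height: int = 256) -> None:
    """Extract all frames of one video into out_root/<video stem>/."""
    out_dir = out_root / video.stem
    out_dir.mkdir(parents=True, exist_ok=True)
    subprocess.run(build_ffmpeg_cmd(video, out_dir, height), check=True)


def extract_all(video_dir: Path, out_root: Path, workers: int = 8) -> None:
    """Run extraction over every .mp4 in video_dir, several ffmpeg processes at a time."""
    videos = sorted(video_dir.glob("*.mp4"))
    # Threads suffice here: the heavy work happens in the ffmpeg subprocesses.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(lambda v: extract_frames(v, out_root, height=256), videos))
```

On a machine with many vCPUs, raising `workers` is the main lever for shortening preprocessing time, since each video is processed independently.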

bashimr commented 1 year ago

Thank you @whwu95