Victorwz / VaLM

VaLM: Visually-augmented Language Modeling. ICLR 2023.
https://openreview.net/forum?id=8IN-qLkl215

Training Resource Details #2

Open 1024er opened 1 year ago

1024er commented 1 year ago

Hi, I have two questions about training resources: (1) How much disk space is required to store the features of the 200M images? (2) How many hours does training take on 16 V100s?

Thank you

1024er commented 1 year ago

@Victorwz

Victorwz commented 1 year ago

Hi, thank you so much for your interest in our work. (1) The 768-dimensional feature vectors take up 274GB of disk space, and the trained faiss index takes up another 14GB. (2) Training takes about 6 days on 16 V100 GPUs.
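
For anyone budgeting resources for a similar setup, the numbers above can be sanity-checked with a rough back-of-the-envelope sketch. The image count (200M) comes from the question and the dimension (768) from the reply. The storage dtype is not stated in this thread, so the 2-byte and 4-byte widths below are assumptions, and the helper names are hypothetical. The reported 274GB remains the authoritative figure.

```python
# Rough resource estimate. The dtype widths are assumptions; the thread
# only states the dimension (768), the image count (200M), and the totals.

def feature_storage_bytes(num_vectors: int, dim: int, bytes_per_value: int) -> int:
    """Raw size of a dense feature matrix, ignoring any file-format overhead."""
    return num_vectors * dim * bytes_per_value


def gpu_hours(days: float, num_gpus: int) -> float:
    """Total GPU-hours for a run of `days` wall-clock days on `num_gpus` GPUs."""
    return days * 24 * num_gpus


NUM_IMAGES = 200_000_000  # from the question
DIM = 768                 # from the reply

for name, width in (("float16", 2), ("float32", 4)):  # assumed dtypes
    size = feature_storage_bytes(NUM_IMAGES, DIM, width)
    print(f"{name}: {size / 1e9:.1f} GB")

print(f"wall-clock hours: {6 * 24}")
print(f"GPU-hours: {gpu_hours(6, 16):.0f}")
```

Under these assumptions, 2-byte values give a raw size of the same order as the reported 274GB, while 4-byte values give roughly twice that. About 6 days corresponds to roughly 144 wall-clock hours, or about 2304 GPU-hours across 16 GPUs.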