microsoft / unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
https://aka.ms/GeneralAI

BEiT v2's self-supervised pre-training is slow #1368

Open zhanglaoban-kk opened 10 months ago

zhanglaoban-kk commented 10 months ago

I have 16,000 images in my unlabeled dataset and the batch_size is set to 32, yet it takes almost 20 minutes to train one epoch. What could be the reason for this?
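(For reference, a quick back-of-the-envelope sketch of what these numbers imply per optimizer step, using only the figures reported above; nothing here is measured independently:)

# Rough per-step time implied by the reported numbers (assumed, not measured).
num_images = 16_000      # unlabeled images in the dataset
batch_size = 32          # reported --batch_size
epoch_minutes = 20       # reported wall-clock time per epoch

steps_per_epoch = num_images // batch_size               # 500 steps
seconds_per_step = epoch_minutes * 60 / steps_per_epoch  # ~2.4 s per step
print(f"{steps_per_epoch} steps/epoch, ~{seconds_per_step:.1f} s/step")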

pengzhiliang commented 10 months ago

@zhanglaoban-kk Thanks for your interest.

Is 32 the total batch size? If not, what is the total batch size?

zhanglaoban-kk commented 10 months ago

Here are the parameters I set:

def get_args():
    parser = argparse.ArgumentParser('BEiT pre-training script', add_help=False)
    parser.add_argument('--batch_size', default=32, type=int)
    parser.add_argument('--epochs', default=500, type=int)
    parser.add_argument('--save_ckpt_freq', default=100, type=int)
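(A minimal sketch of how the total batch size is usually computed under data-parallel training, assuming --batch_size is per GPU; the variable names and the gradient-accumulation factor below are illustrative, not arguments of the BEiT script:)

# Illustrative only: total (effective) batch size under data-parallel training.
batch_size_per_gpu = 32   # the reported --batch_size
num_gpus = 1              # number of data-parallel processes (hypothetical here)
grad_accum_steps = 1      # gradient accumulation factor, if any (hypothetical)

total_batch_size = batch_size_per_gpu * num_gpus * grad_accum_steps
print(total_batch_size)   # 32 with a single GPU and no accumulation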

pengzhiliang commented 10 months ago

I see. How many GPU cards are you using?

zhanglaoban-kk commented 10 months ago

I only used one A40.

pengzhiliang commented 10 months ago

OK, we can provide a reference point: in our experiments, one epoch on ImageNet-1K (1.28M images) takes about 5 minutes with 16 V100s.
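(To put that reference point next to the numbers reported earlier in the thread, a rough per-GPU throughput comparison; all figures are taken from the comments above, nothing new is measured:)

# Per-GPU throughput implied by the two setups discussed in this thread.
# Reference run: ~1.28M images per epoch, ~5 minutes, 16 V100s.
ref_per_gpu = 1_280_000 / (5 * 60) / 16    # ~267 images/s per GPU

# Reported run: 16,000 images per epoch, ~20 minutes, one A40.
rep_per_gpu = 16_000 / (20 * 60) / 1       # ~13 images/s per GPU

print(f"reference ~{ref_per_gpu:.0f} img/s/GPU vs. reported ~{rep_per_gpu:.0f} img/s/GPU")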