h-zhao1997 / cobra

Cobra: Extending Mamba to Multi-modal Large Language Model for Efficient Inference

What is the minimum required GPU for training #5

Open nahidalam opened 7 months ago

nahidalam commented 7 months ago

What is the minimum GPU setup needed for training: 8x A100 (40GB) or 8x A100 (80GB)?

h-zhao1997 commented 7 months ago

Our experiments used 8x A100 80GB GPUs. To train on GPUs with less memory, you can lower --model.finetune_per_device_batch_size while leaving the overall batch size unchanged.
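
For context, this works because the effective (global) batch size is held constant: when each GPU processes a smaller per-device batch, the trainer makes up the difference by accumulating gradients over more steps before each optimizer update. Below is a minimal sketch of that arithmetic, assuming the trainer derives the accumulation steps from the global and per-device batch sizes; the global batch of 128 and the per-device values are illustrative assumptions, not numbers from the repo.

```python
# Hypothetical helper illustrating why lowering the per-device batch size
# does not change the effective batch: the trainer just accumulates
# gradients over more steps. Only --model.finetune_per_device_batch_size
# is taken from the comment above; everything else here is illustrative.

def grad_accumulation_steps(global_batch_size: int,
                            per_device_batch_size: int,
                            num_gpus: int) -> int:
    """Accumulation steps so num_gpus * per_device * steps == global batch."""
    per_step = per_device_batch_size * num_gpus
    assert global_batch_size % per_step == 0, "global batch must divide evenly"
    return global_batch_size // per_step

# 8x A100 80GB: a per-device batch of 16 fits -> no accumulation needed.
print(grad_accumulation_steps(128, 16, 8))  # 1

# 8x A100 40GB: halve the per-device batch to 8 -> 2 accumulation steps,
# same effective batch of 128, roughly half the peak activation memory.
print(grad_accumulation_steps(128, 8, 8))   # 2
```

If a per-device batch of 8 still does not fit on 40GB cards, the same division works for 4 or 2, at the cost of more accumulation steps (and thus more time) per optimizer update.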

YongLD commented 1 month ago

When using 8x A100 40GB, how should the batch size be set for the best performance? @h-zhao1997