Open WangLabTHU opened 4 months ago
How much GPU memory do the models "evo-1-8k-base" and "evo-1-131k-base" use? Could you consider two scenarios, inference and fine-tuning? Are there any GPUs you would recommend for running Evo?

We recommend A100s or H100s.

For inference, a single GPU is enough.

Finetuning is more complicated: it depends on the sequence length and how long you want to train. For sequences of 8k+ tokens, you will likely need 8+ GPUs plus model sharding to split the model across devices so it fits in memory. There is a somewhat steep learning curve for these large models, so we focused instead on making lots of web APIs and wrappers available. At some point we hope to support finetuning through Together infrastructure.
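For a rough sense of scale: Evo-1 has on the order of 7B parameters, so its memory footprint can be estimated from bytes per parameter. The sketch below is a back-of-the-envelope estimate, not a measurement — the bf16 weights and Adam-style optimizer are assumptions, and activations and the KV/state cache are excluded, so real usage will be noticeably higher (especially at long sequence lengths):

```python
def estimate_gpu_mem_gb(n_params_billion, bytes_per_param=2,
                        training=False, optimizer_bytes_per_param=12):
    """Rough GPU memory estimate in GB.

    Inference: weights only (bf16 = 2 bytes/param).
    Training:  weights + gradients + Adam optimizer state
               (fp32 master weights + two fp32 moments ~ 12 bytes/param).
    Activations and sequence-length-dependent caches are NOT included.
    """
    weights = n_params_billion * bytes_per_param
    if not training:
        return weights
    grads = n_params_billion * bytes_per_param
    opt = n_params_billion * optimizer_bytes_per_param
    return weights + grads + opt

# Assuming ~7B parameters for Evo-1:
print(estimate_gpu_mem_gb(7))                  # inference: ~14 GB
print(estimate_gpu_mem_gb(7, training=True))   # full finetune: ~112 GB
```

The ~14 GB inference figure explains why a single 40/80 GB A100 suffices, while the ~112 GB training total (before activations) is why full finetuning needs the model sharded across multiple GPUs, e.g. with FSDP or DeepSpeed ZeRO.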