rapidsai / deployment

RAPIDS Deployment Documentation
https://docs.rapids.ai/deployment/stable/

Update TCO benchmarks with A100s (or H100s) #280

Closed jacobtomlinson closed 11 months ago

jacobtomlinson commented 1 year ago

In #252 the TCO benchmarking notebook was added with results using V100 GPUs. We need to run the benchmarks again on an A100 instance and update the results. Even better, if we can get H100 instances we should use those, but I know capacity is super limited right now.

If A100/H100 capacity on AWS is hard to get in all regions, then we should switch to a different cloud that does have capacity (although we would also need to rerun the CPU benchmarks on a comparable instance on that cloud).

skirui-source commented 1 year ago

The latest benchmarks feature results from A100 GPUs (tested on an Azure VM).

skirui-source commented 11 months ago

@jacobtomlinson Should we close this issue? I have the latest A100 GPU benchmark results (Azure) along with results from the corresponding CPU instance.

jacobtomlinson commented 11 months ago

Sure!