Open austinmw opened 5 years ago
I can't find much info about EIA architecture. I'd like to know whether EIA accelerators use the Volta architecture and therefore have Tensor Cores to speed up FP16 operations. Should FP16 inference see speed improvements on EIAs?

EIA accelerators are just slices of P3 instances (eia1 accelerators) and G4 instances (eia2 accelerators), so eia1 accelerators use the Volta architecture while eia2 accelerators use the Turing architecture. Both generations have Tensor Cores, so FP16 should yield performance improvements with EIA, just as it does for inference on a full GPU. Lower precision is going to be faster as long as your model is converted properly.
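"Converted properly" mostly means casting every weight and activation to FP16, not just the inputs; a mixed FP32/FP16 operation silently promotes back to FP32 and you lose the speedup. A framework-agnostic sketch with NumPy (the toy layer shapes and names here are made up for illustration):

```python
import numpy as np

# Hypothetical toy "model": a single fully connected layer.
rng = np.random.default_rng(0)
weights_fp32 = rng.standard_normal((4, 2)).astype(np.float32)
x_fp32 = rng.standard_normal((1, 4)).astype(np.float32)

# Cast BOTH the weights and the input to FP16.
weights_fp16 = weights_fp32.astype(np.float16)
x_fp16 = x_fp32.astype(np.float16)

# Fully converted: the result stays in FP16.
y_fp16 = x_fp16 @ weights_fp16
print(y_fp16.dtype)  # float16

# Partially converted: NumPy's type promotion pulls the
# result back up to FP32, which is what a half-converted
# model does on the GPU as well.
y_mixed = x_fp32 @ weights_fp16
print(y_mixed.dtype)  # float32
```

In a real framework you would use its own conversion path (e.g. a whole-model half-precision cast) rather than casting arrays by hand, but the dtype-promotion pitfall is the same.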