basetenlabs / truss

The simplest way to serve AI/ML models in production
https://truss.baseten.co
MIT License

Error out locally when using FP8 quant with non-compatible hardware #1012

Open aspctu opened 3 months ago


:rocket: What

We want to error out locally, before the build starts, when a build requests FP8 quantization but the target hardware architecture does not support FP8.
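A minimal sketch of what such a local check could look like, assuming the quantization type and accelerator name are available as plain strings at config-validation time. The function name, exception class, and accelerator set below are hypothetical illustrations, not Truss's actual API. The accelerator set assumes FP8 is available on NVIDIA Hopper and Ada Lovelace GPUs and is not meant to be exhaustive.

```python
# Hypothetical sketch: names here are illustrative and not part of Truss.

# Assumption: FP8 is supported on Hopper (H100, H200) and Ada Lovelace (L4, L40S).
FP8_CAPABLE_ACCELERATORS = frozenset({"H100", "H200", "L4", "L40S"})


class UnsupportedQuantizationError(ValueError):
    """Raised when the requested quantization cannot run on the chosen hardware."""


def validate_quantization(quantization_type: str, accelerator: str) -> None:
    """Fail fast if an FP8 quantization mode is paired with a non-FP8 accelerator."""
    uses_fp8 = "fp8" in quantization_type.lower()
    if uses_fp8 and accelerator.upper() not in FP8_CAPABLE_ACCELERATORS:
        supported = ", ".join(sorted(FP8_CAPABLE_ACCELERATORS))
        raise UnsupportedQuantizationError(
            f"Quantization '{quantization_type}' requires FP8-capable hardware, "
            f"but accelerator '{accelerator}' was requested. "
            f"FP8-capable accelerators in this sketch: {supported}."
        )
```

Calling `validate_quantization("fp8", "A100")` would raise before any build work is done, while `validate_quantization("fp8", "H100")` would pass silently.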

:computer: How

:microscope: Testing