Open neurogen-dev opened 1 year ago
Thank you. That's something I would love to implement. However, I probably don't have a machine in my environment where I can try FP8, so it may take some time.
I have an RTX 4090, and if you need help testing (on Windows 11 or Ubuntu 22.04), I can help.
Is your feature request related to a problem? Please describe.
TensorRT 8.6.1 has been released and, judging by the changelog, it adds the ability to build engines with the FP8 flag. Since CUDA 12.1, Ada Lovelace GPUs also support FP8, as Hopper does. Being able to build with FP8 might give a good speedup on these GPUs.
Describe the solution you'd like
Build engine with FP8
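A rough sketch of what this could look like with the TensorRT Python API, assuming TensorRT 8.6 or newer, where `trt.BuilderFlag.FP8` is available. The helper names `supports_fp8` and `enable_fp8_if_supported` are hypothetical and not part of this project, and the compute capability threshold (8.9 for Ada Lovelace, 9.0 for Hopper) is my assumption for gating the flag. Setting the flag alone may not be enough to get FP8 layers in practice, so treat this as a starting point only.

```python
def supports_fp8(major: int, minor: int) -> bool:
    """Hypothetical check: True for compute capability 8.9 (Ada Lovelace) or newer, e.g. 9.0 (Hopper)."""
    return (major, minor) >= (8, 9)


def enable_fp8_if_supported(config, major: int, minor: int) -> bool:
    """Set the FP8 builder flag on a TensorRT IBuilderConfig when the GPU qualifies.

    Returns True if the flag was set, False if the GPU was judged too old.
    """
    if not supports_fp8(major, minor):
        return False
    # Imported lazily so the capability check can run without TensorRT installed.
    import tensorrt as trt
    config.set_flag(trt.BuilderFlag.FP8)
    return True
```

With a real builder, `config` would come from `builder.create_builder_config()`, and the compute capability could be read from the device, for example via `torch.cuda.get_device_capability()` if PyTorch is available.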
Describe alternatives you've considered
No response
Additional context
No response