NVIDIA / TransformerEngine

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization in both training and inference.
https://docs.nvidia.com/deeplearning/transformer-engine/user-guide/index.html
Apache License 2.0

Improve JAX build tool #942

Closed phu0ngng closed 1 week ago

phu0ngng commented 2 weeks ago

ksivaman commented 1 week ago

/te-ci pytorch

timmoon10 commented 1 week ago

I've cherry-picked this into release_v1.8 because moving the CMake build directory out of the root directory was causing our CI scripts to leak the build directory into our nightly containers. Putting the build directory back in the root directory fixes this problem.