NVIDIA / TransformerEngine

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization in both training and inference.
https://docs.nvidia.com/deeplearning/transformer-engine/user-guide/index.html
Apache License 2.0
1.85k stars, 309 forks

Link attention docs to the main docs and fix errors reported by Sphinx #1062

Closed: ptrendx closed this PR 2 months ago

ptrendx commented 2 months ago

Description

Linked the attention docs to the HTML docs and fixed the errors reported by Sphinx (cyclical imports, wrong indentation, wrong section names). Additionally, tried to make the attention docs render better in HTML; it turns out nbsphinx does not support many things that were used in that notebook, such as <br> inside a table.
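
For anyone hitting the same rendering issue, here is a rough sketch of how the `<br>`-inside-table case could be located in a notebook before building the docs. It is not part of this PR: the helper name `find_br_in_tables` is hypothetical, and it assumes the notebook is standard nbformat JSON and that table rows are Markdown lines starting with `|`.

```python
import json
import re

# Matches <br>, <br/>, <br /> in any letter case.
BR_TAG = re.compile(r"<br\s*/?>", re.IGNORECASE)


def find_br_in_tables(notebook_json: str):
    """Return (cell_index, line) pairs for Markdown table rows containing <br>.

    Hypothetical helper: a "table row" is assumed to be any line in a
    Markdown cell that starts with '|' after leading whitespace.
    """
    nb = json.loads(notebook_json)
    hits = []
    for idx, cell in enumerate(nb.get("cells", [])):
        if cell.get("cell_type") != "markdown":
            continue
        source = cell.get("source", "")
        # nbformat stores a cell's source as a string or a list of strings.
        text = "".join(source) if isinstance(source, list) else source
        for line in text.splitlines():
            if line.lstrip().startswith("|") and BR_TAG.search(line):
                hits.append((idx, line.strip()))
    return hits
```

Passing the contents of an `.ipynb` file (read as text) returns the offending rows, which can then be rewritten by hand into whatever form renders correctly.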

ptrendx commented 2 months ago

/te-ci