Closed xrsrke closed 7 months ago
This PR adds a single FP8 Linear layer that supports both the forward and backward passes. Follow-up PRs will enable end-to-end FP8 training.
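For readers unfamiliar with the idea, here is a rough, software-only sketch of what an FP8 linear layer's forward and backward passes compute. It is not the code in this PR: every name below (`round_to_fp8`, `quantize`, `fp8_matmul`, `linear_forward`, `linear_backward`) is hypothetical, FP8 rounding is simulated in plain Python rather than run on FP8 hardware, and the choices of per-tensor scaling, E4M3 for activations and weights, and E5M2 for gradients are assumptions based on common practice.

```python
import math

# (mantissa bits, minimum normal exponent, largest finite value)
E4M3 = (3, -6, 448.0)      # assumed here for activations and weights
E5M2 = (2, -14, 57344.0)   # assumed here for gradients

def round_to_fp8(x, fmt):
    """Round one float to the nearest value representable in `fmt`."""
    mant_bits, min_exp, max_val = fmt
    if x == 0.0:
        return 0.0
    x = max(-max_val, min(max_val, x))        # saturate instead of overflowing
    exp = max(math.frexp(x)[1] - 1, min_exp)  # exponent of the binade holding x
    step = 2.0 ** (exp - mant_bits)           # spacing of representable values
    return round(x / step) * step             # Python's round() ties to even

def quantize(mat, fmt):
    """Scale a matrix so its largest magnitude maps to max_val, then round."""
    amax = max(abs(v) for row in mat for v in row)
    scale = fmt[2] / amax if amax > 0 else 1.0
    return [[round_to_fp8(v * scale, fmt) for v in row] for row in mat], scale

def transpose(m):
    return [list(r) for r in zip(*m)]

def matmul(a, b):
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)] for row in a]

def fp8_matmul(a, fmt_a, b, fmt_b):
    """Quantize both operands, multiply, then undo the two scales."""
    qa, sa = quantize(a, fmt_a)
    qb, sb = quantize(b, fmt_b)
    return [[v / (sa * sb) for v in row] for row in matmul(qa, qb)]

def linear_forward(x, w):
    # y = x @ w^T with x: (batch, in) and w: (out, in)
    return fp8_matmul(x, E4M3, transpose(w), E4M3)

def linear_backward(x, w, grad_y):
    grad_x = fp8_matmul(grad_y, E5M2, w, E4M3)             # (batch, in)
    grad_w = fp8_matmul(transpose(grad_y), E5M2, x, E4M3)  # (out, in)
    return grad_x, grad_w
```

Calling `linear_forward(x, w)` on nested lists returns the output as a nested list; `linear_backward(x, w, grad_y)` returns the gradients with respect to the input and the weight. A real implementation would keep the operands in an FP8 dtype and accumulate the matrix product in higher precision on the accelerator, which this sketch only imitates.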