facebookincubator / AITemplate

AITemplate is a Python framework that renders neural networks into high-performance CUDA/HIP C++ code. It is specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.
Apache License 2.0
4.54k stars, 363 forks

AIT softmax fix #982

Closed: zoranzhao closed this 9 months ago

zoranzhao commented 9 months ago

Summary: For S387149

Differential Revision: D52530631

facebook-github-bot commented 9 months ago

This pull request was exported from Phabricator. Differential Revision: D52530631

facebook-github-bot commented 9 months ago

This pull request has been merged in facebookincubator/AITemplate@7bccbc9c959671e691e3eebc641bde8bbd405e6b.