microsoft / nnfusion

A flexible and efficient deep neural network (DNN) compiler that generates a high-performance executable from a DNN model description.
MIT License
952 stars 158 forks

General TopK Kernel for HLSL devices #428

Closed wenxcs closed 2 years ago

wenxcs commented 2 years ago

This branch still has some problems with cross-thread-group sync. The currently enabled test crashes very rarely.