coreylowman / dfdx

Deep learning in Rust, with shape checked tensors and neural networks
Other
1.72k stars, 99 forks

How about integrating CUDA kernel launching with Rust async? #801

Closed: npuichigo closed this issue 1 year ago

npuichigo commented 1 year ago

First of all, I am very excited to see such a project.

I'm a researcher in deep learning, but I also have some interest in HPC and async programming in C++. As far as I know, C++26 is pursuing a unified abstraction for asynchronous computation. Since CUDA kernels are launched asynchronously, they fit perfectly into this framework. (FYI: https://github.com/NVIDIA/stdexec/tree/main/examples/nvexec and https://www.hpcwire.com/2022/12/05/new-c-sender-library-enables-portable-asynchrony/)

I'm quite interested in trying this approach in a deep learning framework to further improve performance. Rust also has generic, native support for async, so maybe there's something that could be done here.

Thanks.

coreylowman commented 1 year ago

Yeah, this has been brought up before. For now, this is not a goal of this project. It would require an async CUDA library (cudarc has no async support), and then pretty much a full rewrite of everything to be async, so it's too much work at this point. It's a cool idea though!
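For anyone landing on this thread later, here is a rough sketch of what "a kernel launch as a Rust future" could look like in principle. It is not dfdx or cudarc code and calls no CUDA API: `launch_fake_kernel`, `KernelFuture`, and `block_on` are hypothetical names, and a background thread stands in for the GPU. The assumption is that a real binding would signal completion in a similar way (for example from a stream callback or by querying an event) and wake the stored waker.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};
use std::thread;
use std::time::Duration;

/// State shared between the "device" (a thread here) and the future.
struct Shared {
    result: Option<f32>,
    waker: Option<Waker>,
}

/// Hypothetical handle to an in-flight kernel; resolves to its result.
struct KernelFuture {
    shared: Arc<Mutex<Shared>>,
}

/// Hypothetical launch: returns immediately while the work runs elsewhere.
/// The spawned thread is only a stand-in for asynchronous GPU execution.
fn launch_fake_kernel(input: Vec<f32>) -> KernelFuture {
    let shared = Arc::new(Mutex::new(Shared { result: None, waker: None }));
    let device_side = Arc::clone(&shared);
    thread::spawn(move || {
        thread::sleep(Duration::from_millis(10)); // pretend to compute
        let sum: f32 = input.iter().sum();
        let mut state = device_side.lock().unwrap();
        state.result = Some(sum);
        // Notify whoever is awaiting, if a waker was registered.
        if let Some(waker) = state.waker.take() {
            waker.wake();
        }
    });
    KernelFuture { shared }
}

impl Future for KernelFuture {
    type Output = f32;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<f32> {
        let mut state = self.shared.lock().unwrap();
        match state.result.take() {
            Some(value) => Poll::Ready(value),
            None => {
                // Not finished yet: remember how to wake the caller.
                state.waker = Some(cx.waker().clone());
                Poll::Pending
            }
        }
    }
}

/// Waker that unparks the thread running `block_on`.
struct ThreadWaker(thread::Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Tiny single-future executor so the sketch needs no async runtime.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(value) => return value,
            Poll::Pending => thread::park(), // re-poll after any wake-up
        }
    }
}

/// Both "kernels" are launched before either is awaited, so they overlap.
async fn forward() -> f32 {
    let a = launch_fake_kernel(vec![1.0, 2.0, 3.0]);
    let b = launch_fake_kernel(vec![4.0, 5.0]);
    a.await + b.await
}
```

Calling `block_on(forward())` from a `main` function drives both launches to completion and returns the combined sum. The sketch also hints at why the maintainer calls this a near full rewrite: every tensor op would have to return a future like this instead of a finished value.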