xrsrke / pipegoose

Large scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)*
MIT License
76 stars 17 forks

Port cuda kernels #50

Open 3outeille opened 9 months ago

3outeille commented 9 months ago

@abourramouss @xrsrke

Here is the dummy template to port CUDA kernels. It's very naïve but should be okay for now. There is a script to set up cudatoolkit and such; it needs some adjustments to make it terminal-agnostic.