NVIDIA / apex

A PyTorch Extension: Tools for easy mixed precision and distributed training in PyTorch
BSD 3-Clause "New" or "Revised" License
8.42k stars 1.4k forks

Literature associated with fused_dense #1846

Open prmudgal opened 1 month ago

prmudgal commented 1 month ago

Are there any research articles that explain the theory behind fused_dense?

lix19937 commented 3 weeks ago

https://arxiv.org/abs/2104.08378
https://proceedings.neurips.cc/paper/2021/hash/6e8404c3b93a9527c8db241a1846599a-Abstract.html