Maratyszcza / NNPACK

Acceleration package for neural networks on multi-core CPUs
BSD 2-Clause "Simplified" License

AltiVec/PowerPC (OpenPOWER ISA 3.0B or greater) Acceleration Support #195

Open justinlynn opened 3 years ago

justinlynn commented 3 years ago

Some of the top 50 supercomputers (Summit et al.) run OpenPOWER ISA-compatible CPUs such as POWER9. Given that, and my personal desire to run inference and training on my own OpenPOWER-based systems, it would be extremely useful for NNPACK to support these massively multi-threaded CPUs (POWER9, for example, has 24 cores with 4 threads per core) and their extremely high memory bandwidth (200 GB/s+ per socket). Supporting this would require adding AltiVec-compatible implementations of the NNPACK algorithms. A first step might be to implement shims that provide the Intel SSE intrinsic primitives on top of AltiVec. I would be interested in doing this and then proceeding to a full implementation. Would you be willing to accept such additions into the project (assuming PPC support is also provided for the cpuinfo library per https://github.com/pytorch/cpuinfo/issues/2 )?
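
To make the shim idea concrete, here is a rough sketch of what such a layer might look like. It is only an illustration under stated assumptions: the `shim_*` names and the `shim_m128` type are hypothetical and are not part of NNPACK. The VSX path assumes a compiler that provides `vec_xl`, `vec_xst`, `vec_add`, and `vec_mul` through `<altivec.h>` (as GCC does on POWER9), and the scalar fallback exists only so the same code builds on other architectures.

```c
/* Hypothetical SSE-style shim layer; none of these names exist in NNPACK. */
#if defined(__VSX__)
#include <altivec.h>

/* Four packed floats in one vector register, standing in for __m128. */
typedef __vector float shim_m128;

/* Counterpart of _mm_loadu_ps: unaligned load of 4 floats. */
static inline shim_m128 shim_mm_loadu_ps(const float *p) {
    return vec_xl(0, p);
}

/* Counterpart of _mm_storeu_ps: unaligned store of 4 floats. */
static inline void shim_mm_storeu_ps(float *p, shim_m128 v) {
    vec_xst(v, 0, p);
}

/* Counterpart of _mm_add_ps: lane-wise addition. */
static inline shim_m128 shim_mm_add_ps(shim_m128 a, shim_m128 b) {
    return vec_add(a, b);
}

/* Counterpart of _mm_mul_ps: lane-wise multiplication. */
static inline shim_m128 shim_mm_mul_ps(shim_m128 a, shim_m128 b) {
    return vec_mul(a, b);
}

#else

/* Portable scalar fallback so the sketch compiles without VSX. */
typedef struct { float lane[4]; } shim_m128;

static inline shim_m128 shim_mm_loadu_ps(const float *p) {
    shim_m128 r;
    for (int i = 0; i < 4; i++) r.lane[i] = p[i];
    return r;
}

static inline void shim_mm_storeu_ps(float *p, shim_m128 v) {
    for (int i = 0; i < 4; i++) p[i] = v.lane[i];
}

static inline shim_m128 shim_mm_add_ps(shim_m128 a, shim_m128 b) {
    shim_m128 r;
    for (int i = 0; i < 4; i++) r.lane[i] = a.lane[i] + b.lane[i];
    return r;
}

static inline shim_m128 shim_mm_mul_ps(shim_m128 a, shim_m128 b) {
    shim_m128 r;
    for (int i = 0; i < 4; i++) r.lane[i] = a.lane[i] * b.lane[i];
    return r;
}

#endif

/* Example kernel written only against the shim: out[i] = a[i] * b[i] + c[i]
 * for exactly 4 elements. */
static inline void shim_mul_add4(const float *a, const float *b,
                                 const float *c, float *out) {
    shim_m128 va = shim_mm_loadu_ps(a);
    shim_m128 vb = shim_mm_loadu_ps(b);
    shim_m128 vc = shim_mm_loadu_ps(c);
    shim_mm_storeu_ps(out, shim_mm_add_ps(shim_mm_mul_ps(va, vb), vc));
}
```

Kernels written against the shim, like `shim_mul_add4`, would not need to change per architecture. As far as I know, recent GCC releases for powerpc64le also ship their own SSE compatibility headers (enabled with `-DNO_WARN_X86_INTRINSICS`), so a hand-written layer like this might only be needed to fill gaps or to tune hot paths.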

justinlynn commented 3 years ago

Also, I can provide ongoing test/development/continuous integration resources for NNPACK on several Raptor Computing Systems Talos II (IBM POWER9-based) systems I own and operate.