daphne-eu opened this issue 2 years ago
Motivation: Deep neural networks (DNNs), such as convolutional neural networks (CNNs), are widely used in machine learning. They consist of multiple layers, which consist of basic operations. DNN workloads are often executed on hardware accelerators (e.g., GPUs). In fact, DAPHNE offers many typical DNN operations and CUDA-based GPU kernels for them. Nevertheless, it would be helpful to have purely CPU-based implementations for the most important DNN operations, as they would allow users who don’t have a GPU set up to still play around with DNNs in DAPHNE (at least on small data).
Task: This task is to implement (in C++) kernels for at least the following typical DNN operations: convolution, pooling (max, avg) (already implemented), bias add, and batch normalization. These operations should be supported for both the forward pass and the backward pass. That is, kernels for the following DaphneIR ops are expected: `Conv2DForwardOp`, `Conv2DBackwardFilterOp`, `Conv2DBackwardDataOp`, `MaxPoolForwardOp`, `AvgPoolForwardOp`, `MaxPoolBackwardOp`, `AvgPoolBackwardOp`, `BiasAddForwardOp`, `BatchNorm2DForwardOp`, and `BatchNorm2DBackwardOp` (some of these operations still need to be added to DaphneIR).
The kernels should work on DAPHNE's `DenseMatrix` data type. For the data layout, the typical N x C\*H\*W format shall be used, i.e., each of the N rows in a matrix contains the data of one image. Each image has C channels (e.g., RGB) and a size of H x W pixels. In memory, all pixels of a channel are stored contiguously; within a channel, the data is stored in row-major format.
In addition to the kernels, new unit test cases should be added in `test/runtime/local/kernels/` (akin to the existing ones).
Optional extensions of the task (if desired and time is left):

- Test cases at the DaphneDSL script level (in `test/api/cli/`).

Hints:

- The expected DaphneIR ops are defined in `src/ir/daphneir/DaphneOps.td` (search for "Deep neural network").
- Some of these ops already have CUDA kernels (in `src/runtime/local/kernels/CUDA/`). For the others, take inspiration from the existing ones.
- New `.cpp`-files need to be registered in `test/CMakeLists.txt`.

A PR/commit solving this issue should also update the DaphneDSL built-in function reference (`doc/daphnedsl/Builtins.md`), which currently says "Note that most of these operations only have a CUDNN-based kernel for GPU execution at the moment."
In GitLab by @corepointer on Mar 18, 2022, 11:31
We have most of our neural network operators (convolutions et al.) implemented as calls to cuDNN. An initial (pooling) operator has been ported from SystemDS. The other operations need a C++ implementation (or porting), test cases, and ideally a comparison to the Java version.