hikettei / cl-waffe2

[Experimental] Graph and Tensor Abstraction for Deep Learning all in Common Lisp
https://hikettei.github.io/cl-waffe2/
MIT License

[WIP] Petalisp as a high-level IR? #150

Closed: hikettei closed this issue 1 month ago

hikettei commented 7 months ago

(This article is WIP)

An overview of Petalisp

https://github.com/marcoheisig/Petalisp/tree/master

(As far as I know) Petalisp is a DSL implemented in Common Lisp for generating parallelized array-processing code, providing:

RISC or CISC?

Deep Learning models are everywhere, but what about the technology behind them? Many deep learning frameworks are in development today, and there are DL compilers focused on efficient inference (or training). TVM could be a good option, but when you want to build a model for an arbitrary target environment, there are always compatibility issues (e.g. https://github.com/pytorch/pytorch/issues/49890, though that one is a PyTorch case).

Concretely speaking, it is possible to implement gemm for many devices (e.g. CPU, GPU, NEON, AVX, Metal, etc.) and many data types (e.g. uint8, int8, int16, ..., float16, bfloat16, float32, ...). But could it be easier?
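To see why this gets painful, here is a hypothetical sketch (not cl-waffe2's or Petalisp's actual API) of what hand-written dispatch ends up looking like: every device/dtype combination needs its own tuned kernel.

;; Hypothetical hand-written dispatch, for illustration only.
(defgeneric gemm (device dtype a b)
  (:documentation "Multiply matrices A and B on DEVICE with element type DTYPE."))

(defmethod gemm ((device (eql :cpu)) (dtype (eql :float32)) a b)
  (declare (ignore a b))
  (error "A hand-tuned AVX float32 kernel would live here."))

(defmethod gemm ((device (eql :metal)) (dtype (eql :float16)) a b)
  (declare (ignore a b))
  (error "A hand-tuned Metal float16 kernel would live here."))

;; ...and one more method for every remaining device/dtype pair.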

With Petalisp, a kernel like gemm is written once at a higher layer and can then run on various backends, instead of being reimplemented for each one (like a template).

;; Petalisp
(defun matrix-multiplication (A B)
  (lazy-reduce #'+
   (lazy #'*
    (lazy-reshape A (transform m n to n m 1))
    (lazy-reshape B (transform n k to n 1 k)))))
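For reference, Petalisp expressions are lazy: nothing runs until the graph is forced with compute. Below is a minimal usage sketch, assuming Petalisp is loaded and matrix-multiplication is defined as above; the expected result is my own calculation.

;; Petalisp (usage sketch)
(compute
 (matrix-multiplication
  #2A((1 2) (3 4))
  #2A((5 6) (7 8))))
;; should evaluate to #2A((19 22) (43 50))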

tinygrad expresses the same computation in much the same way:

# Tinygrad
c = (a.reshape(N, 1, N) * b.permute(1, 0).reshape(1, N, N)).sum(axis=2)

Users no longer need to worry about parallelization; they can just rely on the compiler.

If TVM were CISC, tinygrad would be RISC.

Why Petalisp is a good choice for replacing the cl-waffe2 compiler