hikettei / cl-waffe2

[Experimental] Graph and Tensor Abstraction for Deep Learning all in Common Lisp
https://hikettei.github.io/cl-waffe2/
MIT License

[Enhancement] Compiling time remains to be optimized. #4

Closed: hikettei closed this issue 1 year ago

hikettei commented 1 year ago

cl-waffe2 generates and compiles forward kernels on the fly, depending on the given tensors' dimensions and views. This approach lets me precompute multidimensional offsets and schedule multithreading in advance. However, this compilation never happens at the top level; it goes through the (compile nil ...) function, and about 80% of the total compiling time is spent on this kernel compilation (e.g. the expansion of SinNode).
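For illustration, here is a minimal sketch (not cl-waffe2's actual implementation) of what runtime kernel compilation through (compile nil ...) looks like; the lambda form stands in for what a node such as SinNode might expand into:

```lisp
;; Hypothetical sketch: compiling a kernel at runtime with CL:COMPILE.
;; The generated lambda bakes in the element count, which is why the
;; runtime compiler has to be invoked for each new shape/view combination.
(defun compile-sin-kernel (size)
  (compile nil
           `(lambda (out in)
              (declare (type (simple-array single-float (*)) out in)
                       (optimize (speed 3) (safety 0)))
              (dotimes (i ,size out)
                (setf (aref out i) (sin (aref in i)))))))
```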

For example, (!sin (!sin (!sin x))) uses exactly the same kernel code at each step, yet it is currently compiled three times. Therefore, the primary strategy for reducing compiling time is to reuse compiled kernels.
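One way to realize this reuse (a minimal sketch with hypothetical names, not the actual cache.lisp API) is to memoize compiled kernels under a key derived from the node type, the tensor shape, and the views:

```lisp
;; Hypothetical cache sketch: compiled kernels are looked up by
;; (node-type shape views); only a cache miss invokes the compiler.
(defvar *kernel-cache* (make-hash-table :test #'equal))

(defun find-or-compile-kernel (node-type shape views compile-fn)
  (let ((key (list node-type shape views)))
    (or (gethash key *kernel-cache*)
        (setf (gethash key *kernel-cache*)
              (funcall compile-fn)))))
```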

hikettei commented 1 year ago

With such caching, the compilation costs of (!sin x) and (!sin (!sin (!sin x))) should be the same, because every tensor involved has the same shape and the same views.

hikettei commented 1 year ago

Goal: compiling both forward and backward of a 3-layer MLP in well under 5 seconds (<< 5s).

With cache.lisp, compiling time is approximately:

O(the number of distinct kernel types used in the nodes)

while without cache.lisp it is:

O(the number of operations)
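Under the cache sketch above, repeated applications of the same node on tensors with identical shapes and views trigger exactly one compilation, which is where the O(number of distinct kernel types) estimate comes from:

```lisp
;; Illustration (using the hypothetical helpers above): 100 identical
;; operations cause a single kernel compilation.
(let ((compile-count 0))
  (dotimes (i 100)
    (find-or-compile-kernel 'sin-node '(128 128) nil
                            (lambda ()
                              (incf compile-count)
                              (compile-sin-kernel (* 128 128)))))
  compile-count) ;; => 1
```
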
hikettei commented 1 year ago

The latest pull request #10 solved this issue; however, compiling time still remains to be optimized, especially for backward compilation...

hikettei commented 1 year ago

The MLP compiling time is now well under 0.5s (<< 0.5s).