NeuraLegion / shainet

SHAInet - a pure Crystal machine learning library

Feedback on design #67

Closed · mratsim closed this issue 6 years ago

mratsim commented 6 years ago

Hello team,

Like you, I'm writing my own neural network library for a super niche language, Nim. I'm always interested in what devs of other niche languages are doing in the NN domain (D, Elm, Elixir, Rust, OCaml, Clojure ...), and I see that you are taking an approach completely different from most libraries and from the state of the art (TensorFlow, PyTorch, Caffe, MXNet), which I find interesting but also questionable.

Let's start with the interesting part.

I find your neurons/synapses approach interesting. Is there any research or documentation that highlights the benefits of this modeling? I know there was some research on modeling AI like a brain (visual cortex, sound, and memory as separate parts, a thinking part ...) that was completely put aside with the advance of gradient-descent techniques, but I have a hard time getting my hands on it.

In particular, I'm looking into the NEURON_TYPES = ["memory", "eraser", "amplifier", "fader", "sensor"] constant and the Synapse type: what future applications would be eased by that approach?

Questionable design

While I find the neurons/synapses approach interesting, I have several reservations about your current implementation; some are fixable, but others might require a complete rewrite:

  1. No matrix, ndarray, or tensor type. You probably want to define custom matrix types with common functions instead of having every layer define its own loops.

  2. Storing one neuron at a time is inefficient. I don't think the current architecture can scale to networks with millions of parameters. I don't know how Crystal classes work, but if they allocate on the heap, each access to a neuron will require a pointer dereference. The main bottleneck in NNs is memory access, and this layout makes it much worse. Furthermore, it cannot be mapped to BLAS, CUDA, or OpenCL for efficient computation.

  3. Synapse will be a performance bottleneck: the connections between 2 fully connected/dense/linear layers can be represented as a single matrix multiplication, both for the forward pass and for the gradient (see the sketch after this list).
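
To make points 1-3 concrete, here is a minimal sketch (the class and method names are hypothetical, not SHAInet's) of a dense layer whose weights live in one flat, contiguous array, so the forward pass becomes a single matrix-vector product instead of per-synapse pointer chasing. Contiguous storage is also what would let a BLAS gemv or a GPU kernel replace the inner loops later:

```crystal
# Hypothetical sketch, not SHAInet's API: a dense layer with weights in
# one flat row-major array, forward pass as a matrix-vector product.
class DenseLayer
  def initialize(@in_size : Int32, @out_size : Int32)
    @weights = Array(Float64).new(@in_size * @out_size) { rand - 0.5 }
    @biases  = Array(Float64).new(@out_size, 0.0)
  end

  # output[j] = biases[j] + sum over i of weights[j, i] * input[i]
  def forward(input : Array(Float64)) : Array(Float64)
    Array(Float64).new(@out_size) do |j|
      sum = @biases[j]
      @in_size.times { |i| sum += @weights[j * @in_size + i] * input[i] }
      sum
    end
  end
end

layer = DenseLayer.new(3, 2)
puts layer.forward([0.5, -1.0, 2.0])
```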

On the roadmap?

Depending on your use case (research vs. production), you might want to add a way to slice the data; a sketch of what I mean follows.
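
By "slice the data" I mean something like this hypothetical snippet (none of these names exist in SHAInet): shuffling a dataset, splitting it 80/20 into train/test portions, and walking the training set in mini-batches:

```crystal
# Hypothetical data-slicing sketch: shuffle, split, and mini-batch.
data = Array(Array(Float64)).new(100) { [rand, rand] }

shuffled = data.shuffle
cut      = (shuffled.size * 0.8).to_i
train    = shuffled[0, cut]
test     = shuffled[cut..]

train.each_slice(16) do |batch|
  # feed `batch` to the network here
  puts batch.size
end
```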

ArtLinkov commented 6 years ago

Hi @mratsim, thank you for taking the time to go over the code; it is very much appreciated.

First, a few words regarding SHAInet. It is a free-time project, happily hosted by NeuraLegion, that was created as part of some internal research. As mentioned, we started it with research in mind rather than production, and just kept going, thanks in part to members of the community. We decided to share it as an open-source project, in the hope it will grow into something useful.

Regarding the interesting parts: personally, I'm a biologist (M.Sc.) and wanted to try to bring some inspiration from the biological world into this project. In addition, we wanted to try an approach to NNs using object-oriented modeling instead of matrices. The main reason behind that was, as you keenly noticed, to try new types of neurons aiming for more robust learning (if possible), or at least to have more fine-grained control over the manipulation of each neuron (which is difficult using a matrix-driven approach).

A little about neuron types. In biological systems there are different types of neurons with different roles, which can interpret the same signals differently. A crude example: a hormone (a signaling molecule) can have different, even opposite, effects on different neurons (or, to be more accurate, on many other types of cells). Although we don't understand how the brain works, we do know that this variety is critical for the extremely complex functions the brain can perform. So, with that in mind, we thought to try and introduce similar logic into NNs, starting with the NEURON_TYPES you noticed.

Now, with those ideas in mind, the main thing we were trying to achieve was to increase internal layer complexity without the need for pre-design. For example, imagine that a layer could be constructed from multiple types of neurons in a random distribution within the layer, potentially providing more robust behavior than "specialized" layers (a sketch of the idea follows).
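
As an illustration only (this is not actual SHAInet code, and the per-type behavior is deliberately left out), constructing such a mixed layer might look like:

```crystal
# Illustrative sketch: a layer whose neurons are drawn at random from the
# available NEURON_TYPES, rather than a layer of one "specialized" type.
NEURON_TYPES = ["memory", "eraser", "amplifier", "fader", "sensor"]

class Neuron
  getter n_type : String

  def initialize(@n_type : String)
  end
end

mixed_layer = Array(Neuron).new(64) { Neuron.new(NEURON_TYPES.sample) }
puts mixed_layer.count { |n| n.n_type == "memory" }
```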

Regarding performance: we started with the assumption of a research project, so performance wasn't in mind. We quickly noticed, however, that with LLVM optimizations Crystal can run this design style faster than other CPU-bound NN libs such as FANN. For us this is more than enough right now. We do acknowledge that at some point we might hit "this is too slow for real-world usage", because there is no GPU support, which is why projects like jet have emerged.
We don't totally dismiss the idea of having GPU support in SHAInet, but due to our lack of matrix-based logic it might be hard to add.

Some more thoughts: regarding your suggestion of data slicing, we agree that a more established Data class is needed. We had some help with this from the community, but we do want to improve this point further; a rough sketch of the direction follows.
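
A minimal sketch of what such a Data class might offer (a hypothetical API, not what SHAInet currently ships): pairing inputs with labels and exposing slicing into train/test portions:

```crystal
# Hypothetical Data class: inputs paired with labels, with a split method.
class Data
  getter inputs : Array(Array(Float64))
  getter labels : Array(Array(Float64))

  def initialize(@inputs : Array(Array(Float64)), @labels : Array(Array(Float64)))
  end

  # Return two Data objects: the first `ratio` of samples for training,
  # the remainder for testing.
  def split(ratio : Float64) : {Data, Data}
    cut = (inputs.size * ratio).to_i
    {Data.new(inputs[0, cut], labels[0, cut]),
     Data.new(inputs[cut..], labels[cut..])}
  end
end
```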

As a final note, this being a "free time open source project", progress depends on our free time :) We try our best at improving and implementing, but we very much welcome any kind of help.

mratsim commented 6 years ago

I see.

If you are interested in biomimetic/neuromorphic neural networks (bio-based instead of matrix-based), you might find the publications of Patrick Pirim interesting; he has spent his whole career on bio-inspired computing and is researching a self-driving-car alternative that is not powered by deep learning.

I see your approach as micro-based while matrices are macro-based, i.e. introducing diversity in the neuron ecosystem at the level of individuals vs. at the level of species (layers).

Regarding FANN, I didn't know about it. Looking into its data structures, I'm fairly sure it has the same performance issues, as it maintains a list/vector of connected neurons for each neuron, and the number of connections grows quadratically: two fully connected layers of 1,000 neurons each already mean 1,000,000 connection entries.

I'm happy to discuss design and performance implications. Unfortunately, I can't contribute code, due to lack of time and Crystal not being a language I know at all.

mratsim commented 6 years ago

Since my questions were answered, I'm closing this. Thank you again.