cazala / synaptic

architecture-free neural network library for node.js and the browser
http://caza.la/synaptic

Discussion: Synaptic 2.x #140

Open Jabher opened 7 years ago

Jabher commented 7 years ago

So, we want to make Synaptic more mature. I've created a design draft to think on it. https://github.com/cazala/synaptic/wiki/Design-draft:-2.x-API

Let's discuss.

The most significant changes in design:

Jabher commented 7 years ago

Important question btw. Do we actually need Neuron as an exposed entity?

cazala commented 7 years ago

I guess that depends on the underlying implementation. Neuron and Layer can be replaced by just Networks that handle everything. The fact that Synaptic has Neurons is because when I first read Derek Monner's paper he described a generalized unit that could use the same algorithm and behave differently according just to its position in the topology of the network (i.e. a self-connected neuron acts as a memory cell, a neuron gating that self-connection acts as a forget gate, but the same neuron gating the input of the memory cell would act as an input gate filtering the noise; and all those neurons would essentially follow the same algorithm internally).

That's what I found really cool about that paper, and that's why I coded the Neuron first. Then the Layer is just an array of them, and a Network is an array of Layers. The advantage of having the individual units is that you can connect them in any way easily and try new topologies (like LSTM w/wo forget gates, w/wo peepholes, w/wo connections among the memory cells, etc).

But I know that the approach other NN libraries take is more like matrix math at the network level, instead of having individual units. This is probably way better for optimization/parallelization, so I'm up for it, as long as we can keep an easy and intuitive API that allows the user to create flexible/complex topologies.
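The generalized-unit idea can be sketched in a few lines of plain JS. This is an illustrative toy, not synaptic's actual implementation: every unit runs the same state update, and only the wiring (a self-connection, and optionally a unit gating it) changes what the unit does.

```javascript
// Toy version of the generalized unit from Monner's LSTM-g paper: one update
// rule for every unit; behavior depends only on wiring. Not synaptic's code.
function step(unit, input) {
  // The gain of the self-connection is the gater's activation (1 if ungated).
  const gain = unit.gater ? unit.gater.activation : 1;
  unit.state = gain * unit.selfWeight * unit.state + input;
  unit.activation = 1 / (1 + Math.exp(-unit.state)); // logistic squash
  return unit.activation;
}

// A self-connected unit acts as a memory cell: it retains its state.
const cell = { state: 0, activation: 0, selfWeight: 1, gater: null };
step(cell, 2); // state becomes 2
step(cell, 0); // state is still 2: the input was remembered

// Gating the self-connection with another unit turns it into a forget gate.
cell.gater = { activation: 0.5 };
step(cell, 0); // state decays to 1 (0.5 * 2)
```

The point is that `step` never changes: memory cell, forget gate, and plain feedforward unit differ only in topology.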

jocooler commented 7 years ago

I'm not totally clear on what it means to expose a neuron, but it was extremely important for my application to be able to clearly see the neurons (using toJSON). I trained the network using synaptic and implemented the results in another program (excel).

olehf commented 7 years ago

From my understanding it would be more important to expose the ability to override/specify neurons' activation functions than to have direct access to neurons. That way a developer can concentrate on implementing higher-level functionality, e.g. by stacking layers or networks and using access to neurons' activation functions to implement custom networks. That assumes the basic networks are implemented (convolutional, recurrent, Bayesian, etc.).

I would love to contribute to the new version if any additional help is still required.

Regards, Oleh Filipchuk


Jabher commented 7 years ago

@jocooler That's a great note. From my point of view, removing Neuron will significantly reduce memory consumption. However, this is a good reminder that we do need a human-readable export.

schang933 commented 7 years ago

Thanks for spearheading this thread @Jabher. From a user's perspective, I might add having improved documentation and examples. Especially since the site is down now (#141), it's harder to reference back to examples I've found in the past.

Specifically, I think for my use case an LSTM may be a more natural approach, but I am hesitant to test it because I have not trained one before and the mechanism for RNNs seems quite different. Having more than 1 example (preferably 3+) would help as users can pick the one that best matches their use-case. It might help to encourage users to contribute their own examples as well (maybe something I can do if I figure this out myself).

Another point should be on optimizations. I think a big reason people are using other libraries is due to limitations, especially in memory. It could help to have a short guide on the Wiki discussing how to set Node to allocate more memory before getting OOM, or using a mini batch approach for strategies that support it. Also, regarding exposing the Neuron, I may suggest something similar to compilers where the toJSON method can be human-friendly in debug mode and machine-friendly otherwise. I'm seeing my memory filled with Neurons when conducting a heap analysis.

Jabher commented 7 years ago

@olehf Convolutional layers (1d and 2d + max/avg pooling for them) are mentioned in the design document, same for RNNs - I think GRU and LSTM would be enough for the average user.

Activation functions should definitely be passable to any layer; the only issue we may encounter is custom implementations of them (as we will need multiple implementations of each function), so we should cover as much as possible in the design.

Help is (as usual) always appreciated. There is a lot of work to do, and any contribution will be significant. As soon as we decide that we're good with this design, a GH project board will be created with tasks to do.

Jabher commented 7 years ago

@schang933 there's a new temp website - http://caza.la/synaptic/. The old one will be back soon.

Speaking about examples - this is a good concern, but first we should implement the API itself. A good set of examples is https://github.com/fchollet/keras/tree/master/examples (did I mention that Keras is documented in a great way?).

Another point should be on optimizations. I think a big reason people are using other libraries is due to limitations, especially in memory. It could help to have a short guide on the Wiki discussing how to set Node to allocate more memory before getting OOM, or using a mini batch approach for strategies that support it.

That's totally correct, but we should keep in mind that Node.js is not the only runtime target. Multiple browsers (latest Edge, Firefox and Chrome) with WebGL/WebCL and workers are also runtime targets, and nodejs-chakra is also a preferable option as it looks like it does not have memory limitations at all (this should be checked).

About exposing Neurons while keeping human-readable output - I totally agree.

arqex commented 7 years ago

Hey, thanks for this great library! It is really nice that its development is still running.

I am not an ML expert, and synaptic has helped me so much to understand how neural networks work, thanks especially to the prebuilt architectures. If you are dropping them from the core repo, please move them to a separate one, maybe with the examples, because they make it really simple for ML non-experts to get initiated and running.

Cheers

hongbo-miao commented 7 years ago

Most JavaScript-related libraries are not actively maintained any more. Hope this one can keep going! Hope one day I'll have enough knowledge to contribute to this library. Cheers!

menduz commented 7 years ago
import {
  WebCL,
  AsmJS,
  WorkerAsmJS,
  CUDA,
  OpenCL,
  CPU
} from 'synaptic/optimizers';

I think those should be separate packages, in order to keep the original package as small and simple as possible.

yogevizhak commented 7 years ago

I agree, but there should be both options for easy installation. On Sep 25, 2016 16:10, "Agustin Mendez" notifications@github.com wrote:

import { WebCL, AsmJS, WorkerAsmJS, CUDA, OpenCL, CPU } from 'synaptic/optimizers';

I think those shall be separated packages. In order to keep the original package as small and simple as possible.

  • That will help us to create browser bundles.
  • Will reduce the scope of the main project, helping us to create specific tests for each optimiser.


Jabher commented 7 years ago

@menduz I'd suggest multiple modules + 1 meta-module ("synaptic" itself) to expose them all in that case?

Jabher commented 7 years ago

There's a nice line in TensorFlow docs: https://www.tensorflow.org/versions/r0.11/api_docs/index.html

Over time, we hope that the TensorFlow community will develop front ends for languages like Go, Java, JavaScript, Lua, R, and perhaps others. With SWIG, it's relatively easy to develop a TensorFlow interface for your favorite language.

For a start we actually need only a few functions; even an incomplete port of TF would not be hard to use, and it would deal with most of the server-side performance issues, so we would actually be able to train at good speed, I think.

yonatanmn commented 7 years ago
  1. I'd love to see reinforcement network implemented.
  2. More utils would be cool - I couldn't even find any npm module that normalizes numbers. JS is missing many important tools for working with data. In this area I could happily contribute. Some options:
normalizeNum(min, max, num) =>  0->1 // (curried?),
deNormalizeNum(min, max,  0->1) => num // (curried?), 
normalizeNumericalArray ([num]) => [0->1] // min and max from array
normalizeCategoricalArr([string]) => [0|1] //based on uniqueness. 
etc...
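A sketch of what such utils could look like, using the function names from the wish list above (the categorical encoding is one possible interpretation of "based on uniqueness": one-hot by unique value):

```javascript
// Possible implementations of the utils wished for above (illustrative only).
const normalizeNum = (min, max, num) => (num - min) / (max - min);
const deNormalizeNum = (min, max, t) => min + t * (max - min);

// min and max are taken from the array itself.
function normalizeNumericalArray(arr) {
  const min = Math.min(...arr);
  const max = Math.max(...arr);
  return arr.map(n => normalizeNum(min, max, n));
}

// One-hot encoding based on uniqueness: each value becomes a 0/1 vector.
function normalizeCategoricalArr(arr) {
  const categories = [...new Set(arr)];
  return arr.map(v => categories.map(c => (c === v ? 1 : 0)));
}

normalizeNum(0, 10, 2.5);                 // 0.25
deNormalizeNum(0, 10, 0.25);              // 2.5
normalizeNumericalArray([5, 10, 15]);     // [0, 0.5, 1]
normalizeCategoricalArr(['a', 'b', 'a']); // [[1, 0], [0, 1], [1, 0]]
```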
Jabher commented 7 years ago

Reinforcement learning sounds cool, but it can be a module on top of synaptic instead of a thing inside the core - possibly some additional package. See https://github.com/karpathy/convnetjs/blob/master/build/deepqlearn.js - it's not related to the network itself, it works on top of it.

Utils - we're planning to use Vectorious for matrix operations (https://github.com/mateogianolio/vectorious/wiki), which has a lot of nice functions (including vector normalization). For curried functions the Ramda (http://ramdajs.com/) functional programming lib can be used.

cusspvz commented 7 years ago

Just discovered the turbo.js project. Has anyone considered it as an optimizer for synaptic in the browser?

Jabher commented 7 years ago

@cusspvz I discovered it today too.

Cazala is playing with https://github.com/MaiaVictor/WebMonkeys now as a back-end for v2; this is similar, and he says it's better. I agree, as it supports back-end computations too.

rafis commented 7 years ago

Please take a look at the APIs of Torch and adnn. Also, as adnn is a competitor, take a look at whether it has better parts.

Jabher commented 7 years ago

@rafis speaking of popular libs - probably the best reference is Keras (as it provides the most consistent API); I've been investigating most of the popular libs for that.

But thanks a lot for adnn - that lib looks very interesting and can be investigated deeply.

corpr8 commented 7 years ago

GPU acceleration (gpu.js?); composite networks distributed across multiple hosts (mongo + socket.io?).

yonatanmn commented 7 years ago

Some method to modify the weights of connections: I need it for an evolution algorithm for NNs. The current solution is toJSON, manual change, fromJSON. Maybe Network should have a connections property pointing to the real connections, similar to the result of toJSON, and each Connection should have a method to update its weight manually.

As a general approach - expose all the internal logic through public methods. NNs have so many different use cases, and everyone needs their own configuration.
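As a stopgap, the toJSON / edit / fromJSON round-trip can be wrapped in a helper. The sketch below assumes a synaptic-1.x-style toJSON shape (a `connections` array whose entries have a `weight` field); the `rng` parameter is injectable so evolution runs stay reproducible. Treat the exact JSON shape as an assumption, since it may change in 2.x.

```javascript
// Mutate connection weights in a toJSON-style network object (assumed shape:
// { connections: [{ from, to, weight }, ...] }). rng is injectable for tests.
function mutateWeights(json, rate, amount, rng = Math.random) {
  for (const c of json.connections) {
    if (rng() < rate) {
      c.weight += (rng() * 2 - 1) * amount; // perturb by up to +/- amount
    }
  }
  return json;
}

// Usage against a real network would be roughly:
//   const json = network.toJSON();
//   mutateWeights(json, 0.1, 0.05);
//   const mutated = Network.fromJSON(json);
```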

cazala commented 7 years ago

hey guys, just to let you know, I'm playing in this repo trying to build something similar to what we have in the design draft, taking all the comments here into consideration. Feedback, comments, and critiques are more than welcome. This is a WIP so don't expect it to work yet :P

cusspvz commented 7 years ago

Nice work @cazala!! Are you expecting to have everything in the Engine so you can have better management of the import/export side?

I've built something similar from scratch and opted to keep the neurons external so I can have various types of neurons. Do you think you can have more than the "bias" processor neuron in this?

EDIT: Also, I think it is important to define an API, or at least give some implementation room, for plugins. It would allow a community to grow underneath the synaptic one.

cazala commented 7 years ago

Thank you @cusspvz :)

  1. What do you mean by more than the bias neuron? The algorithm I'm using for synaptic (LSTM-g) defines all the units equally; what changes the role of the unit/neuron is the way it connects to the others. I.e. a neuron can be used as a gate, in which case its activation will be used as the gain of the connection it is gating. Or a neuron can connect to itself to act as a memory cell. Or neurons can be connected in layers to form a feedforward net. Or a neuron can have a fixed activation of 1, be connected to all the other units, and act as a bias unit. Each unit/neuron can have its own activation function, independently of the others. If you look at the examples in the layers directory you will see there's no math, only topology. Each layer is just a different way to connect neurons; there's no definition of how to do the math. The advantage of this is that we have a single algorithm for all the different topologies, and this can be isolated into the backend and then heavily optimized, or ported to different platforms, instead of having to do that for each layer individually. The role of the engine is just to hold the values of the weights, the states, the traces, and the relationships between units (how units are connected or gated, and which units are inputs or projections of other units). It serves two purposes: first, to be storable/clonable/transportable, since it's just plain objects and arrays, and it's the minimum information required to reproduce the network elsewhere; second, to give the Backend everything it needs already served, so the Backend can focus only on doing the calculations and updating the values in the Engine.
  2. I would love to make synaptic as extensible as possible. Do you have examples of these kinds of plugins or APIs?
cusspvz commented 7 years ago

1.

The advantage of this is that we have a single algorithm for all the different topologies, and this can be isolated into the backend, and be then heavily optimized, or be ported to different platforms, instead of having to do that for each layer individually

Awesome. I understood your structure just by looking at the code; it seemed very clever when I saw it!

I've also seen that you're introducing flow layers for better piping, which is awesome for assisted neural networks, but I can't see how it might help building non-assisted ones.

A brief story to explain what I mean with "processing neurons":

I see myself as an explorer, in fact, I've been self-educating Machine Learning for a while and I came up with the Liquid-State Neural Network Architecture before having the knowledge about it.

On the first versions of my architecture, the one that was like the Liquid-State NNA, I had what I call "bias/gate/weight based neurons" working slightly differently to include neuro-evolution in a different way from what people are doing. Each one had a lifespan that would be increased on each activation; once a neuron was dead, the network would notice it was ticking with fewer neurons and would compensate with a randomly placed one. That feature allowed me to control the "forget" aspect by passing parameters to the neural network. I hooked up the IO to my mouse cursor and it was capable of replicating my movements (without assistive training).

Note: at this point the network was already processing asynchronously, so it didn't need inputs to do something; a simple neuron change could trigger activations through the net up to an Output neuron.

It worked great for directions and images but not so well for sound patterns, so I changed the network again and added more neurons:

All of this work is, by now, private and personal, but I would like to contribute or share if it could help the developments of the synaptic v2.

I could see some of these things fitting into the "Layers" structure, but I have some doubts related to it, such as:

2.

second, to give the Backend everything it needs already served, so the Backend can focus only on doing the calculations and updating the values in the Engine. do you have examples of these kind of plugins or APIs?

a) I really like the way "babel" works out of the box using its name/prefix priorities; it could be an idea to use with the "backend", "layer" and so on, like:

new Network({
  // under the hood would call 'synaptic-backend-nvidia-cuda'
  backend: 'nvidia-cuda'
})

new Network({
  // under the hood would call './backends/gpu.js'
  backend: 'gpu'
})

new Network({
  // under the hood would call './backends/web-worker.js'
  backend: 'web-worker'
})

It should be easy to implement. Note: I had attached a code snippet here, but I've since removed it and placed it as a gist.
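The name-resolution idea can be sketched like this (package and path names here are hypothetical, mirroring the examples above):

```javascript
// Babel-style backend lookup: built-in names map to local files, anything
// else to an external 'synaptic-backend-*' package. All names hypothetical.
const BUILTIN_BACKENDS = ['cpu', 'gpu', 'web-worker'];

function resolveBackend(name) {
  return BUILTIN_BACKENDS.includes(name)
    ? `./backends/${name}.js`     // bundled with synaptic
    : `synaptic-backend-${name}`; // community package, e.g. synaptic-backend-nvidia-cuda
}

resolveBackend('gpu');         // './backends/gpu.js'
resolveBackend('nvidia-cuda'); // 'synaptic-backend-nvidia-cuda'
```

The nice property of the prefix convention is that community backends need no registration step: publishing a package with the right name is enough.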

b) We must have a stable API for backends and layers before the first release, which means we must think on one by now.

c) When you say "async", does it allow the network to trigger neurons as a callback (multiple times) or only as a promise (where it just counts once when resolved)?

Edit: Thanks for your time! :)

Jabher commented 7 years ago

@cusspvz Thanks for such great feedback! And yes, actual help will be greatly appreciated. There's a lot of routine work coming (lots of layers, lots of backends), and any help will be great.

Speaking of what you're proposing:

  1. The layer-based design is actually universal - we're abstracting groups of neurons (with more or less complicated design) into one meta-neuron with an array of outputs. On the other hand, a specially-designed Layer with the same external API can be implemented for the purposes of a liquid state machine, or an RNN, or a convnet, or computation nodes.

  2. a) Actually that's already been discussed; for now the API is something like


import {
  AsmJS,
  TensorFlow,
  WorkerAsmJS,
  WebCL,
} from 'synaptic/optimizers';
...
await train_network.optimize(new WorkerAsmJS());

Suggested back-ends are TensorFlow via C++ bindings (as it supports GPGPU, multiple CPU tricks and so on), an AsmJS-compiled math engine (concurrent via web workers, plus a same-thread implementation), a raw JS engine, and WebCL (possibly via WebMonkeys).

b) Agree.

c) The promise way, probably. The thing is that computations will probably be working in an async way, so every math operation will turn into an asynchronous one.

oxygen commented 7 years ago

I'm a noob at neural networks and haven't used yours, so excuse any insults or me completely missing the boat :)

If you want to remove the Neuron to minimise memory consumption, maybe you can replace it with a lazy Neuron interface/class which would read lazily (when needed) from the better-packed representation, and could also allow you to modify all neurons at once (or maybe even individually?).
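One way to sketch that idea: keep all the values in one packed typed array and hand out lightweight views that read and write through to it on demand. Names here are illustrative, not part of any synaptic API:

```javascript
// A lazy Neuron "view" over a packed buffer: nothing is copied, and writes
// go straight through to the shared representation. Illustrative only.
class NeuronView {
  constructor(buffer, index) {
    this.buffer = buffer; // packed Float64Array holding all neurons
    this.index = index;   // this neuron's slot in the buffer
  }
  get bias() { return this.buffer[this.index]; }
  set bias(value) { this.buffer[this.index] = value; }
}

const packed = new Float64Array([0.1, 0.2, 0.3]); // one bias per neuron
const neuron = new NeuronView(packed, 1);
neuron.bias = 0.5;
// packed[1] is now 0.5: the view wrote through to the packed storage
```

Views like this cost a couple of object fields each and can be created on demand, so the memory savings of the packed representation are kept.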

menduz commented 7 years ago

@oxygens Thanks - neurons don't even exist in the new draft!

Michael-Naguib commented 7 years ago

Thanks for an amazing library in JS!

In the upcoming 2.x API, would it be possible to implement a means of converting the human-readable JSON to the HDF5 format, so that it would be easy to import/export pre-trained nets from other libraries which use this format?

For example, I believe it might then be possible to import Google's powerful Inception net weights.

Thanks :)

alexwaeseperlman commented 7 years ago

I think the Architect should still be there, as it allows you to easily create mockups and lets beginners write neural networks without having to understand the complicated underlying mechanisms.

dburner commented 7 years ago

Regarding the API:

Regarding the implementation of the multiple backends I think Keras found a pretty nice way to deal with this, may be worth to check that out.

ghost commented 7 years ago

Just read about HTM (Hierarchical temporal memory). Impressive concept; overall it looks very sensible.

How would the experts here judge HTM compared to LSTM-G?

HTM seems to dynamically grow the network based on learning results. Can we mimic a similar concept with Synaptic?

snowfrogdev commented 7 years ago

How about a way to automatically normalize/standardize input/output data?

alexwaeseperlman commented 7 years ago

What about convolutional nets? They have a lot of hype and I'm nowhere near advanced enough to implement them.

ghost commented 7 years ago

Normalization: yes, but we need to be able to override the minimum/maximum range to allow for future data which may be out of range. Also, depending on your data you may have to apply filters like EMA/Kalman for temporal data (like Forex) to boost your prediction performance (https://www.ncbi.nlm.nih.gov/pubmed/12628609). Just normalizing your raw data will lead to poor results, and once your latest data gets out of range the predictions are useless.
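A fixed-range normalizer with clamping, along the lines described above, can be sketched like this (the range is chosen up front instead of derived from the data; names are illustrative):

```javascript
// Normalizer with an explicitly overridden min/max, clamping future
// out-of-range samples into [0, 1] instead of breaking the scale.
function makeNormalizer(min, max) {
  return x => Math.min(1, Math.max(0, (x - min) / (max - min)));
}

const norm = makeNormalizer(0, 100); // range fixed up front
norm(50);  // 0.5
norm(120); // 1: clamped instead of exceeding the trained range
norm(-10); // 0: clamped on the low end too
```

Clamping keeps predictions defined for out-of-range inputs, though as noted above it cannot fix the underlying drift; filtering or re-fitting the range is still needed for non-stationary data.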

Bondifrench commented 7 years ago

First I would like to congratulate you on such a great library. I enjoy coding in JavaScript and believe there is great potential for applying it to machine learning. I am new to SynapticJs, so forgive me if my questions appear to be noobie ones.

While neuron.js makes it very readable to understand the different steps involved in building a network, for performance reasons wouldn't it be better to take advantage of matrix calculations?

I guess there would be a cost to transform from the Network/JSON view to the matrix one, but depending on the size of the network and the amount of calculation, it could be advantageous for faster computations.

I would like to point out a few libraries that were developed for fast array/matrix computations in JS:

Looking forward to seeing the progress on Synaptic, cheers.

Bondifrench commented 7 years ago

You might want to look at the MILJS project, which is backed by academic research from the University of Tokyo; they have a few open source libraries:

ethanwillis commented 7 years ago

I second the MILJS project. I've interacted with the developers behind it some and they are fast to respond.

Jabher commented 7 years ago

Cazala is in contact with them. However, due to, emm, architecture reasons, we cannot simply copy parts of their code into S2. We'll probably be utilizing weblas for matrix computations (at least partially).

wagenaartje commented 7 years ago

@yonatanmn take a look at Gynaptic for evolutionary algorithm addons for Synaptic :)

Bondifrench commented 7 years ago

Any update on progress?

Wanted also to point to another ML library, MLJS, which seems to be pretty modular and not focused only on neural networks - you can also do PCA, for instance.

frabaglia commented 7 years ago

Hi everyone.

First of all, I want to congratulate you on this amazing project. Last year was very busy for me, but now I want to go deep into using and contributing to this project, giving a hand wherever possible.

Is there a clear state of the project somewhere? I can't find a todo list, and it's a bit hard to understand where the project is nowadays. cc @cazala @Jabher

Jabher commented 7 years ago

Hi, @FrancooHM .

It's mostly me researching everything now. It's not about todos, it's about attempting to implement.

I've spent, idk, 1.5 months trying to make a workaround to use concurrency, but it is a dead end as it consumes a crazy amount of RAM: e.g. if you need to allocate weights on an 8-core computer, you will need to allocate 8x the weights you have. SharedArrayBuffer will come to browsers soon (it's already in the spec), so it will solve this problem.
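The SharedArrayBuffer point can be illustrated without spinning up workers: two typed-array views over one shared buffer see the same memory, so in a real setup each worker could view the weights without its own copy. A minimal sketch:

```javascript
// One shared allocation for the weights; every view (in a real setup, one per
// worker) reads and writes the same memory, avoiding the 8x duplication.
const N_WEIGHTS = 4;
const shared = new SharedArrayBuffer(N_WEIGHTS * Float64Array.BYTES_PER_ELEMENT);

const viewA = new Float64Array(shared); // would live in worker A
const viewB = new Float64Array(shared); // would live in worker B

viewA[0] = 0.42;
// viewB[0] is now 0.42 as well: the write is visible through every view
```

In a real multi-worker setup, synchronization (e.g. Atomics) would still be needed to coordinate concurrent writes.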

I already have a working computation runner (similar to TensorFlow), and I was going to build some demo API over it. It's here: https://github.com/synapticjs/florida. It is neither fast nor powerful now, but it is extensible, customizable, and very similar to TF. Lots of modifications will happen, but it is now possible to do a tick-tock strategy, where you first tune the high-level API to fit the low-level nicely, then the low-level to make the high-level API work better, and so you are able to evolutionarily arrive at something that works great.

The idea is that the nice API (and some automatic differentiation magic) will be kept separate from the actual performance and low-level stuff. I'll try to create a simple demo with it next week.

Jabher commented 7 years ago

BTW, a nice example of why it is important to separate compile from run. V8 has a JIT inside it, so even after the first run the built function is much faster - a function this simple becomes 10 times faster. The actual performance boost will be smaller in practice, but it will still have a noticeable impact.

const x = new Tensor([1, 1]);
console.time('compile');
const runFn = operations.compileRun({
    returns: [x],
    accepts: [x],
});
console.timeEnd('compile');//=> compile: 0.968ms

console.time('run1');
expect(runFn({
    values: [[1]],
})[0].data[0]).to.equal(1);
console.timeEnd('run1');//=> run1: 1.991ms

console.time('run2');
expect(runFn({
    values: [[2]],
})[0].data[0]).to.equal(2);
console.timeEnd('run2');//=> run2: 0.163ms
basickarl commented 7 years ago

BETTER DOCUMENTATION.

wagenaartje commented 7 years ago

@basickarl I see you have made a lot of posts in the issues of Synaptic, and I think your concerns are understood; the documentation is indeed a bit poor and should be updated/improved. I might do that sometime next week.

But please remember that the wiki is maintained by the community (and not just the author), of which you are part as well ;)

basickarl commented 7 years ago

@wagenaartje I do hope my point has been taken, as it would be a pity for such a library to be dissed due to poor documentation (if I didn't care I wouldn't post ^^).

frabaglia commented 7 years ago

Hi again @Jabher, thanks for your time.

1) After reviewing all this info and the new draft again, I can say that I feel very comfortable, as this new version will work on top of other very powerful and waterproof backends like TF. In fact I think that this is maybe the best feature, and an opportunity to let JS users start getting really involved in the NN world, without trying to reinvent the wheel.

2) I get what you are doing, but I can't see the really big picture. As I understand it, you have some definitions that are in this draft, but in practice you are doing these small experiments with concurrency and this computation runner.

I imagine that you have goals and concepts you want to prove before starting development, but I can't see them... I think this should be stated clearly somewhere, to make progress more visible and to let the community get involved in this work.

Thanks again for your time, and I hope this feedback is useful for you.

Jabher commented 7 years ago

@FrancooHM a disclaimer: the API draft was defined earlier; it is partially out of date, as we've discovered some non-resolvable problems and come to understand some underlying concepts better. But we were not ready to release a new version of the draft to implement. I think @cazala and I will be able to produce a new API soon.

It was mostly research work previously - trying to understand the mistakes made by all the JS NN frameworks, trying to figure out how to make it extensible, fast, and simple - and this work is mostly solitary: discussions were happening, but everyone was trying to do it on their own. IDK about @cazala, but I've made around 20 experimental frameworks and rejected them at different stages. Rejecting them while working with an OSS team is quite a complicated thing, as at some moment you have to say "guys, we were going the wrong way, let's delete everything and move on to the next ideas to try".

We needed some solid ground to make everything work, and Florida (the internal engine) looks like the one to use - it's dead simple, decomposable into a computation module and a computation-graph-building API, and looks like it fits all of our requirements.

About Florida, one more time: at the moment of design we were looking mostly at Keras, which is (or at least looks like) a high-quality piece of code. And that was achieved mostly because they decoupled from the computations, which are handled by Theano and TF. So the only sane way to implement a good JS framework was to build some analog of TF, which is a computation runner at its core. That's why Florida was implemented. It is built strictly to the API draft, and every code part is checked for compatibility with the higher-level APIs to be provided later.