cazala / synaptic

architecture-free neural network library for node.js and the browser
http://caza.la/synaptic

Pipeable interfaces (WIP) #145

Open · menduz opened this issue 7 years ago

menduz commented 7 years ago
let midiReader = new SynapticMidi.Reader(midiInput);
let midiWriter = new SynapticMidi.Writer(midiOutput);

let network = new Synaptic.Network( /* ... network description ... */);

let networkInputStream = network.getInputStream();
let networkOutputStream = network.getOutputStream();

// feed MIDI data into the network, and write its activations back out
midiReader.pipe(networkInputStream);
networkOutputStream.pipe(midiWriter);

// debug: pipe the network itself into a console monitor
let monitor = new ConsoleStreamMonitor();
network.pipe(monitor);

Synaptic networks (both pure and optimised ones) should expose two methods: getInputStream() and getOutputStream().

The returned objects MUST hold a reference to the source network, in order to keep track of it and read its options. That property is called _synaptic_network.

let network = new Synaptic.Network( /* ... network description ... */);

let networkInputStream = network.getInputStream();
let networkOutputStream = network.getOutputStream();

assert.equal(network, networkInputStream._synaptic_network);
assert.equal(network, networkOutputStream._synaptic_network);

Training

The trainer must also implement these methods:

abstract class StreamTrainer {
  _synaptic_network: Synaptic.Network;
  inputStream: Synaptic.Stream;
  outputStream: Synaptic.Stream;
  abstract train(options: ITrainerOptions): Promise<boolean>;
}

let midiInput = new SynapticMidi.Reader(midiInputFile);
let midiTarget = new SynapticMidi.Reader(midiOutputFile);

let network = new Synaptic.Network( /* ... network description ... */);

let trainer = new StreamTrainer({
  network,
  inputStream: midiInput,
  outputStream: midiTarget
});

// neither inputStream nor outputStream is writable until .train() is called

let success = await trainer.train({
  rate: .1,
  error: .005,
  timeout: 10000
});

if (success) {
  // once trained, wire the network between live MIDI streams
  let midiReader = new SynapticMidi.Reader(midiInputFile);
  let midiWriter = new SynapticMidi.Writer(/* ... MIDI destination ... */);

  let networkInputStream = network.getInputStream();
  let networkOutputStream = network.getOutputStream();

  midiReader.pipe(networkInputStream);
  networkOutputStream.pipe(midiWriter);
}
Jabher commented 7 years ago

I can see the use-case you're talking about, but are you sure this should go into the core? I cannot imagine many use-cases where we have an actual stream (not an emulated one) feeding the network with data at a suitable speed. I can only recall FS streams and HTTP streams as native ones, and speaking of FS we should not forget that they are not actual streams but slightly emulated ones (at the C++ level) that keep everything in memory by default. Two more candidates could be WebRTC and the Web Audio API, and with those we will probably run into a backpressure issue.

For most use-cases the end user will probably run into issues with data splitting and mutation (both for HTTP and FS): the stream will produce a chunk that is not an actual activation sequence but something split at a random point.
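
To make the splitting problem concrete, here is a minimal, hypothetical sketch (FrameSplitter is invented for illustration, not part of synaptic; Node's Transform stream is assumed) of an adapter that buffers raw bytes and only emits complete fixed-size float frames, so the network never sees a vector cut at a random byte boundary:

// Hypothetical sketch - FrameSplitter does not exist in synaptic.
// It buffers incoming bytes and only pushes complete frames of
// `frameSize` float32 values, so chunk boundaries never split a vector.
import { Transform, TransformCallback } from 'stream';

class FrameSplitter extends Transform {
  private pending = Buffer.alloc(0);

  constructor(private frameSize: number) {
    super({ readableObjectMode: true }); // emit one number[] per frame
  }

  _transform(chunk: Buffer, _enc: BufferEncoding, done: TransformCallback) {
    this.pending = Buffer.concat([this.pending, chunk]);
    const bytesPerFrame = this.frameSize * 4; // 4 bytes per float32
    while (this.pending.length >= bytesPerFrame) {
      const frame: number[] = [];
      for (let i = 0; i < this.frameSize; i++) {
        frame.push(this.pending.readFloatLE(i * 4));
      }
      this.push(frame); // one complete activation vector
      this.pending = this.pending.slice(bytesPerFrame);
    }
    done();
  }
}

// e.g. fs.createReadStream('samples.raw').pipe(new FrameSplitter(100))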

A second issue is backpressure. I'm pessimistic about the question "how complex can an NN be?", and I'm fairly sure (having checked similar things with convnetjs) that the double ConvNet required for image recognition cannot process a B/W video stream at even 10 FPS on my i7 MacBook Pro.

The thing is that every feature should come with a good description of its use-case, otherwise it is either pointless or over-complicated. For this one I can definitely imagine something like WebRTCCameraStream.pipe(artisticStyleTransferNet), and I can imagine it working... but disappointment will follow, since the user experience will definitely not be smooth; it will be around 1 frame per 10 seconds.

On the other hand, streams can easily be added later, as internally we will transform them into event handling anyway.
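
For what it's worth, that layering is cheap. A rough sketch of what "added later" could mean (toWritable is hypothetical; only an activate() method is assumed on the network):

// Hypothetical sketch: wrap a plain activate() call in a Writable stream.
// Nothing here exists in synaptic today; it only shows that a stream
// adapter can sit on top of the existing callback/event style API.
import { Writable } from 'stream';

function toWritable(network: { activate(input: number[]): number[] }): Writable {
  return new Writable({
    objectMode: true,
    write(vector: number[], _enc, done) {
      network.activate(vector); // each written object is one activation
      done();
    }
  });
}

// usage: someObjectModeSource.pipe(toWritable(network))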

Also, I'd rather not pin down variable names or anything like that - we're working on a draft right now, right? Instead of abstract class StreamTrainer { _synaptic_network: Synaptic.Network; ... } we'd probably do better to just say that the network itself should be exposed while using the trainer, without describing the actual API, as that is totally premature right now.

menduz commented 7 years ago

@cazala and I came up with the idea on a train trip last week. This is a WIP draft.

Imagine this use case:

let src = new SynapticWebcam();

let network = new Network(/* ... */);

let output = new SynapticCanvas.Writer(myCanvas);

src.pipe(network).pipe(output);

It looks great. Nowadays synaptic is just a raw tool; imagine a lathe without any attachments: it looks cool, but it's very difficult to use without the right tooling.

We are defining an interface to allow pluggable tools.


Streams, emulated or not, are great interfaces for data flows because they are well defined and easy to implement. You are also able to read small chunks of data if you have a small number of inputs.

However, each stream must be designed to be pipeable to an N-input network and to accept any N-output network too. For example, if your network has only 100 inputs and your audio source emits 256-sample chunks, the input stream must transform the data and pipe 100-sample chunks. That transformation must be performed in the stream, not in the network (a re-chunking sketch follows the pseudocode below). On the stream side this is actually very simple to implement:

// pseudocode
class SynapticInputStream extends SynapticWritableStream {
  constructor(public network: Network) {
    super();
    this.on('pipe', (src: SynapticStream) => {
      // every chunk coming from the source activates the network
      src.on('data', (chunk) => this.network.activate(chunk));

      // if the source is itself a network, back-propagate through it
      if (src.network) {
        this.network.on('propagated', (learningRate, targets) => {
          src.network.propagate(learningRate, targets);
        });
      }
    });
  }
}
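
For the 256 → 100 example above, the re-chunking could live in a small transform like the following (again only a sketch: Rechunk is invented here, and an object-mode Node Transform is assumed):

// Hypothetical sketch of the 256-sample -> 100-input adaptation:
// buffer whatever the source emits and re-emit windows sized to the
// number of network inputs. Rechunk is not a real synaptic class.
import { Transform, TransformCallback } from 'stream';

class Rechunk extends Transform {
  private buffer: number[] = [];

  constructor(private inputs: number) {
    super({ objectMode: true });
  }

  _transform(samples: number[], _enc: BufferEncoding, done: TransformCallback) {
    this.buffer.push(...samples);
    while (this.buffer.length >= this.inputs) {
      this.push(this.buffer.splice(0, this.inputs)); // one network-sized chunk
    }
    done();
  }
}

// e.g. webAudioAdapter (256-sample chunks) -> new Rechunk(100) -> network input stream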

let webAudioAdapter = new SynapticWebAudio.Reader(64 /* samples */);
let output = new SynapticConsole.Writer();

let network1 = new Network(/* ... */);
let network2 = new Network(/* ... */);
let network3 = new Network(/* ... */);

network1.optimize(new WebWorkerOptimiser());
network2.optimize(new WebWorkerOptimiser());
network3.optimize(new WebWorkerOptimiser());

webAudioAdapter.pipe(network1).pipe(network2).pipe(network3).pipe(output);

// now you have 3 networks running in different threads, performing far more calculations in parallel than we can today

It doesn't have to live in the core implementation; it could be a separate package. It doesn't matter.


Regarding _synaptic_network: you are right, it's too premature to define that now.

Jabher commented 7 years ago

I definitely agree that this is not a very complicated thing to implement in a draft, and that it will not require any changes to the core.

However, each stream must be designed to be pipeable to an N-input network and to accept any N-output network too. For example, if your network has only 100 inputs and your audio source emits 256-sample chunks, the input stream must transform the data and pipe 100-sample chunks. That transformation must be performed in the stream, not in the network.

I'm not sure that's a good idea. A better approach would be to expect the network to be compatible with the stream.

No behavior should be hidden. We can hide some internal logic, but history tells us that "magic" usually produces 10 nice scenarios where it works and 1000 scenarios where developers are fighting it. Look at Angular 1 :)

But the note that some streams produce fixed-size chunks by themselves is a nice one.

And a second question: what is the expected behavior under backpressure? E.g. when the camera stream is producing 30 frames per second but the NN is only able to process 2, should frames be dropped, or queued up (leaking memory), or something else?

I bet that will be a common scenario.
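
To illustrate one of those two options (dropping), here is a keep-only-the-latest-frame gate, purely hypothetical and not a proposal for the core:

// Hypothetical illustration of the "drop frames" option: only the most
// recent frame is kept, and frames arriving while the network is busy
// simply overwrite it instead of queueing up and leaking memory.
class LatestFrameGate {
  private latest: number[] | null = null;
  private busy = false;

  constructor(private activate: (input: number[]) => Promise<number[]>) {}

  push(frame: number[]): void {
    this.latest = frame; // newer frames silently replace older unprocessed ones
    void this.drain();
  }

  private async drain(): Promise<void> {
    if (this.busy) return;
    this.busy = true;
    while (this.latest) {
      const frame = this.latest;
      this.latest = null;
      await this.activate(frame); // e.g. an async/worker-backed activation
    }
    this.busy = false;
  }
}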

menduz commented 7 years ago

Good question... I have to think about it. Any ideas?

Jabher commented 7 years ago

That's the most confusing point for me, actually :) Hmm, if we definitely want streams, using RxJS would probably be one of the best approaches here. It has a lot of ways to manipulate streams, it has Node bindings (OK, WebRTC bindings are missing, but that's not a big deal), and it can handle backpressure more or less properly.

I'd rather suggest keeping the core library lean, but implementing an rx-synaptic package too.
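
As an illustration of what an rx-synaptic adapter could look like (written with modern RxJS operators, which postdate this thread, so treat the exact API as an assumption; grabGrayscaleFrame and the network shape are hypothetical):

// Illustration only - throttleTime answers the backpressure question by
// skipping frames the network has no time for, instead of queueing them.
import { fromEvent } from 'rxjs';
import { map, throttleTime } from 'rxjs/operators';

declare const video: HTMLVideoElement;                               // webcam-backed <video>
declare function grabGrayscaleFrame(v: HTMLVideoElement): number[];  // hypothetical helper
declare const network: { activate(input: number[]): number[] };      // trained network

fromEvent(video, 'timeupdate').pipe(
  throttleTime(500),                    // at most ~2 activations per second
  map(() => grabGrayscaleFrame(video))  // frame -> flat input vector
).subscribe(input => network.activate(input));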