BrainJS / brain.js

🤖 GPU accelerated Neural networks in JavaScript for Browsers and Node.js
https://brain.js.org
MIT License

LSTM - Doing my own version of `word2vec` - how do I get *hidden* outputs? #208

Open josiahbryan opened 6 years ago

josiahbryan commented 6 years ago

Pretty picture: [image attachment]

Based on this article [1], the word2vec algo basically trains an LSTM to predict a word given its context, then (here's the key part for me), to get the vector for a given word, they feed the word to the trained LSTM model and take the output "at the hidden layer [as] the ‘word embedding’ of the input word."

My question therefore is, how can I get the "output at the hidden layer" of my lovely trained LSTM?

And WHY would I even want to do this? Well, my application has nothing to do with words, but still has to do with comparing contexts. I have a sequence of actions - for example, potty training a pet: "Wine" -> "Go Outside" -> "Go Potty" -> "Come Inside" -> "Get Rewarded". My hope is to be able to train the LSTM to predict the next action - great, lovely. But what I really want is to get the "vector embedding" of the input action - a vector representing the action, like word2vec has vectors representing words. Hence my desire to get the output at the hidden layer of the LSTM.

Doable - yay/nay?

[1] Reference article: https://towardsdatascience.com/word2vec-a-baby-step-in-deep-learning-but-a-giant-leap-towards-natural-language-processing-40fe4e8602ba
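
For the comparison this is building toward, here is a minimal sketch of how two embedding vectors are typically compared once you have them - cosine similarity is the standard word2vec-style measure. (`embeddingOf` is a hypothetical helper standing in for whatever extraction method this issue is asking about.)

```js
// Minimal sketch: cosine similarity between two embedding vectors.
// Assumes both vectors are plain number arrays of equal length.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical usage once embeddings are obtainable:
// cosineSimilarity(embeddingOf('go_potty'), embeddingOf('come_inside'))
// -> closer to 1 for actions that occur in similar contexts
```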

josiahbryan commented 6 years ago

Been working on this more... this is really critical for me now. I'd love to talk about how to accomplish this - that is, get the output of the hidden layer in an LSTM network. Thoughts?

robertleeplummerjr commented 6 years ago

> Doable - yay/nay?

Absolutely. In fact, the default in v1 simply tries to bind characters to neurons, whereas it sounds like you want to bind directly to words - or perhaps get inside the running neural net to achieve all sorts of interesting learnings.

A place to start looking would be the `DataFormatter` class: https://github.com/BrainJS/brain.js/blob/develop/src/utilities/data-formatter.js

Tests showing how it can be used: https://github.com/BrainJS/brain.js/blob/c58765c74b5968bea90f04d10c3270d63192b1d1/test/utilities/vocab.js

Use it like this:

```js
// DataFormatter comes from src/utilities/data-formatter.js, linked above
const net = new brain.recurrent.LSTM({
  dataFormatter: new DataFormatter(['Wine', 'Go', 'Outside', 'Potty', 'Come', 'Inside', 'Get', 'Rewarded'])
});
```
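
For intuition, a rough conceptual sketch of what a formatter like this does: it assigns each distinct token a stable integer index so the net can bind a neuron per word instead of per character. (This is illustrative only, not the actual `DataFormatter` implementation; see the linked source for the real thing.)

```js
// Conceptual sketch only - not the real DataFormatter.
// Maps each distinct token to a stable integer index and back.
class TokenTable {
  constructor(tokens) {
    this.tokens = [...new Set(tokens)]; // distinct tokens, in first-seen order
    this.indexTable = {};               // token -> index
    this.tokens.forEach((token, i) => { this.indexTable[token] = i; });
  }
  toIndexes(sequence) {
    return sequence.map(token => this.indexTable[token]);
  }
  toTokens(indexes) {
    return indexes.map(i => this.tokens[i]);
  }
}

// new TokenTable(['Go', 'Outside', 'Go', 'Potty']).toIndexes(['Go', 'Potty']) // -> [0, 2]
```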

There is also more slicing and dicing you can do in: https://github.com/BrainJS/brain.js/blob/develop/src/recurrent/rnn.js#L316

This will give you whatever level of granularity you'd like, though it's not totally ideal - you'd just extend the LSTM class with your own. I want to point out that this will be getting easier in v2: what happens in run has more to do with running the layers, and this type of transformation is part of the layers, so it simply becomes configuration.

In any case, here is a word neural network that should perform much better than its character neural network counterpart: https://jsfiddle.net/robertleeplummerjr/novxdht9/1/

```js
var net = new brain.recurrent.LSTM();

net.train([
  ['Go', 'Outside'],
  ['Go', 'Potty'],
  ['Come', 'Inside'],
  ['Get', 'Rewarded']
]);

console.log(net.run(['Go'])); // -> 'Potty'
```

josiahbryan commented 6 years ago

Beautiful! I am out and about and away from my computer right now, but I will play with this concept as soon as I am able. I appreciate your insightful input and the excellent links you provided, looking forward to getting into the code.

josiahbryan commented 6 years ago

Hey, I was just reading through your comments and looking at the code - great pointers on the input side of things - but I didn't really grok how to get at the hidden output of the network.

So, say I train a 3-layer net (input, hidden, output) on a set of sequences, for example:

```js
var net = new brain.recurrent.LSTM();

net.train([
  ['go_outside', 'go_potty'],
  ['go_potty', 'come_inside'],
  ['come_inside', 'get_reward']
]);
```

Say training goes great and now `net.run('go_potty')` gives `come_inside`.

What I really want is to do `net.run('go_potty')` and completely ignore the result - instead, somehow inspect `net` and say (pseudocode) `net.getLayer("hidden").nodes[x].output`.

If I could get all the outputs of the "hidden" nodes, they would form the vector embedding of `go_potty`...

Can you point me to how to access the outputs of the hidden nodes? I greatly appreciate your patience and help, thanks for putting up with all these questions!

robertleeplummerjr commented 6 years ago

I'll see if I can be helpful, but in v1 this will not seem very intuitive.

```js
// Matrix, copy, softmax, sampleI and maxI are brain.js v1 internals
// (see src/recurrent/matrix/ in the repo); pull them in from wherever
// you have the brain.js source available.

class Word2VecLSTM extends brain.recurrent.LSTM {
  constructor(options) {
    super(options);
    this.outputs = null;
  }
  run(rawInput = [], maxPredictionLength = 100, isSampleI = false, temperature = 1) {
    if (!this.isRunnable) return null;
    this.outputs = [];
    const input = this.formatDataIn(rawInput);
    const model = this.model;
    const output = [];
    let i = 0;
    while (model.equations.length < maxPredictionLength) {
      this.bindEquation();
    }
    while (true) {
      let previousIndex = (i === 0
        ? 0
        : i < input.length
          ? input[i - 1] + 1
          : output[i - 1]);
      let equation = model.equations[i];
      // sample predicted letter
      let outputMatrix = equation.run(previousIndex);
      // capture the raw output of the net at this step for later inspection
      let logProbabilities = new Matrix(model.output.rows, model.output.columns);
      copy(logProbabilities, outputMatrix);
      this.outputs.push(logProbabilities);
      if (temperature !== 1 && isSampleI) {
        /**
         * scale log probabilities by temperature and re-normalize.
         * if temperature is high, logProbabilities will go towards zero,
         * and the softmax outputs will be more diffuse. if temperature is
         * very low, the softmax outputs will be more peaky
         */
        for (let j = 0, max = logProbabilities.weights.length; j < max; j++) {
          logProbabilities.weights[j] /= temperature;
        }
      }

      let probs = softmax(logProbabilities);
      let nextIndex = (isSampleI ? sampleI(probs) : maxI(probs));

      i++;
      if (nextIndex === 0) {
        // END token predicted, break out
        break;
      }
      if (i >= maxPredictionLength) {
        // something is wrong
        break;
      }

      output.push(nextIndex);
    }

    /**
     * We slice off the input length here, not because output contains it,
     * but because the leading values are erroneous: while we are feeding
     * the network what is contained in input, it is essentially guessing
     * what could come next, until it locks in on a value.
     * Roughly, while values are coming from input:
     * 0 -> 4 (in English: "beginning of input" -> "no idea? I'll guess what they want next!")
     * 2 -> 2 (oh how interesting, I've narrowed down the values...)
     * 1 -> 9 (oh how interesting, now I know what the values are...)
     * The output then looks like [4, 2, 9, ...],
     * so we remove the erroneous leading data to get our true output.
     */
    return this.formatDataOut(
      input,
      output
        .slice(input.length)
        .map(value => value - 1)
    );
  }
}
```

Then you can just do:

```js
const net = new Word2VecLSTM();
net.train(trainingData); // your sequences from above (placeholder name)
net.run(input);          // e.g. ['go_potty'] (placeholder name)
console.log(net.outputs);

// Do more with outputs...
```
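
To turn those captured matrices into a plain vector you can actually compare, a small sketch - assuming each entry in `net.outputs` is a brain.js `Matrix` exposing its values on a `weights` array, as the v1 internals do:

```js
// Sketch: flatten the last captured output matrix into a plain array.
// Assumes each entry in net.outputs is a Matrix with a `weights` array.
const lastMatrix = net.outputs[net.outputs.length - 1];
const embedding = Array.from(lastMatrix.weights);

// `embedding` can now be compared across inputs,
// e.g. with the cosine similarity sketch earlier in this thread.
```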

Note: I just grabbed the default RNN.run method and added a couple of this.outputs lines to append the outputs of the net to it. The v2 counterpart for this is still speculation (ie, still in my mind, not yet started), but the Recurrent building block (the simplest form of a recurrent net, which forms the vector part of what you need) has been built as a proof of concept. If you'd like to start getting your hands dirty, I welcome the energy.

If you have no interest in v2, stop here. Continuing: This is the same old cruft that I've mostly seen from machine learning library to machine learning library. It probably makes scientists super happy, but it isn't very practical. Internally it operates much more like React or Redux, but it isn't fun and it surely isn't something I want to teach my kids.

I'm hoping we can get some sort of strategy that will be much more English-like (self-intuitive, self-explaining, forthcoming, easy, fun) than the above. Like, if you wanted to put the net together in a Word2VecLSTM, you could do something like:

```js
import { Recurrent, layer } from 'brain.js';
const { inputSymbols, lstm, output } = layer;

const word2VecLSTM = new Recurrent({
  inputLayer: () => inputSymbols({ options }),
  hiddenLayers: [
    (input, recurrentValue) => lstm({ options }, input, recurrentValue), // first hidden layer
    (input, recurrentValue) => lstm({ options }, input, recurrentValue), // second hidden layer
    (input, recurrentValue) => lstm({ options }, input, recurrentValue), // third hidden layer
  ],
  outputLayer: (input) => output({ options }, input)
});

word2VecLSTM.train();
word2VecLSTM.run();
```

Note: inputSymbols doesn't yet exist, but pretty much everything else for v2 does.

Warning: I'm showing off the new architecture to gain followers and potentially help, and as well to continue and try to start a revolution. As you can see, the upcoming version goes from being cryptic, obtuse, and pedantic to being composable and straightforward - or at least that is the idea. Building a simple wrapper to fit close to the existing feedforward net, like `NeuralNetwork({ })`, becomes super easy.

The idea is like this: it is a disservice to the world to build a neural network exactly the way I need it and call it a library when it is so specific to a problem that all it solves is my problem. If we are creating a library that is worth a revolution, it shouldn't limit creativity, it should harness it.

Just for my own amusement, the existing brain.recurrent.LSTM class could be replaced by something like:

```js
import { Recurrent, layer } from 'brain.js';
const { inputSymbols, lstm, outputSymbols } = layer;

const superRevolutionStartingReplacementLSTM = new Recurrent({
  inputLayer: () => inputSymbols({ options }),
  hiddenLayers: [
    (input, recurrentValue) => lstm({ options }, input, recurrentValue), // first hidden layer
    (input, recurrentValue) => lstm({ options }, input, recurrentValue), // second hidden layer
    (input, recurrentValue) => lstm({ options }, input, recurrentValue), // third hidden layer
  ],
  outputLayer: (input) => outputSymbols({ options }, input)
});

superRevolutionStartingReplacementLSTM.train();
superRevolutionStartingReplacementLSTM.run();
```

Note: inputSymbols and outputSymbols don't currently exist, but pretty much everything else does.

Hopefully this is seen less as a rant and more as a call to action, even for myself. I'm currently working on some convolution layers for v2, for the folks who need them for face recognition (@justadudewhohacks, you know who you are ;)), and then I'm going to build the above.

LiamDobbelaere commented 6 years ago

@robertleeplummerjr Why is the error rate of your example so high? If I run:

```js
var net = new brain.recurrent.LSTM();
net.train([
  ['Go', 'Outside'],
  ['Go', 'Potty'],
  ['Come', 'Inside'],
  ['Get', 'Rewarded']
], { iterations: 20000, log: true, logPeriod: 100 });

console.log(net.run(['Come']));
```

```
iterations: 19900 training error: 4.032459660935966
```

I'll get 'Potty' as well - its response to anything seems to be 'Potty', and it hasn't learned the word associations at all. The training error stays above 4, so this seems to be a pretty awful example. Do you have any idea why this is?
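
For reference, a variant of the same run with an error-threshold stop. `errorThresh` is a documented brain.js training option; treating it as honored by the recurrent nets here is an assumption, and as the follow-ups below show, the real cause turned out to be a library bug rather than a tuning problem.

```js
// Sanity-check variant: stop on a target error instead of a fixed
// iteration count. Tweaking options like this is only a knob check;
// the underlying issue was fixed in the library itself (see below).
var net = new brain.recurrent.LSTM();
net.train([
  ['Go', 'Outside'],
  ['Go', 'Potty'],
  ['Come', 'Inside'],
  ['Get', 'Rewarded']
], { iterations: 20000, errorThresh: 0.011, log: true, logPeriod: 100 });

console.log(net.run(['Come'])); // expected: 'Inside'
```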

robertleeplummerjr commented 6 years ago

lol, Potty....

robertleeplummerjr commented 6 years ago

I'm back in the office today, so I'll need to get caught up, then check this out.

robertleeplummerjr commented 6 years ago

@TomDobbelaere resolved here: https://github.com/BrainJS/brain.js/pull/261. It was a long-standing issue that turned out to be very simple to resolve.

LiamDobbelaere commented 6 years ago

Strange, I still have the same problem. I tried installing brain.js via npm first, then I thought maybe that's not the latest version, so I installed the master branch from GitHub - still the same issue.

Finally I checked the v1.x branch, because that's where you merged it, so I did `npm install git://github.com/BrainJS/brain.js.git#v1.x --save`, but it still gives me 'Potty' for everything.

robertleeplummerjr commented 5 years ago

Is this still an issue in v1.5.1?

LiamDobbelaere commented 5 years ago

Looks like it. I installed 1.5.1 (also tried 1.5.2) and ran the same example. It's still the same problem - it always goes to 'Potty' no matter what.

robertleeplummerjr commented 2 years ago

Closing due to inactivity.

josiahbryan commented 2 years ago

Hmmm... why was this closed? Tom confirmed 2 years ago it was an issue. Was it fixed?


robertleeplummerjr commented 2 years ago

> Hmmm... why was this closed?

Sorry, I should have done a better job here. I believe this issue to be resolved as of at least version 2.0.0-beta.14. @TomDobbelaere, care to try it again? I've been trying to clean things up so we can move forward with other initiatives.