Trouble training an identity function

spiffytech commented 7 years ago

I'm training an identity function with Neataptic, but I'm not getting accurate results. What am I doing wrong? (I'm super novice at machine learning, so I assume I'm making a simple mistake.)

const n = require('neataptic');

const vocab = Array.from(new Set('Ad libero officiis quisquam asperiores dolore fugit laborum Voluptatum minus animi iure et Eos eos modi recusandae necessitatibus Ex ad iste officia et veniam voluptas est Dolorem modi omnis praesentium voluptatum quaerat Est fuga error deleniti voluptatem Illo sapiente voluptatem voluptatem fuga sint excepturi Eum omnis in esse reprehenderit dolorum Sapiente aliquam aliquam qui modi qui optio Reprehenderit sit quisquam qui iste eius harum ut illum Qui aliquid aut repellat dicta et earum iste qui Vitae architecto excepturi enim fugiat est'.split(' ')));

function wordToOneHot(word) {
    if (word === undefined || vocab.indexOf(word) === -1) throw new Error('Must supply a word');
    const arr = new Array(vocab.length).fill(0);
    arr[vocab.indexOf(word)] = 1
    return arr;
}

function oneHotToWord(arr) {
    const max = Math.max(...arr)  // Our most confident guess
    return vocab[arr.findIndex((val) => val === max)];  // Vocab index of our guess
}

if (oneHotToWord(wordToOneHot('libero')) !== 'libero') throw new Error('onehot/word conversion is broken');

async function main() {
    const trainingSet = vocab.map((word) => ({input: wordToOneHot(word), output: wordToOneHot(word)}));

    var network = new n.Network(
        trainingSet[0].input.length,
        trainingSet[0].output.length
    );
    await network.evolve(trainingSet, {rate: 0.05, error: 0.03});

    const answers = [
        network.activate(wordToOneHot('Ad')),
        network.activate(wordToOneHot('libero')),
        network.activate(wordToOneHot('officiis')),
    ]

    answers.forEach((answer) => {
        console.log(Math.max(...answer), oneHotToWord(answer));
    });
}

main().catch((err) => console.error(err.stack));

$ node index2.js
0.9577011902425551 'illum'
0.5646135919078905 'voluptatum'
0.11638099585491998 'voluptas'
308.32s user 17.64s system 141% cpu 3:49.64s total

$ node index2.js
0.6241707852550931 'Eum'
0.2394144596577064 'dolorum'
0.4665537543785485 'iure'
290.26s user 16.93s system 139% cpu 3:40.09s total

wagenaartje commented 7 years ago

I like your coding style, very clean! But what do you mean by training an 'identity function'? Nowhere in your code do I see the identity function being used.

And what exactly are you training? If the dataset has relations over time, you should definitely tell the .evolve function to use methods.mutation.ALL.

spiffytech commented 7 years ago

Thanks!

By training an identity function, I mean I'm trying to teach the neural net a direct mapping between the input bits and the output bits, so that if I activate it with a one-hot encoded word, I get the same one-hot encoded word back out.

I was originally trying to evolve a network following the same training procedure as word2vec, but I wasn't getting sensible activation outputs. So I started simplifying to the easiest similar problem I could think of (this), and found I still wasn't getting sensible outputs.

wagenaartje commented 7 years ago

Hmm. I tried some small tests with limited time and noticed that for small datasets it would perform fine.

However, keep in mind that the 'identity' problem is not as easy for the neuro-evolution algorithm as it is for us humans. A network is not aware of its own topology, so it is very hard for it to remove every connection from the input nodes to a given output node except for one (the one at the same index, which the algorithm has no way of knowing is special).
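To make the topology point concrete, here is a minimal sketch in plain JavaScript. It does not use Neataptic, and `identityWeights` and `denseActivate` are hypothetical helper names, not part of any library. It assumes a single fully connected linear layer: for `size` inputs and `size` outputs there are `size * size` connections, and an exact identity mapping needs every one of them except the diagonal to carry zero weight.

```javascript
// Hypothetical helpers for illustration only; this is not how Neataptic
// represents a network internally.

// Build a size x size weight matrix with 1 on the diagonal and 0 elsewhere.
function identityWeights(size) {
    return Array.from({length: size}, (_, row) =>
        Array.from({length: size}, (_, col) => (row === col ? 1 : 0)));
}

// Activate one linear layer: output[j] = sum over i of input[i] * weights[i][j].
function denseActivate(weights, input) {
    return weights[0].map((_, j) =>
        input.reduce((sum, x, i) => sum + x * weights[i][j], 0));
}

const size = 5;
const weights = identityWeights(size);
const input = [0, 0, 1, 0, 0];  // one-hot vector for index 2
const output = denseActivate(weights, input);

// Count how many connections had to be zeroed out to get the identity.
const totalConnections = size * size;
const zeroConnections = weights.flat().filter((w) => w === 0).length;

console.log(output);
console.log(`${zeroConnections} of ${totalConnections} connections must be zero`);
```

The number of weights that have to be driven to zero grows quadratically with the vocabulary size, which is one way to see why the search gets harder as the dataset grows.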

I'll give it a few more tries later. But have a look at all the options available for neuro-evolution; most of the time, these issues come down to trial and error. (Btw, rate is not an option for the .evolve function)
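As a rough illustration of an options object without `rate`, here is a hedged sketch. The option names (`error`, `iterations`, `popsize`, `elitism`, `mutationRate`) are recalled from the Neataptic documentation and should be verified against the version in use. The values are arbitrary placeholders, not tuned recommendations, and the note that `rate` belongs to `.train()` is likewise an assumption to check.

```javascript
// Assumed option names; verify them against the Neataptic docs for your version.
const evolveOptions = {
    error: 0.03,        // target error, as in the original call
    iterations: 1000,   // placeholder: upper bound on generations
    popsize: 100,       // placeholder: genomes per generation
    elitism: 10,        // placeholder: best genomes carried over unchanged
    mutationRate: 0.5,  // placeholder: probability that a genome is mutated
};

// Guard against options that were meant for .train(), such as `rate` (assumption).
if ('rate' in evolveOptions) throw new Error('rate is not an evolve option');

// Intended use inside main() from the original post:
//   await network.evolve(trainingSet, evolveOptions);
console.log(Object.keys(evolveOptions).join(', '));
```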

spiffytech commented 7 years ago

That explanation makes sense. I'm still confused about how the network decides it's finished training (presumably reaching error <= 0.03) if it gets all my tests wrong.
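One possible contributor, offered as a sketch rather than a diagnosis: if the error is a mean squared error averaged over all output nodes (an assumption about the cost function in use), then with one-hot targets of length N, a network that outputs all zeros already scores 1/N. For any vocabulary larger than 33 words, that is below 0.03 even though no word is predicted correctly. The vocabulary size of 50 below is hypothetical.

```javascript
// Mean squared error averaged over output nodes (assumed cost function).
function meanSquaredError(target, output) {
    const sum = target.reduce((acc, t, i) => acc + (t - output[i]) ** 2, 0);
    return sum / target.length;
}

const vocabSize = 50;  // hypothetical vocabulary size
const target = new Array(vocabSize).fill(0);
target[7] = 1;  // one-hot target for an arbitrary word index

const allZeros = new Array(vocabSize).fill(0);  // a network that predicts nothing
const error = meanSquaredError(target, allZeros);

// The only nonzero squared difference is the single 1 in the target,
// so the error is 1 / vocabSize.
console.log(error, error <= 0.03);
```

Under that assumption, a low error value says little about whether the largest output lands on the right index, so checking accuracy on the argmax separately is a more direct test of the identity behaviour.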

wagenaartje / neataptic

Trouble training an identity function #77