Closed: jeffheaton closed this issue 5 years ago
+1 to this
Would it be possible to adapt this code:
https://github.com/JANNLab/JANNLab/blob/master/src/main/java/de/jannlab/generator/LSTMGenerator.java#L248
https://github.com/JANNLab/JANNLab/blob/master/examples/de/jannlab/examples/generator/LSTMGeneratorExample.java
to do something like Encog's FreeformNetwork.createElman:
+1 too
another +1
I've been meaning to take a stab at extending Encog with LSTM for some time now.
I'm working on this also. Any collaborators willing to bandy ideas about?
UPDATE: I have it working with Encog using freeform networks. I'm using separate blocks of training rates, as per http://arxiv.org/pdf/1206.1106.pdf, for the different gates. Anyone interested can contact me :)
Josh
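For anyone wondering what "blocks of training rates" for the gates could look like, here is a minimal sketch in plain Java. It is not Josh's code and it is not the adaptive scheme from the linked paper: it only shows plain gradient descent with one fixed rate per named block of weights. The class name `BlockRateSketch` and the block names are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: gradient descent with one fixed rate per weight block. */
public class BlockRateSketch {
    // Maps a block name (e.g. "inputGate") to its own learning rate.
    private final Map<String, Double> rates = new LinkedHashMap<>();

    public void setRate(String block, double rate) {
        rates.put(block, rate);
    }

    /** Updates the weights of one block in place: w -= rate(block) * gradient. */
    public void update(String block, double[] weights, double[] gradients) {
        Double rate = rates.get(block);
        if (rate == null) {
            throw new IllegalArgumentException("No rate set for block: " + block);
        }
        if (weights.length != gradients.length) {
            throw new IllegalArgumentException("Weight/gradient length mismatch");
        }
        for (int i = 0; i < weights.length; i++) {
            weights[i] -= rate * gradients[i];
        }
    }
}
```

Usage would be one `setRate` call per gate (input, forget, output, cell candidate), then one `update` call per gate after each backward pass.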
@Joshuaalbert
What's the status of your work? Can you publish it in a repository?
@Turakar It is nearly working. I'm now working on a temporal connectionist training method for it, to allow training on unaligned patterns.
Our latest version is here. It's pure Java (it does not involve Encog at all) and has an experimental continuous/online learning mode:
https://github.com/automenta/narchy/tree/skynet1/logic/src/main/java/nars/learn/lstm
It would be nice to add the new Grid LSTM as well.
Any updates on this? I'd really like to see it added.
I don't have any updates on the LSTM code, but here's a new variation that might be good too: https://arxiv.org/abs/1709.02755
I am going to close at this point, as I am not actively looking at adding every possible algorithm to Encog. At this point I am mainly adding ones that I have need of and are not well represented elsewhere. Also, it sounds like LSTM's future may be uncertain, based on several sources, but summarized nicely here: https://towardsdatascience.com/the-fall-of-rnn-lstm-2d1594c74ce0
There have been several requests to add an LSTM network to Encog. There is some discussion of it here: http://www.heatonresearch.com/comment/1231#comment-1231
Wikipedia entry: http://en.wikipedia.org/wiki/Long_short_term_memory
More formal description: ftp://ftp.idsia.ch/pub/juergen/lstm.pdf
At this point I am unfamiliar with this architecture, so I am adding this issue to track it. Any suggestions/comments are welcome.
Initial thoughts: could this be implemented with the freeform networks, or would it be better to create a new MLMethod?
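To make the question more concrete, here is a rough sketch of what a single LSTM layer has to compute at each time step, written in plain Java with no Encog classes. It assumes the common variant with a forget gate, random initial weights, and forward pass only (no training). The class name `LstmCellSketch` and its methods are hypothetical, not part of Encog or JANNLab.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical stand-alone LSTM layer: forward pass only, no Encog types. */
public class LstmCellSketch {
    private final int inputSize;
    private final int hiddenSize;
    // One matrix per gate; each row is [input weights | recurrent weights | bias].
    private final double[][] wInput, wForget, wOutput, wCandidate;
    private final double[] hidden; // previous hidden state h(t-1)
    private final double[] cell;   // previous cell state c(t-1)

    public LstmCellSketch(int inputSize, int hiddenSize, long seed) {
        this.inputSize = inputSize;
        this.hiddenSize = hiddenSize;
        Random rnd = new Random(seed);
        wInput = randomMatrix(rnd);
        wForget = randomMatrix(rnd);
        wOutput = randomMatrix(rnd);
        wCandidate = randomMatrix(rnd);
        hidden = new double[hiddenSize];
        cell = new double[hiddenSize];
    }

    private double[][] randomMatrix(Random rnd) {
        int cols = inputSize + hiddenSize + 1;
        double[][] m = new double[hiddenSize][cols];
        for (double[] row : m) {
            for (int j = 0; j < cols; j++) {
                row[j] = (rnd.nextDouble() - 0.5) * 0.2; // small values in [-0.1, 0.1)
            }
        }
        return m;
    }

    private static double sigmoid(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    /** Weighted sum of the current input, the previous hidden state, and the bias. */
    private double preActivation(double[] row, double[] x) {
        double sum = row[inputSize + hiddenSize]; // bias term
        for (int i = 0; i < inputSize; i++) sum += row[i] * x[i];
        for (int j = 0; j < hiddenSize; j++) sum += row[inputSize + j] * hidden[j];
        return sum;
    }

    /** Advances the layer by one time step and returns the new hidden state. */
    public double[] step(double[] x) {
        if (x.length != inputSize) {
            throw new IllegalArgumentException("Expected input of length " + inputSize);
        }
        double[] newHidden = new double[hiddenSize];
        double[] newCell = new double[hiddenSize];
        for (int k = 0; k < hiddenSize; k++) {
            double in = sigmoid(preActivation(wInput[k], x));       // input gate
            double forget = sigmoid(preActivation(wForget[k], x));  // forget gate
            double out = sigmoid(preActivation(wOutput[k], x));     // output gate
            double cand = Math.tanh(preActivation(wCandidate[k], x)); // candidate value
            newCell[k] = forget * cell[k] + in * cand;
            newHidden[k] = out * Math.tanh(newCell[k]);
        }
        // Commit the new state only after every unit has read the old one.
        System.arraycopy(newHidden, 0, hidden, 0, hiddenSize);
        System.arraycopy(newCell, 0, cell, 0, hiddenSize);
        return newHidden;
    }

    /** Clears the recurrent state, e.g. between independent sequences. */
    public void reset() {
        Arrays.fill(hidden, 0.0);
        Arrays.fill(cell, 0.0);
    }
}
```

The part that differs from an ordinary feedforward or Elman layer is the multiplicative gating and the separate cell state carried between steps, so whichever route is chosen would need a way to express those.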