Closed: daanvdn closed this issue 3 years ago
@daanvdn Thanks for raising this; I agree we could use some documentation on it.
The more feedback and input we get from you the better. Have you found any of our documentation pages to be helpful or lacking? If so, which pages?
Hi @tomthetrainer, a while ago @AlexDBlack pointed out to me that my configuration above can't be right because I have an output layer in the middle of the network. Are there any snippets in the examples repo that show how it should be done? Thanks!
Hi @daanvdn, I could not find any example code. Perhaps @eraly or @turambar have some insight.
I have a request for documentation relating to the combined use of unsupervised pretraining and supervised training.
The documentation online, the dl4j book, and the dl4j-examples repo all give good information on how to configure a standalone deep neural net that does unsupervised training, for instance using stacked denoising autoencoders. All of this documentation also explains what neural nets of this type can be used for, e.g. learning better feature representations, which can then be used as inputs when training a classifier in a supervised fashion.
What I am missing in terms of documentation, though, are guidelines and code samples that show how to set up a single `MultiLayerConfiguration` that combines both pretraining and supervised training layers.
Looking at the implementation of `org.deeplearning4j.nn.multilayer.MultiLayerNetwork#fit(org.nd4j.linalg.dataset.api.iterator.DataSetIterator)`, I've tried to figure out what such a configuration might look like. What I've come up with is a config that passes the input (which is time-series data) through stacked denoising autoencoders, reducing the 7000 features to 300, and then uses that output as input for a mixed feedforward + LSTM network that trains a multi-label classifier.
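In outline, something like the following. This is only a sketch (untested, against the 0.x-era DL4J API); the seed, the intermediate layer sizes, and `nLabels` are placeholders I've chosen for illustration:

```java
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.inputs.InputType;
import org.deeplearning4j.nn.conf.layers.AutoEncoder;
import org.deeplearning4j.nn.conf.layers.DenseLayer;
import org.deeplearning4j.nn.conf.layers.GravesLSTM;
import org.deeplearning4j.nn.conf.layers.RnnOutputLayer;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.lossfunctions.LossFunctions;

int nIn = 7000;      // features per time step in the raw input
int nReduced = 300;  // compressed representation produced by the autoencoders
int nLabels = 50;    // placeholder: number of labels in the multi-label problem

MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
    .seed(123)
    .list()
    // Unsupervised part: stacked denoising autoencoders, trained layer-wise during pretraining.
    .layer(0, new AutoEncoder.Builder().nIn(nIn).nOut(1000)
            .corruptionLevel(0.3).activation(Activation.RELU).build())
    .layer(1, new AutoEncoder.Builder().nIn(1000).nOut(nReduced)
            .corruptionLevel(0.3).activation(Activation.RELU).build())
    // Supervised part: feedforward + LSTM on the 300-dim compressed features.
    .layer(2, new DenseLayer.Builder().nIn(nReduced).nOut(nReduced)
            .activation(Activation.RELU).build())
    .layer(3, new GravesLSTM.Builder().nIn(nReduced).nOut(200)
            .activation(Activation.TANH).build())
    // Sigmoid + XENT: one independent probability per label, i.e. multi-label output.
    .layer(4, new RnnOutputLayer.Builder(LossFunctions.LossFunction.XENT)
            .activation(Activation.SIGMOID).nIn(200).nOut(nLabels).build())
    // Time-series input; this inserts the rnn<->feedforward preprocessors automatically.
    .setInputType(InputType.recurrent(nIn))
    .pretrain(true)   // run layer-wise pretraining on the AutoEncoder layers first
    .backprop(true)   // then supervised backprop through the whole stack
    .build();
```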
Questions:

1. How are epochs handled with a config like this, i.e. can I run, say, 10 epochs of pretraining followed by 10 epochs of supervised training? This leads me to wonder whether the solution would be to train two separate `MultiLayerNetwork`s (one for pretraining and another for the actual training, each for 10 epochs; see the sketch at the end of this post). This would obviously not be ideal.
2. Still related to the epoch question: can the `EarlyStoppingTrainer` be used with a config that combines pretraining and training?
3. Can `MultiLayerNetwork#computeGradientAndScore` and `MultiLayerNetwork#epsilon` be used to capture the contribution of inputs to the predicted class(es)?

Thanks in advance!
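For reference, the two-network fallback from the first question would look roughly like this. It is only a sketch (untested, against the 0.x-era API); `pretrainConf`, `supervisedConf`, `trainIter`, and `compressedIter` are hypothetical placeholders:

```java
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;

// pretrainConf: a MultiLayerConfiguration holding only the autoencoder layers,
// built with pretrain(true) and backprop(false).
MultiLayerNetwork pretrainNet = new MultiLayerNetwork(pretrainConf);
pretrainNet.init();
for (int epoch = 0; epoch < 10; epoch++) {
    trainIter.reset();
    pretrainNet.fit(trainIter); // with pretrain(true), fit() runs layer-wise unsupervised training
}

// compressedIter: a DataSetIterator whose features are the 300-dim activations of
// pretrainNet, e.g. obtained by mapping each example through pretrainNet.feedForward(..)
// and keeping the last layer's output.
MultiLayerNetwork supervisedNet = new MultiLayerNetwork(supervisedConf); // pretrain(false), backprop(true)
supervisedNet.init();
for (int epoch = 0; epoch < 10; epoch++) {
    compressedIter.reset();
    supervisedNet.fit(compressedIter); // plain supervised backprop on the compressed features
}
```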