Open mmuratarat opened 5 years ago
Hi @mmuratarat,
I've worked with the AdaNet framework on RNN-type networks (simple RNN, LSTM, and GRU).
To train the mixture weights, AdaNet can use either the logits or the last layer of the subnetwork candidates. I force AdaNet to use the logits rather than the last layer when creating the AdaNet Estimator, as follows:
```python
import adanet
import tensorflow as tf
from adanet.ensemble import MixtureWeightType
from adanet.ensemble.weighted import ComplexityRegularizedEnsembler

# SCALAR (and VECTOR) mixture weights combine the subnetworks' logits;
# MATRIX mixture weights combine their last layers.
ensembler = ComplexityRegularizedEnsembler(
    mixture_weight_type=MixtureWeightType.SCALAR,
    adanet_lambda=1.e-3)

adanet_estimator = adanet.Estimator(
    ensemblers=[ensembler],
    max_iteration_steps=500,
    subnetwork_generator=...,
    head=...,
    config=...,
    evaluator=adanet.Evaluator(
        input_fn=...,
        steps=None))
...
results, _ = tf.estimator.train_and_evaluate(adanet_estimator, train_spec, eval_spec)
```
Hope this helps.
@mmuratarat: We've successfully trained all kinds of RNNs with different cells like `lstm`, `cudnn_lstm`, and `gru`. Like @dataforcast mentions, you will need to create custom `adanet.subnetwork.Generator` and `adanet.subnetwork.Builder` subclasses that use `tf.nn.dynamic_rnn()`. You can look at SimpleDNN for inspiration.
Hello! I'm new to adanet. Could you please tell me whether I have to build the LSTM model myself, rather than using the LSTM API provided in Keras or TensorFlow, when using adanet? If so, could you please give an example of using an LSTM with adanet (the sample you used to test RNNs)? Thank you very much.
@tx2010011751 You should be able to use `tf.contrib.estimator.RNNEstimator` with `adanet.AutoEnsembleEstimator` if you want to try ensembling RNN models. As a first step, try getting `tf.contrib.estimator.RNNEstimator` to train on its own, and next try it in adanet.
I can easily understand why DNNs and CNNs apply well to AdaNet, since all subnetworks have the same architecture and hyperparameters. However, I am curious whether it is possible to apply LSTMs to AdaNet. I cannot fully comprehend how to incorporate `tf.nn.dynamic_rnn()`, because when we have multiple LSTM cells it would be something like: `outputs, states = tf.nn.dynamic_rnn([cell1, cell2, ..., cell_n], inputs=X, swap_memory=False, time_major=False, dtype=tf.float32)`
Can you provide any insight about that? Thank you so much!