hedonistrh / TurkishMusicGeneration


Check architectures of FolkRNN and Melody-RNN #34

Open hedonistrh opened 5 years ago

hedonistrh commented 5 years ago

One-line summary (optional)

Reason

Design

Acceptance Criteria

Outcome

hedonistrh commented 5 years ago

Folk-RNN

Melody RNN

hedonistrh commented 5 years ago

If I am not wrong, for the basic RNN they feed 102 notes as input to predict the next one. I used TensorBoard to get this information via their Docker image.

(screenshot: screen shot 2019-03-04 at 22 19 05)

<img width="100" alt="screen shot 2019-03-04 at 22 19 12" src="https://user-images.githubusercontent.com/34948815/53763691-b3f56d80-3ecb-11e9-9257-fd6a682e5fee.png">
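If that reading of the graph is right, the training pairs would look roughly like the sketch below. This is only an illustration of the 102-step window, not Magenta's actual input pipeline, and make_pairs, event_ids and num_classes are placeholder names of my own.

import numpy as np

window = 102  # input length read from the TensorBoard graph above

def make_pairs(event_ids, num_classes):
    # Split one encoded melody into (102-step window, next event) pairs.
    X, y = [], []
    for i in range(len(event_ids) - window):
        X.append(event_ids[i:i + window])   # 102 consecutive events as input
        y.append(event_ids[i + window])     # the event the model should guess
    X = np.eye(num_classes)[np.array(X, dtype=int)]  # (N, 102, num_classes)
    y = np.array(y, dtype=int)                       # (N,)
    return X, y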

hedonistrh commented 5 years ago

For Folk RNN, I have found a different paper.

We use two approaches to modelling and generating ABC. The architecture of our models involves 3 hidden layers with 512 LSTM units each, and a softmax output layer given the distribution over the vocabulary conditioned on the one-hot encoded input. We train each model by backpropagation with one-hot encoded vectors, a mini-batch approach (batch size 50), and with drop out of 0.5. We train our first model using dataset A, with sequences of 50 characters. In the above ABC, examples of characters are “M”, “>” and “:”. The size of the vocabulary is 134.

Now we can, hopefully, reproduce their architecture. ⛱
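As a first step toward that, here is a rough sketch of how the ABC corpus could be turned into one-hot encoded sequences of 50 characters. The abc_text variable and the exact slicing are my own assumptions, not code from their repository.

import numpy as np

# Hypothetical pre-processing sketch: abc_text is the concatenated ABC corpus.
chars = sorted(set(abc_text))                    # should end up with 134 symbols for their data
char_to_idx = {c: i for i, c in enumerate(chars)}

seq_length = 50
X, y = [], []
for i in range(len(abc_text) - seq_length):
    window = abc_text[i:i + seq_length]          # 50-character input window
    target = abc_text[i + seq_length]            # the character to predict
    X.append([char_to_idx[c] for c in window])
    y.append(char_to_idx[target])

# One-hot encode: X -> (N, 50, vocab_size), y -> (N, vocab_size)
vocab_size = len(chars)
X_onehot = np.eye(vocab_size)[np.array(X, dtype=int)]
y_onehot = np.eye(vocab_size)[np.array(y, dtype=int)]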

hedonistrh commented 5 years ago

This is the code snippet for Folk RNN with Keras. Config file - Related paper - Train Code

import math

from keras import initializers
from keras.callbacks import LearningRateScheduler
from keras.layers import LSTM, Dense, Dropout, Activation
from keras.models import Sequential
from keras.optimizers import RMSprop

def step_decay(epoch):
    # Step-wise decay: multiply the learning rate by 0.97 every 20 epochs.
    initial_lrate = 0.003
    drop = 0.97
    epochs_drop = 20.0
    lrate = initial_lrate * math.pow(drop,
            math.floor((1 + epoch) / epochs_drop))
    return lrate
# lrate_scheduler = LearningRateScheduler(step_decay)
# Note that we will use the decaying learning rate when we train the model.

rnn_size = 512     # LSTM units per hidden layer, as in the paper
vocab_size = 134   # size of the ABC character vocabulary
seq_length = 50    # characters per training sequence
clip_value = 5     # gradient clipping threshold

model = Sequential()
# Three stacked LSTM layers of 512 units each; Constant(5) sets all LSTM
# biases to 5 (presumably to keep the forget gates open early in training).
model.add(LSTM(rnn_size, activation="tanh", return_sequences=True,
                                kernel_initializer='glorot_uniform',
                                bias_initializer=initializers.Constant(5),
                                input_shape=(seq_length, vocab_size)))
model.add(LSTM(rnn_size, activation="tanh", return_sequences=True,
                                kernel_initializer='glorot_uniform',
                                bias_initializer=initializers.Constant(5)))
model.add(LSTM(rnn_size, activation="tanh",
                                kernel_initializer='glorot_uniform',
                                bias_initializer=initializers.Constant(5)))
model.add(Dropout(0.5))
# Softmax output over the vocabulary, giving the next-character distribution.
model.add(Dense(vocab_size))
model.add(Activation('softmax'))
optimizer = RMSprop(lr=0.001, clipvalue=clip_value)
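To wire this up for training, a minimal sketch of the remaining steps; the model.compile / model.fit calls and the X_onehot, y_onehot arrays are my assumption here, not the repo's actual train code (batch size 50 comes from the paper quoted above).

# Assumed data shapes: X_onehot is (num_sequences, seq_length, vocab_size)
# with one-hot characters, y_onehot is (num_sequences, vocab_size) with the
# one-hot next character.
model.compile(loss='categorical_crossentropy', optimizer=optimizer)

lrate_scheduler = LearningRateScheduler(step_decay)
model.fit(X_onehot, y_onehot, batch_size=50, epochs=100,
          callbacks=[lrate_scheduler])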

hedonistrh commented 5 years ago

Code snippet is ready for basic_rnn 😋

model.add(LSTM(rnn_size, activation="tanh"))
model.add(Dropout(0.5))
model.add(Dense(vocab_size))
model.add(Activation('softmax'))
optimizer = Adam(lr=0.001, clipvalue=clip_value)
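For reference, here is that snippet assembled into a self-contained model, reusing the 102-step input length read from TensorBoard above; the concrete rnn_size and vocab_size values and the compile call are placeholders of my own, not Magenta's confirmed defaults.

from keras.models import Sequential
from keras.layers import LSTM, Dense, Dropout, Activation
from keras.optimizers import Adam

seq_length = 102   # input window length, from the TensorBoard graph above
rnn_size = 128     # placeholder unit count
vocab_size = 38    # placeholder event-vocabulary size
clip_value = 5

model = Sequential()
model.add(LSTM(rnn_size, activation="tanh",
               input_shape=(seq_length, vocab_size)))
model.add(Dropout(0.5))
model.add(Dense(vocab_size))
model.add(Activation('softmax'))
optimizer = Adam(lr=0.001, clipvalue=clip_value)
model.compile(loss='categorical_crossentropy', optimizer=optimizer)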

hedonistrh commented 5 years ago

For attention_rnn

P.S. For attention, check this.
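For reference, a minimal sketch of one way to put attention on top of LSTM outputs in Keras. This is a generic additive-style attention layer of my own, not Magenta's attention_rnn mechanism (which applies attention inside the RNN cell over its most recent outputs), so treat it only as a starting point.

from keras import backend as K
from keras.layers import Layer

class SimpleAttention(Layer):
    # Weights each timestep of an LSTM output sequence and sums them
    # into a single context vector.

    def build(self, input_shape):
        # One scoring weight per feature dimension.
        self.w = self.add_weight(name="att_w",
                                 shape=(input_shape[-1], 1),
                                 initializer="glorot_uniform",
                                 trainable=True)
        super(SimpleAttention, self).build(input_shape)

    def call(self, inputs):
        # inputs: (batch, timesteps, features)
        scores = K.squeeze(K.dot(inputs, self.w), axis=-1)  # (batch, timesteps)
        weights = K.softmax(scores)                         # attention weights
        # Weighted sum over the timesteps -> one context vector per sequence.
        return K.sum(K.expand_dims(weights, axis=-1) * inputs, axis=1)

    def compute_output_shape(self, input_shape):
        return (input_shape[0], input_shape[-1])

It could then replace the final (non-return_sequences) LSTM: for example model.add(LSTM(rnn_size, return_sequences=True)), then model.add(SimpleAttention()), then the usual Dense + softmax output.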