hedonistrh / TurkishMusicGeneration


Check architectures of FolkRNN and Melody-RNN #34

Open hedonistrh opened 5 years ago

hedonistrh commented 5 years ago

One-line summary (optional)

Reason

Design

Acceptance Criteria

Outcome

hedonistrh commented 5 years ago

Folk-RNN

Melody RNN

hedonistrh commented 5 years ago

If I am not wrong, for the basic RNN they feed 102 notes as input to predict the next one. I used TensorBoard to get this information via their Docker image.

(screenshot: screen shot 2019-03-04 at 22 19 05)

<img width="100" alt="screen shot 2019-03-04 at 22 19 12" src="https://user-images.githubusercontent.com/34948815/53763691-b3f56d80-3ecb-11e9-9257-fd6a682e5fee.png">
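If that reading of the graph is right, the training pairs would look roughly like the sketch below. This is only an illustration of the 102-step window, not Magenta's actual input pipeline, and make_pairs, event_ids and num_classes are placeholder names of my own.

import numpy as np

window = 102  # input length read from the TensorBoard graph above

def make_pairs(event_ids, num_classes):
    # Split one encoded melody into (102-step window, next event) pairs.
    X, y = [], []
    for i in range(len(event_ids) - window):
        X.append(event_ids[i:i + window])   # 102 consecutive events as input
        y.append(event_ids[i + window])     # the event the model should guess
    X = np.eye(num_classes)[np.array(X, dtype=int)]  # (N, 102, num_classes)
    y = np.array(y, dtype=int)                       # (N,)
    return X, y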

hedonistrh commented 5 years ago

For Folk RNN, I have found a different paper.

We use two approaches to modelling and generating ABC. The architecture of our models involves 3 hidden layers with 512 LSTM units each, and a softmax output layer given the distribution over the vocabulary conditioned on the one-hot encoded input. We train each model by backpropagation with one-hot encoded vectors, a mini-batch approach (batch size 50), and with drop out of 0.5. We train our first model using dataset A, with sequences of 50 characters. In the above ABC, examples of characters are “M”, “>” and “:”. The size of the vocabulary is 134.

Now we can, hopefully, reproduce their architecture. ⛱
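As a first step toward that, here is a rough sketch of how the ABC corpus could be turned into one-hot encoded sequences of 50 characters. The abc_text variable and the exact slicing are my own assumptions, not code from their repository.

import numpy as np

# Hypothetical pre-processing sketch: abc_text is the concatenated ABC corpus.
chars = sorted(set(abc_text))                    # should end up with 134 symbols for their data
char_to_idx = {c: i for i, c in enumerate(chars)}

seq_length = 50
X, y = [], []
for i in range(len(abc_text) - seq_length):
    window = abc_text[i:i + seq_length]          # 50-character input window
    target = abc_text[i + seq_length]            # the character to predict
    X.append([char_to_idx[c] for c in window])
    y.append(char_to_idx[target])

# One-hot encode: X -> (N, 50, vocab_size), y -> (N, vocab_size)
vocab_size = len(chars)
X_onehot = np.eye(vocab_size)[np.array(X, dtype=int)]
y_onehot = np.eye(vocab_size)[np.array(y, dtype=int)]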

hedonistrh commented 5 years ago

This is the code snippet for Folk RNN with Keras. Config file - Related paper - Train Code

import math

from keras import initializers
from keras.callbacks import LearningRateScheduler
from keras.layers import LSTM, Dense, Dropout, Activation
from keras.models import Sequential
from keras.optimizers import RMSprop

def step_decay(epoch):
    # Step-wise decay: multiply the learning rate by 0.97 every 20 epochs.
    initial_lrate = 0.003
    drop = 0.97
    epochs_drop = 20.0
    lrate = initial_lrate * math.pow(drop,
            math.floor((1 + epoch) / epochs_drop))
    return lrate
# lrate_scheduler = LearningRateScheduler(step_decay)
# Note that we will use the decaying learning rate when we train the model.

rnn_size = 512     # LSTM units per hidden layer, as in the paper
vocab_size = 134   # size of the ABC character vocabulary
seq_length = 50    # characters per training sequence
clip_value = 5     # gradient clipping threshold

model = Sequential()
# Three stacked LSTM layers of 512 units each; Constant(5) sets all LSTM
# biases to 5 (presumably to keep the forget gates open early in training).
model.add(LSTM(rnn_size, activation="tanh", return_sequences=True,
                                kernel_initializer='glorot_uniform',
                                bias_initializer=initializers.Constant(5),
                                input_shape=(seq_length, vocab_size)))
model.add(LSTM(rnn_size, activation="tanh", return_sequences=True,
                                kernel_initializer='glorot_uniform',
                                bias_initializer=initializers.Constant(5)))
model.add(LSTM(rnn_size, activation="tanh",
                                kernel_initializer='glorot_uniform',
                                bias_initializer=initializers.Constant(5)))
model.add(Dropout(0.5))
# Softmax output over the vocabulary, giving the next-character distribution.
model.add(Dense(vocab_size))
model.add(Activation('softmax'))
optimizer = RMSprop(lr=0.001, clipvalue=clip_value)
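To wire this up for training, a minimal sketch of the remaining steps; the model.compile / model.fit calls and the X_onehot, y_onehot arrays are my assumption here, not the repo's actual train code (batch size 50 comes from the paper quoted above).

# Assumed data shapes: X_onehot is (num_sequences, seq_length, vocab_size)
# with one-hot characters, y_onehot is (num_sequences, vocab_size) with the
# one-hot next character.
model.compile(loss='categorical_crossentropy', optimizer=optimizer)

lrate_scheduler = LearningRateScheduler(step_decay)
model.fit(X_onehot, y_onehot, batch_size=50, epochs=100,
          callbacks=[lrate_scheduler])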

hedonistrh commented 5 years ago

Code snippet is ready for basic_rnn 😋

model.add(LSTM(rnn_size, activation="tanh"))
model.add(Dropout(0.5))
model.add(Dense(vocab_size))
model.add(Activation('softmax'))
optimizer = Adam(lr=0.001, clipvalue=clip_value)
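For reference, here is that snippet assembled into a self-contained model, reusing the 102-step input length read from TensorBoard above; the concrete rnn_size and vocab_size values and the compile call are placeholders of my own, not Magenta's confirmed defaults.

from keras.models import Sequential
from keras.layers import LSTM, Dense, Dropout, Activation
from keras.optimizers import Adam

seq_length = 102   # input window length, from the TensorBoard graph above
rnn_size = 128     # placeholder unit count
vocab_size = 38    # placeholder event-vocabulary size
clip_value = 5

model = Sequential()
model.add(LSTM(rnn_size, activation="tanh",
               input_shape=(seq_length, vocab_size)))
model.add(Dropout(0.5))
model.add(Dense(vocab_size))
model.add(Activation('softmax'))
optimizer = Adam(lr=0.001, clipvalue=clip_value)
model.compile(loss='categorical_crossentropy', optimizer=optimizer)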

hedonistrh commented 5 years ago

For attention_rnn

P.S. For attention, check this.
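For reference, a minimal sketch of one way to put attention on top of LSTM outputs in Keras. This is a generic additive-style attention layer of my own, not Magenta's attention_rnn mechanism (which applies attention inside the RNN cell over its most recent outputs), so treat it only as a starting point.

from keras import backend as K
from keras.layers import Layer

class SimpleAttention(Layer):
    # Weights each timestep of an LSTM output sequence and sums them
    # into a single context vector.

    def build(self, input_shape):
        # One scoring weight per feature dimension.
        self.w = self.add_weight(name="att_w",
                                 shape=(input_shape[-1], 1),
                                 initializer="glorot_uniform",
                                 trainable=True)
        super(SimpleAttention, self).build(input_shape)

    def call(self, inputs):
        # inputs: (batch, timesteps, features)
        scores = K.squeeze(K.dot(inputs, self.w), axis=-1)  # (batch, timesteps)
        weights = K.softmax(scores)                         # attention weights
        # Weighted sum over the timesteps -> one context vector per sequence.
        return K.sum(K.expand_dims(weights, axis=-1) * inputs, axis=1)

    def compute_output_shape(self, input_shape):
        return (input_shape[0], input_shape[-1])

It could then replace the final (non-return_sequences) LSTM: for example model.add(LSTM(rnn_size, return_sequences=True)), then model.add(SimpleAttention()), then the usual Dense + softmax output.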