josephjaspers / blackcat_tensors

Matrix-vector library designed for neural network construction: CUDA (GPU) support, OpenMP (multithreaded CPU) support, partial BLAS support, an expression-template-based implementation whose generated PTX is identical to hand-written kernels, and support for auto-differentiation.

Could a Dropout layer be supported? #70

Open xinsuinizhuan opened 2 years ago

xinsuinizhuan commented 2 years ago

The network overfits easily; could adding some dropout layers solve this problem?

josephjaspers commented 2 years ago

Hi! How are you doing?

I haven't worked on this in quite some time (though I have thought about splitting the vector library and the neural-network library apart)!

I could look into adding dropout. (It should just be a matter of adding a masking layer, which actually shouldn't be too hard.) I am not sure when I will start working on it, though!
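
For concreteness, here is a rough sketch of what such a masking layer does, written as plain NumPy (illustration only; the names here are made up for the example and are not part of the BlackCat Tensors API):

    # Inverted dropout as a masking layer: zero each activation with
    # probability p during training, and rescale the survivors by 1/(1-p)
    # so the expected activation is unchanged.
    import numpy as np

    rng = np.random.default_rng(0)

    def dropout_forward(x, p=0.2, training=True):
        if not training:
            return x, None                      # dropout is a no-op at inference time
        mask = (rng.random(x.shape) >= p) / (1.0 - p)
        return x * mask, mask                   # keep the mask for the backward pass

    def dropout_backward(dy, mask):
        return dy * mask                        # gradient flows only through kept units

    x = np.ones((4, 3))
    y, mask = dropout_forward(x, p=0.5)
    dx = dropout_backward(np.ones_like(y), mask)

Because of the 1/(1-p) rescaling during training, the layer can simply be skipped at inference time with no extra adjustment.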

I imagine TensorFlow/PyTorch would be an easier way to tackle your problem though :)

xinsuinizhuan commented 2 years ago

It is good news to get your reply! I am fine too. This forecasting project has to keep going; I wrote it in C++, while TensorFlow/PyTorch are Python. I need to solve the overfitting problem, and I have collected some forecasting network structures, as follows:

1、

    model = Sequential()
    model.add(LSTM(units=50, return_sequences=True, input_shape=(forecast_features_set.shape[1], 1)))
    model.add(Dropout(0.2))
    model.add(LSTM(units=50, return_sequences=True))
    model.add(Dropout(0.2))
    model.add(LSTM(units=50, return_sequences=True))
    model.add(Dropout(0.2))
    model.add(LSTM(units=50))
    model.add(Dropout(0.2))
    model.add(Dense(units=1))
    model.compile(optimizer='adam', loss='mean_squared_error')

2、

    model = Sequential()
    # layer 1: LSTM
    model.add(LSTM(input_dim=1, output_dim=150, return_sequences=True))
    model.add(Dropout(0.2))
    # layer 2: LSTM
    model.add(LSTM(output_dim=200, return_sequences=False))
    model.add(Dropout(0.2))
    # layer 3: dense, linear activation: a(x) = x
    model.add(Dense(output_dim=1, activation='linear'))
    # show model
    model.summary()
    # compile the model
    sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
    model.compile(loss='mse', optimizer="rmsprop")
    # train the model
    model.fit(X_train, y_train, batch_size=512, nb_epoch=100, validation_split=0.05, verbose=2)
    # save model
    model.save('../model/dwtlstm' + type + '.h5')

3、

    # design network
    model = Sequential()
    model.add(LSTM(64, input_shape=(train_X.shape[1], train_X.shape[2])))
    model.add(Dense(1))
    model.compile(loss='mae', optimizer='adam')
    # fit network
    history = model.fit(train_X, train_y, epochs=50, batch_size=72,
                        validation_data=(test_X, test_y), verbose=2, shuffle=False)
    # plot history
    pyplot.plot(history.history['loss'], label='train')
    pyplot.plot(history.history['val_loss'], label='test')
    pyplot.legend()
    pyplot.show()

4、

    model = Sequential()
    layers = [1, 75, 100, prediction_steps]
    model.add(LSTM(layers[1], input_shape=(None, layers[0]), return_sequences=True))  # add first layer
    model.add(Dropout(0.2))                   # add dropout for first layer
    model.add(LSTM(layers[2], return_sequences=False))  # add second layer
    model.add(Dropout(0.2))                   # add dropout for second layer
    model.add(Dense(layers[3]))               # add output layer
    model.add(Activation('linear'))           # output layer with linear activation
    start = time.time()
    model.compile(loss="mse", optimizer="rmsprop")
    print('Compilation Time : ', time.time() - start)
    return model

5、

    # build the model
    model = Sequential()
    # layer 1: LSTM
    model.add(LSTM(input_dim=1, output_dim=50, return_sequences=True))
    model.add(Dropout(0.2))
    # layer 2: LSTM
    model.add(LSTM(output_dim=100, return_sequences=False))
    model.add(Dropout(0.2))
    # layer 3: dense, linear activation: a(x) = x
    model.add(Dense(output_dim=1, activation='linear'))
    # compile the model
    model.compile(loss="mse", optimizer="rmsprop")
    # train the model
    model.fit(X_train, y_train, batch_size=512, nb_epoch=50, validation_split=0.05, verbose=1)

To some extent a Dropout layer can solve the overfitting, so I am turning to you for it.
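
For reference, the snippets above use the deprecated Keras 1.x spellings (output_dim=, nb_epoch=, lr=). A sketch of structure 1 against the current tf.keras API might look like the following, where timesteps is an assumed stand-in for forecast_features_set.shape[1]:

    # structure 1 rewritten for modern tf.keras (sketch; adjust timesteps to your data)
    from tensorflow import keras
    from tensorflow.keras import layers

    timesteps = 60  # assumed window length

    model = keras.Sequential([
        layers.Input(shape=(timesteps, 1)),
        layers.LSTM(50, return_sequences=True),
        layers.Dropout(0.2),
        layers.LSTM(50, return_sequences=True),
        layers.Dropout(0.2),
        layers.LSTM(50, return_sequences=True),
        layers.Dropout(0.2),
        layers.LSTM(50),
        layers.Dropout(0.2),
        layers.Dense(1),
    ])
    model.compile(optimizer='adam', loss='mean_squared_error')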

xinsuinizhuan commented 2 years ago

And what about training on the GPU? When I build with GPU support I get many compile errors (CUDA 11.0).

xinsuinizhuan commented 2 years ago

Any update on the Dropout layer and the GPU build?