keras-team / keras

Deep Learning for humans
http://keras.io/
Apache License 2.0

While retraining a model, the MSE at the first epoch of the next training run is worse than at the last epoch of the previous run #19813

Closed Dtar380 closed 3 months ago

Dtar380 commented 5 months ago

I'm using a Sequential model for regression.

This is the code I'm using for training:

def massTrain(self):

        # Train on every CSV file found in the Data directory
        csvs = listdir('Data')

        for i, csv in enumerate(csvs):

            print(f'\nTraining with {csv}\n')

            # Use all of each file for training, except the last file,
            # where 10% is held out for testing
            if i < len(csvs) - 1:
                self.setData(csv, 1)
            else:
                self.setData(csv, 0.9)

            if test_data:
                self.predictions = self.test(test_data)

                y_test = self.dataset[self.training_data_len:, :]

                # Compare the held-out values with the predictions
                self.mse = mean_squared_error(y_test, self.predictions)
                self.rmse = sqrt(self.mse)

                print(f"\nRMSE was: {self.rmse}\n")

def train(self, train_data):

        # Split data into x_train and y_train data sets
        x_train = []
        y_train = []

        for i in range(self.input_shape, len(train_data)):
            x_train.append(train_data[i - self.input_shape:i, 0])
            y_train.append(train_data[i, 0])

        # Convert x_train and y_train to NP arrays
        x_train, y_train = np.array(x_train), np.array(y_train)

        # Reshape the data
        x_train = np.reshape(x_train, (x_train.shape[0], x_train.shape[1], 1))

        # Train the model
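        # Stop early if the training loss has not improved for 6 epochs,
        # and restore the weights from the best epoch
        earlystopping = callbacks.EarlyStopping(
            monitor="loss", mode="min", patience=6, restore_best_weights=True
        )

        self.model.fit(x_train, y_train, batch_size=16, epochs=50, callbacks=[earlystopping])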

The rest of the code is irrelevant here: the model trains, saves, and plots graphs after training without problems.

I'm using 10 CSV files that hold the data. I get the file names with listdir() and train the model on each file in turn. For the last file I train on only 90% of the data, then test the model on the remaining 10% and plot a graph.
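The loop over files described above could be sketched roughly as follows. `split_series` is a hypothetical stand-in for the `setData` method, whose body is not shown in this issue; the sketch assumes the second argument is the fraction of rows used for training and that the split is chronological. The ten arrays are synthetic placeholders for the CSV contents.

```python
import numpy as np

def split_series(values, train_fraction):
    """Hypothetical helper: chronological train/test split of an (n, 1) array."""
    train_len = round(len(values) * train_fraction)
    return values[:train_len], values[train_len:]

# Ten synthetic stand-ins for the CSV files (assumption: one value column each)
files = [np.arange(100, dtype=float).reshape(-1, 1) for _ in range(10)]

for i, values in enumerate(files):
    # Every file is used in full, except the last, which keeps 10% for testing
    fraction = 1 if i < len(files) - 1 else 0.9
    train_data, test_data = split_series(values, fraction)
    print(i, train_data.shape, test_data.shape)
```

With this split, only the last file yields a non-empty test set, which matches the description above.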

For example, when training the model on the first dataset, I get an MSE of 7e-4 on the last epoch of that run. Then, on the first epoch with the next dataset, I get 0.0012. That is 5e-4 higher, an increase of roughly 71% (the earlier MSE is only about 58% of the new one).
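The size of that jump can be worked out directly from the two MSE values quoted above. The variable names are illustrative only.

```python
mse_last_epoch = 7e-4      # last epoch on the first dataset
mse_first_epoch = 0.0012   # first epoch on the next dataset

absolute_increase = mse_first_epoch - mse_last_epoch
relative_increase = absolute_increase / mse_last_epoch
ratio_old_to_new = mse_last_epoch / mse_first_epoch

print(f"absolute increase: {absolute_increase:.1e}")
print(f"relative increase: {relative_increase:.0%}")
print(f"old MSE as a share of new MSE: {ratio_old_to_new:.0%}")
```

Note that MSE is an error measure, not an accuracy, so the ratio describes how much the error grew rather than a loss of accuracy.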

Am I doing something wrong when retraining the model? The only explanation I can think of is that the weights are not being kept after fitting, so the model starts from scratch every time and all the previous fitting is useless.
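One way to test this hypothesis is to capture the weights at the very start of the second `fit()` call and compare them with the weights left by the first call. The sketch below assumes Keras 3 (`import keras`) and uses a tiny synthetic dataset with a single `Dense` layer instead of the model from this issue; every name in it is illustrative. In Keras, `fit()` does not reinitialize a model, so the comparison is expected to report that the weights were preserved.

```python
import numpy as np
import keras

rng = np.random.default_rng(0)

def make_dataset(n=256):
    # Synthetic regression data: y is a fixed linear function of x plus small noise
    x = rng.normal(size=(n, 4)).astype("float32")
    y = (x @ np.array([1.0, -2.0, 0.5, 3.0], dtype="float32")).reshape(-1, 1)
    return x, y + rng.normal(scale=0.01, size=y.shape).astype("float32")

model = keras.Sequential([keras.layers.Input(shape=(4,)), keras.layers.Dense(1)])
model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.05),
              loss="mean_squared_error")

x_a, y_a = make_dataset()
x_b, y_b = make_dataset()

# First training run
model.fit(x_a, y_a, batch_size=16, epochs=20, verbose=0)
weights_after_first = [w.copy() for w in model.get_weights()]

# Loss on the second dataset before training on it, with the carried-over weights
loss_b_before = model.evaluate(x_b, y_b, verbose=0)

# Record the weights the model holds when the second fit() call begins
captured = {}
grab = keras.callbacks.LambdaCallback(
    on_train_begin=lambda logs: captured.update(
        weights=[w.copy() for w in model.get_weights()]
    )
)
model.fit(x_b, y_b, batch_size=16, epochs=1, verbose=0, callbacks=[grab])

weights_preserved = all(
    np.array_equal(a, b) for a, b in zip(weights_after_first, captured["weights"])
)
print("weights preserved between fit() calls:", weights_preserved)
print("loss on second dataset before retraining:", loss_b_before)
```

If the weights are preserved, a higher first-epoch loss may simply mean the new file is harder for the current weights. The loss logged for an epoch is also a running average over its batches, so calling `evaluate()` on the new file before fitting gives a cleaner baseline than the first-epoch value.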

Dtar380 commented 5 months ago

I updated the code because it wasn't right. I had copied it from my test file by accident, and it was not the code I'm actually running (mainly because it wouldn't work).

Now, to add some useful data, here is how the model is compiled:

# Compile the model
model.compile(optimizer='adam', loss='mean_squared_error', metrics=[RootMeanSquaredError()])

Here is the actual data from the training run I'm doing right now. Because I'm using an EarlyStopping callback that restores the best weights, I'm including three rows for the first dataset's training.

First dataset training:

divyashreepathihalli commented 4 months ago

@Dtar380 I have a few questions; the answers would help me understand your issue better.

github-actions[bot] commented 3 months ago

This issue is stale because it has been open for 14 days with no activity. It will be closed if no further activity occurs. Thank you.

github-actions[bot] commented 3 months ago

This issue was closed because it has been inactive for 28 days. Please reopen if you'd like to work on this further.
