keras-team / tf-keras

The TensorFlow-specific implementation of the Keras API, which was the default Keras from 2019 to 2023.
Apache License 2.0

Neural Network generates zeros as predictions #82

Open vrunm opened 1 year ago

vrunm commented 1 year ago

I have data on financial transactions and I am trying to fit a neural network to it. Here is some background on the data.

I have stored it in a DataFrame with the date as the index and the transactions (debits) of each branch in a separate column, named by its branch_id.

I used the pandas interpolate method to fill the missing values with the time-based filling method, i.e. df.interpolate(method='time', axis=0, inplace=True).

Since banking happens only on business days, there was no data for gazetted holidays and weekends, so I filled zeros against those dates: df.fillna(0, inplace=True).

I have used the Keras Sequential model to create a neural network. The code is given below:
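The preprocessing above can be sketched on a toy series (the dates, values, and branch column name here are illustrative, not from the actual data):

```python
import numpy as np
import pandas as pd

# Hypothetical daily series with a missing business day (2023-01-03) and
# days absent from the index entirely (holidays/weekends).
idx = pd.to_datetime(["2023-01-02", "2023-01-03", "2023-01-06", "2023-01-09"])
df = pd.DataFrame({"branch_001": [100.0, np.nan, 160.0, 220.0]}, index=idx)

# 'time' interpolation weights by the gap between timestamps, so it
# requires a DatetimeIndex; a plain RangeIndex would raise an error.
df = df.interpolate(method="time", axis=0)

# Reindex to every calendar day, then fill the newly exposed non-business
# days with zeros, mirroring df.fillna(0, inplace=True) in the issue.
full = df.reindex(pd.date_range(idx.min(), idx.max(), freq="D")).fillna(0)
```

Note that 2023-01-03 is interpolated against elapsed time (100 at 01-02 and 160 at 01-06 gives 115 one day in), while the reindexed holiday/weekend rows become exact zeros, which is the pattern the model later sees.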

 from keras.models import Sequential
 from keras.layers import Dense, LSTM, Flatten, BatchNormalization, Dropout
 from keras.optimizers import Adam

 model = Sequential()
 model.add(BatchNormalization())
 model.add(Dense(32, activation='relu'))
 model.add(Dense(64, activation='relu'))
 model.add(Dense(128, activation='relu'))
 model.add(Flatten())
 model.add(Dense(1, activation='relu'))
 model.compile(loss='mean_squared_error', optimizer=Adam())
 model.fit(X, y, epochs=epochs, batch_size=31, verbose=0)

Now, when I fit the model to each branch's data separately, it fits some branches very well, while for the rest it generates zeros when I make predictions, even on the same training data.

Below is the code that I am using:

def train_neural_network(actual_data, tbats_predictions, model, batch_size=31, epochs=100):

    X = tbats_predictions
    y = actual_data

    model.add(BatchNormalization())
    model.add(Dense(32, activation='relu'))
    model.add(Dense(64, activation='relu'))
    model.add(Dense(128, activation='relu'))
    model.add(Flatten())

    model.add(Dense(1, activation='relu'))

    model.compile(loss='mean_squared_error', optimizer=Adam())

    # Train the model in batches, using the function's parameters
    model.fit(X, y, epochs=epochs, batch_size=batch_size, verbose=0)

    # Return the trained model
    return model

hybrid_forecast = {}

for col in df_dep.columns:
    act = np.array(train_data_deposit[col])
    act = act.reshape(-1, 1)

    preds = np.array(predictions[col])
    preds = preds.reshape(-1, 1)

    model = Sequential()
    model = train_neural_network(act, preds, model)

    forecast = model.predict(np.array(validation_data_deposit[col]).reshape(-1, 1))

    hybrid_forecast[col] = forecast.squeeze()

I am not sure what the reason for this is. I have inspected train_data_deposit and validation_data_deposit for missing values or NaNs, but none are present. Could it be due to the presence of zeros on weekends? If so, why does it work for some of the branches while failing for others?

tilakrayal commented 1 year ago

@vrunm, I was facing a different issue while executing the mentioned code. Kindly find the gist of it here and provide the complete dependencies so we can analyse the issue. Thank you!

vrunm commented 1 year ago

Also, in the code you provided, the error is related to loading the data. I changed model.add(Dense(1, activation='relu')) to model.add(Dense(1, activation='linear')).

Changing the activation function to linear solved my problem, but I don't understand why ReLU was causing issues, or how changing it to linear solves them.
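For what it's worth, the zero predictions are consistent with how ReLU behaves on an output layer: if the learned pre-activations of the final Dense(1) unit come out negative for some branches, ReLU clamps every one of them to zero, while a linear activation passes them through. A minimal NumPy sketch (the pre-activation values are made up for illustration):

```python
import numpy as np

# Hypothetical pre-activations of a final Dense(1) layer. If training
# drives them negative, ReLU zeroes them out; linear leaves them intact.
pre_activations = np.array([-3.2, -0.5, -1.1, 2.4])

relu_out = np.maximum(0.0, pre_activations)  # ReLU: max(0, x)
linear_out = pre_activations                 # linear: identity
```

Here three of the four ReLU outputs collapse to 0.0, which matches the all-zero forecasts reported for some branches; with a linear output the same pre-activations would survive as valid (possibly negative) predictions.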

athnzc commented 1 year ago

Hello, have you checked your data for negative values, or other values that don't make sense for your dataset? Regarding negative values specifically, there is a chance you get zeros because ReLU outputs 0 for any negative input.

Also, have you normalized your dataset? I can see in your code that you use batch normalization for the input layer; while batch normalization can sometimes stand in for scaling your data, it is not effective in all cases, so it might be wise to normalize your data beforehand. These are more technical details and I'm not sure they would necessarily help (I'm still learning myself).

Also, it's not very clear to me what you are trying to accomplish. What is your target variable, i.e. what are you trying to predict? This looks like a regression problem; if that's the case, then a linear activation function in the output layer seems appropriate.
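As a minimal sketch of normalizing beforehand (plain NumPy standardization with made-up sample values; scikit-learn's StandardScaler does the equivalent):

```python
import numpy as np

# Illustrative input column, e.g. the TBATS predictions for one branch.
X = np.array([[120.0], [0.0], [340.0], [95.0]])

# Standardize to zero mean / unit variance before fitting, rather than
# relying on a BatchNormalization layer alone.
mu, sigma = X.mean(axis=0), X.std(axis=0)
X_scaled = (X - mu) / sigma

# Invert the transform on model outputs to return to the original scale.
X_restored = X_scaled * sigma + mu
```

If the targets are scaled the same way, the model's raw outputs must be passed through the inverse transform before being compared with the unscaled actuals.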

vrunm commented 1 year ago

I have checked my data for negative values. Also the values in my dataset are normalized and scaled.