keras-team / keras-io

Keras documentation, hosted live at
Apache License 2.0

Inconsistency when running the keras-io/examples/timeseries/ #1908

Open condor-cp opened 1 month ago

condor-cp commented 1 month ago

Running the example gives a different number of parameters and different model performance from what is displayed on the page.

It seems that the global average pooling layer needs data_format="channels_first" to reproduce the parameter count and the accuracy shown in the displayed console log (tried with Google Colab).

But then no pooling actually takes place: the layer just removes the feature dimension. Maybe another layer should be used.
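
To illustrate the point above, here is a minimal NumPy sketch. It does not call Keras; `global_average_pooling_1d` is a hypothetical stand-in that only mimics the documented reduction axes of `GlobalAveragePooling1D` (averaging over the steps axis, which is axis 1 for `channels_last` and axis 2 for `channels_first`). The batch size of 4 and the random data are assumptions for illustration; the per-sample shape (500, 1) is the one reported in this thread.

```python
import numpy as np


def global_average_pooling_1d(x, data_format="channels_last"):
    """Hypothetical NumPy stand-in for the reduction in GlobalAveragePooling1D."""
    # channels_last:  input is (batch, steps, features) -> average over axis 1
    # channels_first: input is (batch, features, steps) -> average over axis 2
    axis = 1 if data_format == "channels_last" else 2
    return x.mean(axis=axis)


# Illustrative batch with the per-sample shape from the thread: (500, 1).
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 500, 1))

pooled_last = global_average_pooling_1d(x, "channels_last")
pooled_first = global_average_pooling_1d(x, "channels_first")

print(pooled_last.shape)   # (4, 1): the 500 time steps collapse to one mean
print(pooled_first.shape)  # (4, 500): the length-1 axis is averaged away

# Averaging over an axis of length 1 changes no values; it only drops the axis.
print(np.allclose(pooled_first, x[:, :, 0]))  # True
```

Under these assumptions, `channels_first` behaves like a squeeze of the last axis, which matches the observation that no real pooling happens, while `channels_last` reduces each series to a single number.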

mw66 commented 2 days ago

I experienced the same problem. With

    x = layers.GlobalAveragePooling1D(data_format="channels_last")(x)

the parameter counts are as follows (showing only the last few rows, which are the ones that differ):

 global_average_pooling1d (Glob  (None, 1)           0           ['tf.__operators__.add_7[0][0]']
 dense (Dense)                  (None, 128)          256         ['global_average_pooling1d[0][0]'
 dropout_8 (Dropout)            (None, 128)          0           ['dense[0][0]']
 dense_1 (Dense)                (None, 2)            258         ['dropout_8[0][0]']

Total params: 29,258
Trainable params: 29,258
Non-trainable params: 0

Training also stops very quickly, with a poor result:

45/45 [==============================] - 24s 545ms/step - loss: 0.6922 - sparse_categorical_accuracy: 0.5208 - val_loss: 0.6952 - val_sparse_categorical_accuracy: 0.4799
42/42 [==============================] - 4s 91ms/step - loss: 0.6930 - sparse_categorical_accuracy: 0.5159

whereas the example page shows:

│ global_average_poo… │ (None, 500)       │       0 │ add_7[0][0]          │
│ (GlobalAveragePool… │                   │         │                      │
│ dense (Dense)       │ (None, 128)       │  64,128 │ global_average_pool… │
│ dropout_12          │ (None, 128)       │       0 │ dense[0][0]          │
│ (Dropout)           │                   │         │                      │
│ dense_1 (Dense)     │ (None, 2)         │     258 │ dropout_12[0][0]     │
 Total params: 93,130 (363.79 KB)
 Trainable params: 93,130 (363.79 KB)
 Non-trainable params: 0 (0.00 B)
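
The gap between the two summaries can be checked by hand. The sketch below assumes the standard parameter count for a fully connected layer with bias (inputs × units + units); `dense_params` is a hypothetical helper, and the totals are the ones quoted in this comment.

```python
def dense_params(inputs: int, units: int) -> int:
    """Weights plus biases of a fully connected layer (hypothetical helper)."""
    return inputs * units + units


# Pooling output (None, 1) feeds Dense(128) in the local run.
local_dense = dense_params(1, 128)
# Pooling output (None, 500) feeds Dense(128) in the published summary.
page_dense = dense_params(500, 128)

print(local_dense)  # 256
print(page_dense)   # 64128

# Swapping only that one layer's count turns the local total into the page total.
local_total = 29_258
print(local_total - local_dense + page_dense)  # 93130
```

So the whole difference in total parameters is accounted for by the first dense layer seeing 500 inputs instead of 1.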

mw66 commented 2 days ago

Hi @fchollet,

This line needs to be changed to channels_first to get the good result on:

Can you make this change? Could you also add an explanation of why channels_first is needed instead of channels_last?

The input training data's shape is (3601, 500, 1), i.e. channels_last for sure, so why do we need to set channels_first to get a good training result?
