LenzDu / Kaggle-Competition-Favorita

5th place solution for Kaggle competition Favorita Grocery Sales Forecasting
MIT License

ValueError: cannot reshape array of size 58590896 into shape (365,1) #2

Open tyokota opened 6 years ago

tyokota commented 6 years ago

I'm getting the following error in the seq2seq.py script:

Xval, Yval = create_dataset(df, promo_df, items, stores, timesteps, date(2017, 7, 26),
                            aux_as_tensor=True, reshape_output=2)
Traceback (most recent call last):

  File "<ipython-input-58-9c289d7052b8>", line 2, in <module>
    aux_as_tensor=True, reshape_output=2)

  File "<ipython-input-57-2ef1d01c4d2b>", line 21, in create_dataset
    return create_dataset_part(df, promo_df, cat_features, item_group_mean, store_group_mean, timesteps, first_pred_start, reshape_output, aux_as_tensor, is_train)

  File "<ipython-input-53-550ff7ca6a08>", line 24, in create_dataset_part
    X = X.reshape(-1, timesteps, 1)

ValueError: cannot reshape array of size 58590896 into shape (365,1)

Any thoughts on why the function seems to need one fewer timestep?
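A quick arithmetic check on the size reported in the error supports this reading: 58590896 is not a multiple of 365, but it is exactly 364 × 160964. The snippet below only uses the number from the traceback. That each series holds 364 values is an inference from the divisibility, not something the traceback states.

```python
size = 58590896  # total element count reported in the ValueError

# reshape(-1, timesteps, 1) needs the size to be a multiple of timesteps.
divisible_by_365 = size % 365 == 0
divisible_by_364 = size % 364 == 0
series_count = size // 364  # rows, if each series has 364 values

print(divisible_by_365, divisible_by_364, series_count)
```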

LenzDu commented 6 years ago

Stores were closed on Christmas, so the data for that day are missing. Adding zero-valued records for any store-item combo on every Dec 25th in the original data, or inserting an all-zero column for every Dec 25th in the unstacked DataFrame, may fix this problem. Sorry for forgetting about this issue :)

tyokota commented 6 years ago

Ah, okay. Do you remember how you did this in Python? I had seen some scripts in the kernels that did this, except they added every possible pairing, which blew up the size of the DataFrame.

59ranjbar commented 5 years ago

Could you please explain how you did it in Python? Thanks.
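For reference, here is a minimal sketch of the second suggestion above (an all-zero column for each missing day in the unstacked DataFrame). It is not the author's actual code. It assumes the wide frame has one row per (store, item) pair and one datetime column per day; `fill_missing_dates` and the toy data are hypothetical names used only for illustration. Because it adds columns rather than rows for every pairing, it should avoid blowing up the long-format DataFrame.

```python
import pandas as pd


def fill_missing_dates(wide: pd.DataFrame, fill_value: float = 0.0) -> pd.DataFrame:
    """Return a copy of `wide` whose columns cover every calendar day.

    Assumes `wide` has datetime columns (one per day). Days absent from
    the columns, such as Dec 25th, are added and filled with `fill_value`.
    """
    full_range = pd.date_range(wide.columns.min(), wide.columns.max(), freq="D")
    return wide.reindex(columns=full_range, fill_value=fill_value)


# Toy data (hypothetical): two series, with 2016-12-25 missing from the columns.
all_days = pd.date_range("2016-12-23", "2016-12-27", freq="D")
dates = all_days[all_days != pd.Timestamp("2016-12-25")]
index = pd.MultiIndex.from_tuples(
    [(1, 101), (2, 202)], names=["store_nbr", "item_nbr"]
)
wide = pd.DataFrame(
    [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]], index=index, columns=dates
)

filled = fill_missing_dates(wide)
print(filled.shape)  # (2, 5): the missing day is now a column of zeros
```

With every calendar day present as a column, a window that spans Dec 25th should contain the full `timesteps` values per series, so the `reshape(-1, timesteps, 1)` call should no longer hit the size mismatch.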