Open michabuehlmann opened 6 months ago
Thanks for your feedback. I've just tried running this code on Colab, and it ran smoothly:
I'm not sure why it's failing for you. Could you please check which versions of Pandas, TensorFlow, and NumPy you're using? Ideally they should match those on Colab, which are currently Pandas 1.5.3, TensorFlow 2.15.0, NumPy 1.23.5. Please let me know if this helps.
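A quick way to collect those version numbers without importing the libraries themselves is sketched below. The helper name `installed_versions` is made up for illustration, and the distribution names in the list are assumptions (on Apple Silicon, TensorFlow may be installed as `tensorflow-macos` rather than `tensorflow`).

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return {distribution name: version string, or None if not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Distribution names below are assumptions; adjust them to your environment.
candidates = ["pandas", "numpy", "tensorflow", "tensorflow-macos"]
for name, ver in installed_versions(candidates).items():
    print(f"{name}: {ver or 'not installed'}")
```

Comparing this output with the Colab versions listed above should show quickly whether a version mismatch is a plausible cause.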
Dear Aurélien Geron,
My versions are: Pandas 2.1.4, TensorFlow MacOS 2.15.0, NumPy 1.23.3. I use a MacBook Pro with an M1 Max processor.
I was trying out Chapter 15 just now and encountered the same issue. After a bit of bug hunting, I found that the culprit is pandas at the current latest version, 2.1.4. If you print the first argument passed to the timeseries_dataset_from_array function, i.e. print(mulvar_train.to_numpy()),
with pandas 2.1.4 you'll get this:
[[0.303321 0.319835 True False False]
[0.448859 0.365509 False True False]
[0.34054 0.287661 False False True]
...
[0.394088 0.307105 False True False]
[0.31455 0.26531 False False True]
[0.463165 0.386058 False True False]]
However, if you print it using pandas 1.5.3 (currently used in Colab), you'll get:
[[0.303321 0.319835 1. 0. 0. ]
[0.448859 0.365509 0. 1. 0. ]
[0.34054 0.287661 0. 0. 1. ]
...
[0.394088 0.307105 0. 1. 0. ]
[0.31455 0.26531 0. 0. 1. ]
[0.463165 0.386058 0. 1. 0. ]]
So I guess that the timeseries function won't work with Boolean values.
I think the origin of the Boolean values is the pandas API for one-hot encoding, pd.get_dummies(df_mulvar).
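The difference can be reproduced on a toy frame, independent of the installed pandas version, by passing the dtype explicitly. This is only a sketch: the frame below is a made-up stand-in for df_mulvar, and its column names and values are illustrative. It suggests why the error message mentions an "object type": float columns mixed with Boolean columns have no common numeric dtype, so `to_numpy()` falls back to `object`.

```python
import numpy as np
import pandas as pd

# Toy stand-in for df_mulvar; column names and values are illustrative only.
df = pd.DataFrame({
    "bus": [0.303321, 0.448859, 0.340540],
    "rail": [0.319835, 0.365509, 0.287661],
    "day_type": ["A", "U", "W"],
})

# dtype=bool spells out the pandas 2 default for the one-hot columns.
as_bool = pd.get_dummies(df, dtype=bool)
# dtype=np.uint8 spells out the pandas 1.x default.
as_uint8 = pd.get_dummies(df, dtype=np.uint8)

print(as_bool.to_numpy().dtype)   # object
print(as_uint8.to_numpy().dtype)  # float64
```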
Use pip install pandas==1.5.3 and it will run with no problem.
As @mario-ct explained, this is due to a change of behavior in the pd.get_dummies() function with Pandas 2. The default data type for the newly created columns (for the one-hot encoding) is Boolean, which the tf.keras.utils.timeseries_dataset_from_array() function does not like.
You can fix this by specifying np.float32 as the data type of the new columns when creating the one-hot encoding for the df_mulvar dataframe:
df_mulvar = pd.get_dummies(df_mulvar, dtype = np.float32)
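As a rough check that the fix yields a purely numeric array, the sketch below applies it to a toy frame (a made-up stand-in for df_mulvar with illustrative column names). It also shows an alternative not mentioned in this thread: casting a frame that was already encoded with Boolean columns via `astype(np.float32)`.

```python
import numpy as np
import pandas as pd

# Toy stand-in for df_mulvar; column names and values are illustrative only.
toy = pd.DataFrame({
    "bus": [0.394088, 0.314550, 0.463165],
    "rail": [0.307105, 0.265310, 0.386058],
    "day_type": ["U", "W", "U"],
})

# Option 1 (the fix above): request float one-hot columns at encoding time.
encoded = pd.get_dummies(toy, dtype=np.float32)

# Option 2 (an assumption, for frames that are already encoded):
# cast every column to float32 afterwards.
already_encoded = pd.get_dummies(toy, dtype=bool)
recast = already_encoded.astype(np.float32)

# Either way, the resulting array should have a floating-point dtype
# rather than object.
for frame in (encoded, recast):
    data = frame.to_numpy()
    assert np.issubdtype(data.dtype, np.floating), data.dtype
    print(data.dtype)
```

A check like the one in the loop can be placed right before the call to timeseries_dataset_from_array() to catch an `object` array early.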
Thanks a lot for your solution, tlac980, especially because I don't have to install an old version of the pandas library. I tested it and it also worked on my machine.
To Reproduce My error occurs in Chapter 15, in the section "Forecasting Multivariate Time Series" (page 559 in the book). Cell 43 contains the following code:
Here I get a ValueError: Failed to convert a NumPy array to a Tensor (Unsupported object type float). See the stack trace:
Expected behavior It should serve as the basis for the code in cell 45:
Screenshots There is no screenshot.
Versions (please complete the following information):
Additional context The following sections build on each other, so I also get errors in those blocks.