Chapter 10: Discrepancy between problem statement and Keras implementation in timeseries_dataset_from_array()

Description: While reading Chapter 10, "Deep learning for timeseries," I noticed a possible discrepancy between the problem statement and the actual implementation.

Problem Statement: Section 10.2.1 describes the task as follows: given temperature data and other variables covering 5 days, sampled once per hour, predict the temperature 24 hours ahead.

Concern: At one sample per hour, 5 days contain 120 samples (24 samples per day × 5 days). Each input sequence should therefore cover 5 days of data and contain at most 120 samples.

Keras Implementation: However, calling timeseries_dataset_from_array() with sampling_rate = 6 and sequence_length = 120 keeps one of every 6 raw timesteps and collects 120 of them per sequence. If the raw data is hourly, as I understood it, that is 4 samples per day, so each sequence spans 30 days. This seems to deviate from the problem statement, which calls for 5 days of data, not 30.
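To make the arithmetic concrete, here is a minimal sketch in plain Python (not Keras itself) of how I understand one sequence to be assembled. The helper names and the `minutes_per_raw_step` parameter are my own, introduced only for illustration. The span a sequence covers depends on how often the raw array was recorded, which my analysis takes to be hourly; a 10-minute interval is included purely for comparison and should be checked against the dataset description.

```python
def window_indices(start, sequence_length, sampling_rate):
    """Raw-array indices that make up one sequence starting at `start`.

    Mirrors my understanding of the function: keep every `sampling_rate`-th
    raw timestep until `sequence_length` timesteps have been collected.
    """
    return list(range(start, start + sequence_length * sampling_rate, sampling_rate))


def window_span_days(sequence_length, sampling_rate, minutes_per_raw_step):
    """Nominal time covered by one sequence, in days.

    `minutes_per_raw_step` is an assumption about the raw recording interval.
    """
    raw_steps = sequence_length * sampling_rate
    return raw_steps * minutes_per_raw_step / (60 * 24)


indices = window_indices(start=0, sequence_length=120, sampling_rate=6)

# Assumption used in this issue: the raw array has one row per hour.
hourly_span = window_span_days(120, 6, minutes_per_raw_step=60)

# Comparison case: the raw array has one row every 10 minutes.
ten_minute_span = window_span_days(120, 6, minutes_per_raw_step=10)

print(len(indices), indices[:3], indices[-1])
print(hourly_span, ten_minute_span)
```

With hourly raw data the sequence covers 30 days, while with 10-minute raw data the same parameters cover 5 days, so the answer to this issue hinges on the raw recording interval.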

Proposed Solution: One option is to set sequence_length to 20. With sampling_rate = 6 giving 4 samples per day, each sequence would then cover 5 consecutive days (20 samples ÷ 4 samples per day), matching the problem statement.
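The proposed value can be derived rather than guessed. The sketch below is a hypothetical helper of my own, not part of Keras; it computes the sequence_length needed to cover a given number of days, assuming the raw recording interval divides a day evenly and that sampling_rate divides the number of raw steps per day.

```python
def required_sequence_length(days, sampling_rate, minutes_per_raw_step):
    """Number of kept timesteps needed for one sequence to cover `days` days.

    `minutes_per_raw_step` is an assumption about how often the raw array
    was recorded; it is not a parameter of timeseries_dataset_from_array().
    """
    minutes_per_day = 24 * 60
    if minutes_per_day % minutes_per_raw_step != 0:
        raise ValueError("raw interval must divide a day evenly")
    raw_steps_per_day = minutes_per_day // minutes_per_raw_step
    if raw_steps_per_day % sampling_rate != 0:
        raise ValueError("sampling_rate must divide the raw steps per day")
    kept_per_day = raw_steps_per_day // sampling_rate
    return days * kept_per_day


# Assumption used in this issue: hourly raw data.
hourly_len = required_sequence_length(days=5, sampling_rate=6, minutes_per_raw_step=60)

# Comparison case: raw data recorded every 10 minutes.
ten_minute_len = required_sequence_length(days=5, sampling_rate=6, minutes_per_raw_step=10)

print(hourly_len, ten_minute_len)
```

Under the hourly assumption this yields 20, the value proposed here. If the raw data were recorded every 10 minutes instead, the same calculation yields 120, the value quoted above.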

Request for Clarification: Could you confirm whether this analysis is correct and whether the implementation matches the intended problem statement? If it does not, guidance on how to use timeseries_dataset_from_array() correctly for this problem would be valuable.

Thank you for your attention to this matter.

fchollet/deep-learning-with-python-notebooks, issue #238