Open JackKelly opened 2 years ago
And/or, check against `start_time` and `end_time` (after those have been implemented by issue #425), as suggested by Peter in comment https://github.com/openclimatefix/nowcasting_dataset/issues/439#issuecomment-972728948
The issue is that, if the files specifying the spatial and temporal locations of each example are computed with fewer DataSources than are used to create batches, then we're likely to attempt to sample from locations that don't exist in at least one DataSource.
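
A minimal sketch of the kind of check this implies, assuming we can recover the names of the DataSources that were used to compute the locations files. The function names and the idea of identifying DataSources by plain string names are hypothetical and are not part of `nowcasting_dataset`:

```python
def find_missing_data_sources(location_sources, batch_sources):
    """Return the DataSources used for batches but not for computing locations.

    Both arguments are iterables of DataSource names (hypothetical: plain strings).
    """
    return sorted(set(batch_sources) - set(location_sources))


def check_locations_cover_batch_sources(location_sources, batch_sources):
    """Raise if batches would use a DataSource that the locations never saw."""
    missing = find_missing_data_sources(location_sources, batch_sources)
    if missing:
        raise ValueError(
            "Locations were computed without these DataSources: "
            + ", ".join(missing)
            + ". Sampled locations may not exist in them."
        )
```

For example, locations computed from `["gsp", "pv"]` but batches built from `["gsp", "pv", "satellite"]` would raise, because the sampled locations were never validated against `satellite`.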
Maybe the mechanism should be:
By default, if `prepare_ml_data.py` is called with at least one `--data_source` command line argument, and if the files specifying locations don't exist, then throw an error. But allow users to override this behaviour with a `--force_creation_of_locations` flag, or something like that?
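
The proposed mechanism could be sketched roughly as below. This is only an illustration using `argparse`. The `--force_creation_of_locations` flag is the tentative name suggested above, and `parse_args`, `should_create_locations` and the `locations_exist` argument are hypothetical stand-ins for however `prepare_ml_data.py` actually parses its arguments and detects the locations files:

```python
import argparse


def parse_args(argv=None):
    parser = argparse.ArgumentParser()
    # May be given several times; None if never given.
    parser.add_argument("--data_source", action="append", default=None)
    # Tentative flag name from this issue.
    parser.add_argument("--force_creation_of_locations", action="store_true")
    return parser.parse_args(argv)


def should_create_locations(args, locations_exist):
    """Decide whether to create the locations files.

    Returns False if they already exist, True if it is safe to create them,
    and raises if only a subset of DataSources was requested without the
    force flag.
    """
    if locations_exist:
        return False
    if args.data_source and not args.force_creation_of_locations:
        raise RuntimeError(
            "Locations files don't exist and --data_source was given. "
            "Re-run without --data_source, or pass "
            "--force_creation_of_locations to create them anyway."
        )
    return True
```

With this sketch, `should_create_locations(parse_args(["--data_source", "gsp"]), locations_exist=False)` raises, while adding `--force_creation_of_locations` to the arguments makes it return `True`.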