Closed prakhar6sharma closed 1 year ago
To Do:
I'm 50-50 on this, so you could convince me either way, but why does `name` need to be a parameter for `ClimateDatasetArgs`? For example, if I use `ERA5Args`, it just makes sense that the name is `"era5"`. A compromise solution could be that the default `name` is `"era5"`, but we still allow the user to specify a different name.
Why `name` needs to be a parameter for `ClimateDatasetArgs`: Suppose we are doing downscaling. For the typical setup that we support, both the `high_res` and `low_res` data are from `ERA5`. Keeping the name as `"era5"` for both of them would therefore lead to ambiguity. Right now, there is a default name for every `ClimateDatasetArgs` class; for `ERA5Args` that happens to be `"era5"`.
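The compromise discussed above (a per-class default name that the user can still override) could look roughly like the following. This is only a minimal sketch: the dataclasses here are simplified stand-ins that carry nothing but `name`, the base default `"climate_dataset"` and the names `"lr"` and `"hr"` are made up for illustration, and the real argument classes take more parameters.

```python
from dataclasses import dataclass


@dataclass
class ClimateDatasetArgs:
    # Simplified stand-in; the base default name is an assumption.
    name: str = "climate_dataset"


@dataclass
class ERA5Args(ClimateDatasetArgs):
    # Per-class default, which the caller may still override.
    name: str = "era5"


# Downscaling reads both resolutions from ERA5, so each one gets
# its own (hypothetical) name to keep variable keys unambiguous.
low_res = ERA5Args(name="lr")
high_res = ERA5Args(name="hr")

# Outside of downscaling, the default is usually enough.
default = ERA5Args()
```

With distinct names, a key such as `"lr:2m_temperature"` can no longer be confused with `"hr:2m_temperature"`.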
Issues fixed by this PR:
- `StackedClimateDataset` is hard-coded to work for `Downscaling` only, because it has a different return signature than `ERA5` and `ClimateDataset`. Specifically, `Downscaling` assumes that the `raw_data` it receives is a list with 2 items, one for `input` and the other for `output`. This limits us to a single dataset for input and a single dataset for output.
- Imports in the `data/` folder were all absolute imports.
- `Forecasting`'s `create_copy()` allows illegal values for some of its attributes.

Solution implemented:
- `ClimateDataset` has a new attribute called `name`. Thus, all variables are indexed by `f"{dataset_name}:{variable}"`.
- `StackedClimateDataset` now allows recursively stacking arbitrary `ClimateDataset`s while keeping the same return signature. Specifically, if it has two child datasets named `"child1"` and `"child2"` and its own name is `"parent"`, the children are renamed to `"parent:child1"` and `"parent:child2"`.
- `in_vars`, `out_vars`, and `constants` of the `Task` should have the dataset name followed by a colon followed by the variable name. Example: `["2m_temperature", "geopotential"]` changes to `["era5:2m_temperature", "era5:geopotential"]`.
- Changed imports in the `data/` folder to follow relative paths.
- Fixed `create_copy()` for `Forecasting`.
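To make the naming scheme concrete, here is a rough sketch of how the prefixes compose under recursive stacking. The classes `NamedDataset` and `StackedDataset` and the helper `qualify` are hypothetical stand-ins written for this illustration, not the actual `ClimateDataset` or `StackedClimateDataset` code; they track names only and load no data.

```python
from typing import List


class NamedDataset:
    """Hypothetical leaf dataset: a name plus a list of variable names."""

    def __init__(self, name: str, variables: List[str]):
        self.name = name
        self.variables = list(variables)

    def add_prefix(self, prefix: str) -> None:
        self.name = f"{prefix}:{self.name}"

    def keys(self) -> List[str]:
        # Variables are indexed by f"{dataset_name}:{variable}".
        return [f"{self.name}:{v}" for v in self.variables]


class StackedDataset:
    """Hypothetical stacked dataset that renames its children on creation."""

    def __init__(self, name: str, children: list):
        self.name = name
        self.children = list(children)
        for child in self.children:
            child.add_prefix(name)

    def add_prefix(self, prefix: str) -> None:
        # Propagate the prefix so nested stacks stay consistent.
        self.name = f"{prefix}:{self.name}"
        for child in self.children:
            child.add_prefix(prefix)

    def keys(self) -> List[str]:
        # Same return shape as a leaf: a flat list of qualified keys.
        return [k for child in self.children for k in child.keys()]


def qualify(variables: List[str], dataset_name: str) -> List[str]:
    """Hypothetical helper: prefix plain variable names for a Task."""
    return [f"{dataset_name}:{v}" for v in variables]


child1 = NamedDataset("child1", ["2m_temperature"])
child2 = NamedDataset("child2", ["geopotential"])
parent = StackedDataset("parent", [child1, child2])
in_vars = qualify(["2m_temperature", "geopotential"], "era5")
```

Here `child1.name` becomes `"parent:child1"`, `parent.keys()` yields `["parent:child1:2m_temperature", "parent:child2:geopotential"]`, and `in_vars` is `["era5:2m_temperature", "era5:geopotential"]`. Because a stack exposes the same `keys()` shape as a leaf, stacks can themselves be stacked.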