Closed TomNicholas closed 4 years ago
How does this change eliminate the possible need for drop_variables? I guess we still need the special handling for _BOUT_PER_PROC_VARIABLES, and if so what happens if some user code happens to add some variable to the output that behaves in a similar way (e.g. a scalar with different values in each BOUT.dmp.*.nc file)? Wouldn't there be an error unless that variable is dropped explicitly?
How does this change eliminate the possible need for drop_variables?
Because drop_variables is an argument to xarray.open_dataset, which the kwargs are passed down to. So it's still available as an option; it just doesn't need to be listed explicitly as an argument to open_boutdataset.
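The forwarding described in this reply can be sketched in plain Python. The three functions below are simplified stand-ins, not the real xarray or xBOUT implementations. Their signatures and return values are assumptions chosen only to show how a keyword that an outer function does not name still reaches the innermost one.

```python
def open_dataset(path, drop_variables=None, **kwargs):
    """Stand-in for xarray.open_dataset: the only level that names drop_variables."""
    return {"path": path, "drop_variables": drop_variables}


def open_mfdataset(paths, combine="nested", **kwargs):
    """Stand-in for xarray.open_mfdataset: consumes `combine`, forwards the rest."""
    return [open_dataset(p, **kwargs) for p in paths]


def open_boutdataset(paths, **kwargs):
    """Stand-in for open_boutdataset: no explicit drop_variables argument."""
    return open_mfdataset(paths, **kwargs)


# drop_variables is not named by the two outer functions, yet it arrives intact.
opened = open_boutdataset(
    ["BOUT.dmp.0.nc", "BOUT.dmp.1.nc"],
    combine="nested",
    drop_variables="problem_variable",
)
```

In this sketch `combine` is consumed at the middle level, while `drop_variables` falls through to every per-file call.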
what happens if some user code happens to add some variable to the output that behaves in a similar way
Because open_dataset runs on each file before preprocess or the combining process within open_mfdataset, passing drop_variables='problem_variable' to open_boutdataset should still work fine.
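A toy model of that ordering, assuming each file is represented by a plain dict and that the combine step rejects scalars that disagree between files. The helpers `open_one` and `combine` are hypothetical and only mimic the sequence of steps. They do not reproduce xarray's actual comparison rules.

```python
def open_one(file_vars, drop_variables=()):
    """Stand-in for opening a single file: variables are dropped at this point."""
    return {k: v for k, v in file_vars.items() if k not in drop_variables}


def combine(datasets):
    """Stand-in for the combine step: a scalar must agree across all files."""
    combined = {}
    for ds in datasets:
        for name, value in ds.items():
            if name in combined and combined[name] != value:
                raise ValueError(f"conflicting values for {name!r}")
            combined[name] = value
    return combined


# Two hypothetical per-processor files; 'problem_variable' differs between them.
files = [
    {"nx": 16, "problem_variable": 0},
    {"nx": 16, "problem_variable": 1},
]

# Without dropping, the combine step sees the conflict and fails.
try:
    combine([open_one(f) for f in files])
    failed = False
except ValueError:
    failed = True

# Dropping at open time removes the variable before combine ever sees it.
result = combine(
    [open_one(f, drop_variables=("problem_variable",)) for f in files]
)
```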
Wouldn't there be an error unless that variable is dropped explicitly?
Yes, but we will still have the option to drop it explicitly.
Ah, I see. Sorry, I'd looked at open_mfdataset and saw it didn't have a drop_variables argument, but didn't realise that open_dataset does and the kwargs will drop through to there.
kwargs are great, but they do make the help() less helpful sometimes.
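A small illustration of that trade-off, using hypothetical stand-in functions rather than the real library code: introspection of the wrapper shows only the catch-all parameter, so the forwarded options have to be looked up on the function that finally receives them.

```python
import inspect


def open_dataset(path, drop_variables=None):
    """Stand-in for the innermost function that actually names the option."""
    return path, drop_variables


def open_boutdataset(datapath, **kwargs):
    """Stand-in wrapper: forwards everything it does not name itself."""
    return open_dataset(datapath, **kwargs)


# Parameter names as help() and inspect would report them for each function.
wrapper_params = list(inspect.signature(open_boutdataset).parameters)
inner_params = list(inspect.signature(open_dataset).parameters)

print(wrapper_params)  # the wrapper only advertises 'datapath' and 'kwargs'
print(inner_params)    # 'drop_variables' is visible only on the inner function
```

One common mitigation is for the wrapper's docstring to say which function receives the extra keyword arguments.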
Generalised open_boutdataset to accept arbitrary kwargs. The kwargs go to xarray.open_mfdataset first, and if they aren't recognised there they go down to xarray.open_dataset.
This allows you to potentially speed up opening many files by passing data_vars='minimal', coords='minimal', compat='override', parallel=True as described here.
This also means the drop_variables argument doesn't need to be explicitly there anymore, as it's covered by the kwargs being passed down to xarray.open_dataset.