Open cmdupuis3 opened 1 year ago
Hey @maxrjones, does this serve any purpose? It's incredibly annoying to get through batch generation only to crash because I forgot to rename the dimensions I'm subsetting.
Partial example:
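import xbatcher as xb
# ds, nlons, nlats and halo_size are defined elsewhere
bgen = xb.BatchGenerator(
    ds,
    {'nlon': nlons, 'nlat': nlats},
    concat_input_dims=True
)
# Interior region, excluding a halo of width halo_size on each side
sub = {'nlon': range(halo_size, nlons - halo_size),
       'nlat': range(halo_size, nlats - halo_size)}
for batch in bgen:
    batch_input = [batch[x][sub] for x in ['SSH', 'SST']]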
This will crash because the names of `batch_input`'s dimensions are now `nlon_input` and `nlat_input`, but if `concat_input_dims=False` the dim names stay the same.
Just learned that xarray rolling adds "_input" (or something similar) also, and it's used to distinguish between the original dimensions (which may still exist) and the new stencil dims.
I'm thinking that this looks superfluous in xbatcher because (at least in my case) the original dimensions are always stacked. Maybe "_input" makes sense if they aren't stacked?
What is your issue?
Title. The problem is that changing dimension names makes it difficult for the user to index into batched arrays in a batch loop. This is particularly annoying because changing the value of `concat_input_dims` will change this behavior, sometimes appending `_input`, sometimes not, which makes debugging and experimentation difficult. I view this as an unwelcome side effect, and I'd prefer the non-batched dimensions keep their original names.
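To illustrate the behavior I'd prefer, here is a rough sketch of a user-side rename that strips the suffix whenever the original name is free. `suffix_renames` is a hypothetical helper and the `_input` suffix is assumed from the behavior above. It skips any dimension whose unsuffixed name still exists, since renaming would collide in that case.

```python
def suffix_renames(dims, suffix="_input"):
    """Build a {suffixed_name: original_name} mapping for `dims`.

    Hypothetical helper: a dimension is only included when stripping the
    suffix yields a non-empty name that is not already a dimension.
    """
    renames = {}
    for dim in dims:
        if isinstance(dim, str) and dim.endswith(suffix):
            original = dim[: -len(suffix)]
            if original and original not in dims:
                renames[dim] = original
    return renames


# Intended use inside the batch loop (not run here); xarray's rename()
# accepts a mapping of old to new names:
#     batch = batch.rename(suffix_renames(batch.dims))
```

With this, the indexing code in the loop can use the original dimension names whether or not the suffix was appended.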