I experimented a bit more with this based on @mjwillson's suggestion.
Amazingly, it seems that using file-like objects in Xarray does actually work as used here, though making a local copy might still perform better.
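For reference, here is a minimal sketch of the file-like pattern. It assumes an in-memory `io.BytesIO` buffer and the SciPy netCDF3 backend as stand-ins for whatever file object and format are actually used here; neither comes from the original code, and `roundtrip_via_file_like` is a hypothetical helper name.

```python
import io

import numpy as np
import xarray as xr


def roundtrip_via_file_like(ds: xr.Dataset) -> xr.Dataset:
    """Serialize a dataset to netCDF3 in memory, then reopen it from a file-like object."""
    # With no path argument, to_netcdf returns the serialized file contents.
    raw = ds.to_netcdf(engine="scipy")
    # Assumption: BytesIO stands in for a remote or cloud-storage file handle.
    file_obj = io.BytesIO(bytes(raw))
    # open_dataset accepts file-like objects for the scipy backend.
    with xr.open_dataset(file_obj, engine="scipy") as opened:
        # Load eagerly so the data outlives the underlying file object.
        return opened.load()


source = xr.Dataset({"temperature": ("x", np.arange(4.0))})
loaded = roundtrip_via_file_like(source)
```

Note that this only covers opening and eagerly loading the data in a single process; it does not address passing the opened dataset to other workers.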
What doesn't work yet (but hopefully could, with small upstream changes to Xarray) is passing Xarray datasets opened with these file-like objects into a Beam pipeline. That could let us do the actual data loading from netCDF in separate workers, which could be quite a win!
_Originally posted by @shoyer in https://github.com/google/xarray-beam/pull/31#discussion_r696246752_