jaharvey8 opened this issue 8 months ago
ArviZ uses zarr, so I don't think it would. Just curious though: why is netcdf causing problems on S3?
So I'm definitely not an S3 expert, but if I'm running my model in a SageMaker notebook and attempting to save the trace to netcdf on S3, I tend to get a lot of cryptic errors. It seems possibly related to this: https://github.com/pydata/xarray/issues/2995#issuecomment-497026828
What I started doing was saving the trace to the local SageMaker environment and then moving it to S3. But I wasn't able to load the trace directly from S3, so I ended up having to copy it back and then load it, and that all seemed like too much trouble.
But if I do `trace.to_zarr()` it works just fine.
Ah, got it. If it's just the save and load from S3, have you tried using the zarr methods built into ArviZ? Do those work for you?
https://python.arviz.org/en/stable/api/generated/arviz.InferenceData.from_zarr.html
I'm looking at this more closely, and since it's deferring to idata you might be able to do this easily, I hope!
def save(self, fname: str, format: str = "netcdf"):
    if format == "zarr":
        ...
https://github.com/pymc-devs/pymc-experimental/blob/main/pymc_experimental/model_builder.py#L383
I've had a lot of issues saving netcdf files on Amazon S3. Any opposition to adding `.to_zarr` to the model builder save function? If not, I can go ahead and create a pull request. I would probably add an input letting the user indicate whether they want a netcdf file or zarr, but keep netcdf as the default.