xarray-contrib / xeofs

Comprehensive EOF analysis in Python with xarray: A versatile, multidimensional, and scalable tool for advanced climate data analysis
https://xeofs.readthedocs.io/
MIT License
105 stars 20 forks source link

Critical bug: model.save() overwrites entire directory contents #106

Closed nicrie closed 1 year ago

nicrie commented 1 year ago

Describe the bug When specifying a directory instead of a file path to save a model, the function erroneously overwrites all data within the given directory. This behavior is not only unexpected but also extremely risky—I've just inadvertently lost files this way.

Reproducible Minimal Working Example

import xeofs as xe
import xarray as xr

t2m = xr.tutorial.load_dataset("air_temperature")

model = xe.models.EOF()
model.fit(t2m, "time")
model.save("model.zarr")  # <-- that's fine

# WARNING: Executing the line below with a directory path will result in 
# the loss of all data within the specified directory. Use with extreme caution!
# --------------------------------------------------------------------------------------------
# Continue you if you feel that you have nothing to lose...
# model.save("/the/directory/you/might/accidentally/wipe/clean/")  # Hazardous usage!

Expected behavior No data is overwritten.

Desktop (please complete the following information):

I think that requires a quick fix @slevang. I will start working on a patch but please take extra caution when utilizing the save function in the meantime.

slevang commented 1 year ago

Oh wow, sorry about that! For whatever reason, I guess DataTree.to_zarr() has a default of mode="w" which will indeed overwrite the directory contents if it already exists. Dataset.to_zarr() has the much more sensible mode="w-" as default. This is in general a bit of a gotcha of xarray zarr writes (very easy to rm -rf a directory if you use mode="w").

I can fix this in just a few minutes.

nicrie commented 1 year ago

Yes, I just noticed that too. The default value chosen by xarray-datatree seems to me to be quite recklessly chosen. ;)

slevang commented 1 year ago

The default value chosen by xarray-datatree seems to me to be quite recklessly chosen. ;)

Very much so! I will propose this change there as well.

nicrie commented 1 year ago

Very much so! I will propose this change there as well.

I already did, hoping to spare some other poor soul the feeling of digital Armageddon. Thanks for fixing this so quickly!

slevang commented 1 year ago

accidental deletions are no fun, hopefully it wasn't anything too catastrophic :fearful:

nicrie commented 1 year ago

I was in my "playground" folder. Lost a couple of files but nothing too important + there's a backup. Phew!