HDFGroup / h5pyd

h5py distributed - Python client library for HDF Rest API

Recreating a dataset name doesn't raise error, results in orphaned dataset at `/__db__/{datasets}/` #19

Closed grisaitis closed 7 years ago

grisaitis commented 8 years ago

If I create a dataset with `create_dataset`, write to it, and then call `create_dataset` again with the same name, then (a) no error is raised and (b) the old dataset appears to be moved to `/__db__/{datasets}/`. Behavior (a) differs from h5py, which I think raises a KeyError when a dataset with the name passed to `create_dataset` already exists.

To me (b) seems like a "dataset leak". Is this a feature or a bug?
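Until the library itself rejects the second call, a client-side guard can make the duplicate creation fail loudly. The following is only a minimal sketch: the helper name `create_dataset_strict` is hypothetical (it is not part of h5py or h5pyd), and it assumes the group object supports `name in group` and `group.create_dataset(name, **kwargs)`, as h5py-style groups do.

```python
def create_dataset_strict(group, name, **kwargs):
    """Create a dataset only if `name` is not already linked in `group`.

    Hypothetical helper, not part of h5py or h5pyd. `group` is assumed to
    behave like an h5py-style Group: it supports `name in group` and
    `group.create_dataset(name, **kwargs)`.
    """
    # Refuse to reuse an existing name, so the old dataset is never
    # silently unlinked and left behind as an orphan.
    if name in group:
        raise ValueError(f"dataset {name!r} already exists in this group")
    return group.create_dataset(name, **kwargs)
```

With an open file object `f`, this would be called as `create_dataset_strict(f, "dset", shape=(10,), dtype="f4")`. The check and the create are two separate steps, so this does not protect against another client creating the same name in between.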

I'd expect one of two behaviors:

jreadey commented 8 years ago

@grisaitis, I've just checked in changes to the develop branch of h5serv and h5pyd so that this case now throws an exception (a runtime exception, for h5py compatibility). Please try it out and let me know if that works for you.

It's interesting that you mention the orphaned dataset issue. The same thing comes up when a link is deleted; see https://github.com/HDFGroup/h5serv/issues/12. I'll be taking a look next week at implementing a garbage collector of sorts.

jreadey commented 7 years ago

Cleaning up old issues. Please reopen if this is still a problem.