uchicago-cs / deepdish

Flexible HDF5 saving/loading and other data science tools from the University of Chicago
http://deepdish.io
BSD 3-Clause "New" or "Revised" License

`save` function with mode 'a' #14

Closed by gaow 8 years ago

gaow commented 8 years ago

Currently, dd.io.save overwrites the target file if it exists. Would it be a good idea to add a mode argument so that the target file can be updated continually as new data arrives?

gustavla commented 8 years ago

Can you describe how new data arrives? You mean new groups or is it an array that grows?

Would it have semantics similar to d1.update(d2), if d1 and d2 are dictionaries and d1 represents the already saved file and d2 the new?
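To make the question concrete, here is a small sketch of what `d1.update(d2)` does with plain dictionaries. The group names and values are made up for illustration; `d1` stands in for what is already in the file and `d2` for the newly arrived data.

```python
# d1 stands in for the groups already saved in the file, d2 for the new batch.
# All names and values here are illustrative placeholders.
d1 = {"group_a": [1, 2, 3], "group_b": [4, 5]}
d2 = {"group_b": [40, 50], "group_c": [6]}

# Keys only in d2 are added; keys present in both take d2's value.
d1.update(d2)

print(d1)
```

Under these semantics, a group that appears in both the file and the new data would be replaced rather than merged.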

gaow commented 8 years ago

I meant new groups for my application. I work with data that contains many groups, too many to keep in RAM all at once, so my computational routine analyzes one batch of groups at a time and writes the results to file. Ideally, all groups could be written to the same file if there were an 'a' mode. Thanks a lot!
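As a rough sketch of the workflow described here, the following uses plain h5py (not deepdish) to add one batch of groups at a time to the same file. The helper `append_groups`, the file name, and the batch names are all hypothetical, and the file layout is h5py's own rather than whatever deepdish writes. It assumes h5py is installed.

```python
import os
import tempfile

import h5py


def append_groups(path, groups):
    """Write each entry of `groups` as an HDF5 group, keeping existing groups.

    `groups` maps a group name to a dict of {dataset name: array-like}.
    This helper is hypothetical and is not part of deepdish.
    """
    # Mode "a" opens the file read/write and creates it if it does not exist.
    with h5py.File(path, "a") as f:
        for name, datasets in groups.items():
            if name in f:
                # Mimic dict.update: replace a group that already exists.
                del f[name]
            grp = f.create_group(name)
            for key, values in datasets.items():
                grp.create_dataset(key, data=values)


# Each call handles one batch, so only that batch needs to be in memory.
path = os.path.join(tempfile.mkdtemp(), "results.h5")
append_groups(path, {"batch_0": {"x": [1.0, 2.0]}})
append_groups(path, {"batch_1": {"x": [3.0, 4.0]}})

with h5py.File(path, "r") as f:
    saved = sorted(f.keys())
    batch_1_x = f["batch_1/x"][...].tolist()

print(saved, batch_1_x)
```

Replacing a group that already exists is only one possible choice; raising an error on a name collision would be an equally reasonable design for an 'a' mode.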

gustavla commented 8 years ago

It's a good idea and I will keep it in mind. I would also welcome a PR with a more concrete suggestion if you or anyone else wants to contribute.