entropy-lab / entropy

BSD 3-Clause "New" or "Revised" License

To reduce DB file size: SqlAlchemyDB now only writes results/metadata to HDF5 file when enable_hdf5_storage==True #286

Status: Closed (urig closed 2 years ago)

urig commented 2 years ago

This PR changes how SqlAlchemyDB saves results and metadata when enable_hdf5_storage==True, which is the default.

Previously results/metadata were saved both to the SQLite DB and to HDF5 files. Now they are only saved to HDF5 files.

The motivation is to remove redundancy and reduce the size of the SQLite DB.
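The change described above can be pictured with a minimal sketch. This is not the entropy implementation: `SketchDB`, `save_result`, and `sql_row_count` are hypothetical names, the `results` table layout is assumed, and a plain dict stands in for the HDF5 file so the example runs with the standard library only.

```python
import sqlite3


class SketchDB:
    """Hypothetical stand-in for SqlAlchemyDB, for illustration only."""

    def __init__(self, enable_hdf5_storage=True):
        self.enable_hdf5_storage = enable_hdf5_storage
        # In-memory SQLite database with an assumed, simplified schema.
        self.sql = sqlite3.connect(":memory:")
        self.sql.execute(
            "CREATE TABLE results (experiment_id INTEGER, label TEXT, data TEXT)"
        )
        # A dict standing in for the HDF5 file (assumption, not real HDF5).
        self.hdf5 = {}

    def save_result(self, experiment_id, label, data):
        if self.enable_hdf5_storage:
            # Behavior after this PR: write to HDF5 only, skip the SQL table.
            self.hdf5[(experiment_id, label)] = data
        else:
            # HDF5 storage disabled: the SQL table is the only store.
            self.sql.execute(
                "INSERT INTO results VALUES (?, ?, ?)",
                (experiment_id, label, data),
            )
            self.sql.commit()

    def sql_row_count(self):
        return self.sql.execute("SELECT COUNT(*) FROM results").fetchone()[0]
```

With the default setting, a saved result lands only in the HDF5 stand-in and the SQL table stays empty; with `enable_hdf5_storage=False`, the row goes to SQL instead.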

github-actions[bot] commented 2 years ago

Unit Test Results

241 tests: 237 passed :heavy_check_mark:, 4 skipped :zzz:, 0 failed :x: (1 suite, 1 file, 58s :stopwatch:)

Results for commit 3f4ad6d9.

:recycle: This comment has been updated with latest results.

galwiner commented 2 years ago

@urig does a migration remove data from the SQL db of a database created prior to this change?

urig commented 2 years ago

> @urig does a migration remove data from the SQL db of a database created prior to this change?

No. Existing records remain as they are, because they serve as backups in case there were errors when saving to HDF5.

That said, would you like me to add a migration that deletes them?

urig commented 2 years ago

@galwiner I've added a migration that removes Results and ExperimentData records where saved_in_hdf5 is true.
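For readers who want a picture of what such a migration does, here is a rough sketch using the standard-library `sqlite3` module. It is not the migration from this PR: the table names `Results` and `ExperimentData`, the column layout, and the helper names `delete_rows_saved_in_hdf5` and `demo` are all assumptions made for illustration.

```python
import sqlite3

# Assumed table names, taken from the record types named in the comment above.
TABLES = ("Results", "ExperimentData")


def delete_rows_saved_in_hdf5(conn):
    """Delete rows flagged as saved in HDF5; return how many were deleted."""
    deleted = 0
    for table in TABLES:
        cur = conn.execute(f"DELETE FROM {table} WHERE saved_in_hdf5 = 1")
        deleted += cur.rowcount
    conn.commit()
    return deleted


def demo():
    """Build a toy database, run the cleanup, and report the counts."""
    conn = sqlite3.connect(":memory:")
    for table in TABLES:
        conn.execute(
            f"CREATE TABLE {table} (id INTEGER PRIMARY KEY, saved_in_hdf5 BOOLEAN)"
        )
        # Two rows already saved in HDF5, one row that is not.
        conn.executemany(
            f"INSERT INTO {table} (saved_in_hdf5) VALUES (?)", [(1,), (0,), (1,)]
        )
    deleted = delete_rows_saved_in_hdf5(conn)
    remaining = conn.execute("SELECT COUNT(*) FROM Results").fetchone()[0]
    return deleted, remaining
```

Rows whose `saved_in_hdf5` flag is false are left untouched, so data that never reached HDF5 is not lost. Note that SQLite does not shrink the file on disk after a `DELETE`; running `VACUUM` afterwards is the usual way to reclaim the space.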

liorella-qm commented 2 years ago

Hi, just to make sure I'm in sync: previously the results were stored once in the entropy DB and once in the HDF5 files independently, and this is why the files are so big?

urig commented 2 years ago

Hi @liorella-qm,

Yes. Part of the reason the database is so big is that experiment results and metadata were stored both in it and in HDF5 files. This was a safety measure against possible data loss, taken back when the HDF5 storage feature was new. This duplication has been removed, so results and metadata are now written only to HDF5 files by default.