entropy-lab / entropy

BSD 3-Clause "New" or "Revised" License

Making InProcessParamStore multi-processing-safe using lock file #294

Closed urig closed 2 years ago

urig commented 2 years ago

This PR resolves https://github.com/entropy-lab/entropy/issues/223 by making it safe for multiple processes on multiple machines to use InProcessParamStore against a single Entropy ParamStore DB file (params.json). "Safe" here means that no writes are lost due to concurrent access to the DB file, and that reads in any process always reflect the most recent writes made by any other process.

The solution implemented here is based on a "lock file" that a process must acquire exclusively before it can read from, or write to, the ParamStore DB file. This is done using the filelock library.
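For readers unfamiliar with the pattern, here is a minimal sketch of a lock-file-guarded read-modify-write using filelock. It is not the code in this PR: the helper names `update_param` and `read_params`, the `.lock` suffix, and the timeout value are assumptions made for illustration only.

```python
import json
import os

from filelock import FileLock


def _load(db_path):
    """Read the JSON file, treating a missing file as an empty store."""
    if not os.path.exists(db_path):
        return {}
    with open(db_path, "r", encoding="utf-8") as f:
        return json.load(f)


def update_param(db_path, key, value, timeout=10.0):
    """Hypothetical helper: set one key while holding the lock file."""
    # Assumption: the lock file sits next to the DB file with a ".lock" suffix.
    lock = FileLock(db_path + ".lock", timeout=timeout)
    with lock:  # blocks until acquired; raises filelock.Timeout after `timeout` seconds
        data = _load(db_path)  # re-read inside the lock to see other writers' changes
        data[key] = value
        tmp_path = db_path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as f:
            json.dump(data, f)
        os.replace(tmp_path, db_path)  # swap the new file in atomically
    return data


def read_params(db_path, timeout=10.0):
    """Hypothetical helper: read the whole store while holding the lock file."""
    with FileLock(db_path + ".lock", timeout=timeout):
        return _load(db_path)
```

The important detail is that the file is re-read after the lock is acquired, so a writer never overwrites another process's update with stale in-memory state.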

The PR also includes the following changes:

  1. Disables TinyDB's query caching feature, which is not safe in a multi-process scenario.
  2. The InProcessParamStore constructor now checks out the latest commit (if there is one) automatically. As a result, users can no longer commit() an empty store.
  3. The checkout() method, when called without arguments, checks out the latest commit (if there is one).
  4. Fixes a bug where __base_commit_id was not set when checkout() was called with commit_num or move_by arguments.
  5. Fixes a bug where list_values() would return an extra DataFrame record for the current value in the store even when the store was not dirty.

github-actions[bot] commented 2 years ago

Unit Test Results

249 tests: 245 passed, 4 skipped, 0 failed (1 suite, 1 file, 53s)

Results for commit 0f88f035.

This comment has been updated with latest results.