aiidateam / disk-objectstore

An implementation of an efficient "object store" (actually, a key-value store) writing files on disk and not requiring a running server
https://disk-objectstore.readthedocs.io
MIT License
15 stars 8 forks source link

🐛 Fix performance regression #102

Closed chrisjsewell closed 3 years ago

chrisjsewell commented 3 years ago

Container.is_initialised is a costly operation, loading the config JSON every time. In https://github.com/aiidateam/disk-objectstore/commit/1d7c389c353185c1923c9addb1b107c283d5f561, the config is now called on every call to loose_prefix_len, leading to a massive performance degradation.

This PR makes sure the is_initialised test is called only if the config has not already been loaded into memory.

Before:

tests/test_benchmark.py::test_loose_read
ncalls  tottime percall cumtime percall filename:lineno(function)
1       0.0022  0.0022  0.3835  0.3835  disk-objectstore/disk_objectstore/container.py:824(get_objects_content)
1001    0.0055  0.0000  0.3758  0.0004  disk-objectstore/disk_objectstore/container.py:472(_get_objects_stream_meta_generator)
1000    0.0029  0.0000  0.3058  0.0003  disk-objectstore/disk_objectstore/container.py:213(_get_loose_path_from_hashkey)
3000    0.0019  0.0000  0.2959  0.0001  disk-objectstore/disk_objectstore/container.py:380(loose_prefix_len)
3000    0.0068  0.0000  0.2940  0.0001  disk-objectstore/disk_objectstore/container.py:371(_get_repository_config)
3000    0.0315  0.0000  0.2873  0.0001  disk-objectstore/disk_objectstore/container.py:270(is_initialised)

After:

tests/test_benchmark.py::test_loose_read
ncalls  tottime percall cumtime percall filename:lineno(function)
1       0.0014  0.0014  0.0793  0.0793  disk-objectstore/disk_objectstore/container.py:824(get_objects_content)
1001    0.0034  0.0000  0.0733  0.0001  disk-objectstore/disk_objectstore/container.py:472(_get_objects_stream_meta_generator)
1000    0.0240  0.0000  0.0240  0.0000  ~:0(<built-in method io.open>)
...
3000    0.0012  0.0000  0.0017  0.0000  disk-objectstore/disk_objectstore/container.py:380(loose_prefix_len)
codecov[bot] commented 3 years ago

Codecov Report

Merging #102 into develop will not change coverage. The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff            @@
##           develop      #102   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files            7         7           
  Lines         1502      1502           
=========================================
  Hits          1502      1502           
Impacted Files Coverage Δ
disk_objectstore/container.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 1d7c389...e62aabd. Read the comment docs.

chrisjsewell commented 3 years ago

I've also fixed the issue with codecov initially reporting lower test converage, caused by the fact that it was being uploaded (& overwritten) for every matrix run (and just by luck the windows tests are loaded last, which have the full test coverage) See also https://github.com/codecov/codecov-action/issues/40

giovannipizzi commented 3 years ago

Thanks! The fix is right, however the fix on the coverage is wrong. Indeed, as you see now coverage went down because there is a very small fraction of tests that are only run on linux or on mac.

Are you sure that it does not merge? In my experience, it does! It's just that it will prepare a comment as soon as the first build finishes, so at the beginning it seems it didn't use the windows ones, but if you wait at the end it will update the comment and merge the results.

You can see this in the tests for commit 1d7c389 that all pass with 100% coverage.

So I suggest you remove the fix for the coverage and then we merge the actual bug fix

chrisjsewell commented 3 years ago

Are you sure that it does not merge?

Ah good, I was lead to believe otherwise lol