aiidateam / disk-objectstore

An implementation of an efficient "object store" (actually, a key-value store) writing files on disk and not requiring a running server
https://disk-objectstore.readthedocs.io
MIT License

General, non-essential improvements #101

Closed chrisjsewell closed 1 year ago

chrisjsewell commented 3 years ago

This is a list of desired, but non-essential improvements noted whilst reviewing https://github.com/aiidateam/disk-objectstore/pull/96

tests/test_benchmark.py::test_pack_write
ncalls  tottime percall cumtime percall filename:lineno(function)
1       0.0000  0.0000  0.4833  0.4833  disk-objectstore/disk_objectstore/container.py:1556(add_objects_to_pack)
1       0.0484  0.0484  0.4670  0.4670  disk-objectstore/disk_objectstore/container.py:1297(add_streamed_objects_to_pack)
10001   0.0180  0.0000  0.2115  0.0000  disk-objectstore/disk_objectstore/container.py:242(_get_pack_id_to_write_to)
20003   0.1005  0.0000  0.1005  0.0000  ~:0(<built-in method posix.stat>)
10002   0.0145  0.0000  0.0799  0.0000  disk-objectstore/disk_objectstore/container.py:226(_get_pack_path_from_pack_id)
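
The profile above shows that `_get_pack_id_to_write_to` spends most of its time in `posix.stat`, called once per object to check the current pack's size. One possible improvement (a sketch only; the class name and approach are hypothetical, not part of the library) is to stat the pack file once and then track its size in memory as bytes are appended:

```python
import os

class PackSizeTracker:
    """Hypothetical sketch: cache the current pack's size in memory
    instead of calling os.stat() once per written object."""

    def __init__(self, pack_path: str):
        self._pack_path = pack_path
        # One stat at startup instead of one per object.
        self._size = (
            os.stat(pack_path).st_size if os.path.exists(pack_path) else 0
        )

    @property
    def size(self) -> int:
        return self._size

    def record_write(self, num_bytes: int) -> None:
        # Keep the cached size in sync with what was appended,
        # so deciding when to roll over to a new pack needs no stat.
        self._size += num_bytes
```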

Documentation notes:

Current experience (with AiiDA) shows that it's actually not so good to use two levels of nesting.

out of interest, why?

the move operation should be hopefully a fast atomic operation on most filesystems

are there any you know of where this is not the case?

add tox.ini:

[tox]
envlist = py37

[testenv]
usedevelop=True

[testenv:py{35,36,37,38}]
description = run pytest
extras = dev
commands = pytest {posargs:--cov=disk_objectstore}

[testenv:py{36,37,38}-pre-commit]
description = run pre-commit
extras = dev
commands = pre-commit {posargs:run}

[testenv:py{35,36,37,38}-ipython]
description = start an ipython console
deps =
    ipython
commands = ipython

Profiling snapshot (for later discussion):

$ tox -e py37 -- --benchmark-only --benchmark-cprofile=cumtime

tests/test_benchmark.py::test_has_objects
ncalls  tottime percall cumtime percall filename:lineno(function)
1       0.0048  0.0048  0.4209  0.4209  disk-objectstore/disk_objectstore/container.py:778(has_objects)
10001   0.0322  0.0000  0.4123  0.0000  disk-objectstore/disk_objectstore/container.py:451(_get_objects_stream_meta_generator)
6       0.0000  0.0000  0.0896  0.0149  disk-objectstore/.tox/py37/lib/python3.7/site-packages/sqlalchemy/orm/query.py:3503(__iter__)
6       0.0000  0.0000  0.0886  0.0148  disk-objectstore/.tox/py37/lib/python3.7/site-packages/sqlalchemy/orm/query.py:3528(_execute_and_instances)
8       0.0000  0.0000  0.0874  0.0109  disk-objectstore/.tox/py37/lib/python3.7/site-packages/sqlalchemy/engine/base.py:943(execute)
7       0.0000  0.0000  0.0872  0.0125  disk-objectstore/.tox/py37/lib/python3.7/site-packages/sqlalchemy/sql/elements.py:296(_execute_on_connection)
7       0.0001  0.0000  0.0872  0.0125  disk-objectstore/.tox/py37/lib/python3.7/site-packages

tests/test_benchmark.py::test_list_all_loose
ncalls  tottime percall cumtime percall filename:lineno(function)
1       0.0000  0.0000  0.0000  0.0000  ~:0(<method 'disable' of '_lsprof.Profiler' objects>)

tests/test_benchmark.py::test_list_all_packed
ncalls  tottime percall cumtime percall filename:lineno(function)
1       0.0000  0.0000  0.0000  0.0000  ~:0(<method 'disable' of '_lsprof.Profiler' objects>)

tests/test_benchmark.py::test_loose_read
ncalls  tottime percall cumtime percall filename:lineno(function)
1       0.0019  0.0019  0.1003  0.1003  disk-objectstore/disk_objectstore/container.py:803(get_objects_content)
1001    0.0046  0.0000  0.0927  0.0001  disk-objectstore/disk_objectstore/container.py:451(_get_objects_stream_meta_generator)
1000    0.0321  0.0000  0.0321  0.0000  ~:0(<built-in method io.open>)

tests/test_benchmark.py::test_pack_read
ncalls  tottime percall cumtime percall filename:lineno(function)
1       0.0089  0.0089  0.1700  0.1700  disk-objectstore/disk_objectstore/container.py:803(get_objects_content)
10001   0.0300  0.0000  0.1407  0.0000  disk-objectstore/disk_objectstore/container.py:451(_get_objects_stream_meta_generator)
10001   0.0146  0.0000  0.0677  0.0000  disk-objectstore/disk_objectstore/utils.py:922(detect_where_sorted)
20004   0.0051  0.0000  0.0498  0.0000  ~:0(<built-in method builtins.next>)
10001   0.0030  0.0000  0.0447  0.0000  disk-objectstore/.tox/py37/lib/python3.7/site-packages/sqlalchemy/engine/result.py:1006(__iter__)
10001   0.0071  0.0000  0.0417  0.0000  disk-objectstore/.tox/py37/lib/python3.7/site-packages/sqlalchemy/engine/result.py:1320(fetchone)
20000   0.0099  0.0000  0.0253  0.0000  disk-objectstore/disk_objectstore/utils.py:385(_update_pos)
10000   0.0093  0.0000  0.0245  0.0000  disk-objectstore/disk_objectstore/utils.py:365(__init__)
10001   0.0028  0.0000  0.0237  0.0000  disk-objectstore/.tox/py37/lib/python3.7/site-packages/sqlalchemy/engine/result.py:1213(_fetchone_impl)
10001   0.0209  0.0000  0.0209  0.0000  ~:0(<method 'fetchone' of 'sqlite3.Cursor' objects>)
10000   0.0059  0.0000  0.0204  0.0000  disk-objectstore/disk_objectstore/utils.py:400(read)
20000   0.0154  0.0000  0.0154  0.0000  ~:0(<method 'tell' of '_io.BufferedReader' objects>)
10000   0.0080  0.0000  0.0109  0.0000  disk-objectstore/.tox/py37/lib/python3.7/site-packages

tests/test_benchmark.py::test_loose_write
ncalls  tottime percall cumtime percall filename:lineno(function)
1       0.0018  0.0018  0.5730  0.5730  disk-objectstore/tests/test_benchmark.py:28(write_loose)
1000    0.0039  0.0000  0.5710  0.0006  disk-objectstore/disk_objectstore/container.py:842(add_object)
1000    0.0051  0.0000  0.5671  0.0006  disk-objectstore/disk_objectstore/container.py:851(add_streamed_object)
1000    0.0150  0.0000  0.4302  0.0004  disk-objectstore/disk_objectstore/utils.py:158(__exit__)
1000    0.0079  0.0000  0.2387  0.0002  disk-objectstore/disk_objectstore/utils.py:852(safe_flush_to_disk)
2000    0.0052  0.0000  0.1639  0.0001  disk-objectstore/disk_objectstore/utils.py:870(<lambda>)
2000    0.1587  0.0001  0.1587  0.0001  ~:0(<built-in method posix.fsync>)
1000    0.0051  0.0000  0.1083  0.0001  disk-objectstore/disk_objectstore/utils.py:141(__enter__)
2000    0.1044  0.0001  0.1044  0.0001  ~:0(<built-in method io.open>)
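
The `test_loose_write` profile above is dominated by `posix.fsync` (two calls per object, via `safe_flush_to_disk`). This is the standard durable-write pattern: fsync the file's contents, then fsync the containing directory so the new directory entry itself survives a crash. A POSIX-only sketch (hypothetical helper name, not the library's actual implementation):

```python
import os

def durable_write(path: str, data: bytes) -> None:
    """Hypothetical sketch of the fsync-file-then-fsync-directory
    pattern; this is why each loose write pays two fsync calls."""
    with open(path, "wb") as handle:
        handle.write(data)
        handle.flush()
        os.fsync(handle.fileno())  # first fsync: the file contents
    # Second fsync: the parent directory, so the directory entry
    # pointing at the new file is durable too (POSIX-only).
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```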

tests/test_benchmark.py::test_pack_write
ncalls  tottime percall cumtime percall filename:lineno(function)
1       0.0000  0.0000  0.4961  0.4961  disk-objectstore/disk_objectstore/container.py:1556(add_objects_to_pack)
1       0.0522  0.0522  0.4945  0.4945  disk-objectstore/disk_objectstore/container.py:1297(add_streamed_objects_to_pack)
10001   0.0193  0.0000  0.2252  0.0000  disk-objectstore/disk_objectstore/container.py:242(_get_pack_id_to_write_to)
20003   0.1072  0.0000  0.1072  0.0000  ~:0(<built-in method posix.stat>)
10002   0.0150  0.0000  0.0848  0.0000  disk-objectstore/disk_objectstore/container.py:226(_get_pack_path_from_pack_id)

Originally posted by @chrisjsewell in https://github.com/aiidateam/disk-objectstore/pull/96#issuecomment-700491481

giovannipizzi commented 1 year ago

It feels like _get_objects_stream_meta_generator could be refactored somewhat to remove the duplication (pytest is right, it is too complex, lol). Perhaps the number of retries for loose_not_found should be configurable.

It is. But the number of retries is not arbitrary: its current value is chosen to guarantee correct behaviour even when a concurrent pack_all_loose and clean_storage are running.
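
The retry logic being discussed could be sketched roughly as follows (the function name, ordering, and signature here are hypothetical, not the library's actual API): if an object is not found loose, it may have just been packed and its loose copy deleted by clean_storage, so the lookup is re-attempted a bounded number of times.

```python
def get_object_with_retries(lookup_loose, lookup_packed, max_retries=2):
    """Hypothetical sketch of the loose-not-found retry pattern.

    lookup_loose / lookup_packed are callables returning the object
    or None. A concurrent pack_all_loose + clean_storage may move an
    object between the two checks, so we retry a bounded number of
    times before concluding it really does not exist.
    """
    for attempt in range(max_retries + 1):
        result = lookup_packed()
        if result is not None:
            return result
        result = lookup_loose()
        if result is not None:
            return result
        # Object may have been packed between the two checks: retry.
    raise FileNotFoundError("object not found, neither loose nor packed")
```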

Current experience (with AiiDA) shows that it's actually not so good to use two levels of nesting.

out of interest, why?

The issue is the number of inodes. Every folder is an inode, and on "typical" filesystems (e.g. ext3, ext4) each costs about 4 kB. Consider two levels of nesting: you have 256 folders per level, so a total of 256 * 256 = 65536 folders.

You do indeed get fewer files per sub-sub-folder on average (e.g. with 1 million objects, you get ~15 files per folder; with a single nesting level, you get ~4000). But in the end, having 4000 files in a folder is not a big problem! There is no need to keep only a handful of files per subfolder. And with disk-objectstore, the idea is that if you have 1,000,000 objects you should probably have already packed them rather than keeping them loose. So there is no need for two levels of nesting. (Reading the comment now, I realise I should not have said it's "not so good"; it's probably just not needed.)
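
To make the arithmetic concrete (illustration only, not part of the library): with 256-way prefix sharding, two levels give 256 * 256 = 65536 leaf folders, which at ~4 kB per folder inode is roughly 256 MB of overhead before storing a single object.

```python
def avg_files_per_folder(num_objects: int, levels: int, fanout: int = 256) -> float:
    """Average loose files per leaf folder, with `levels` of
    `fanout`-way prefix sharding (e.g. two hex characters per level)."""
    return num_objects / (fanout ** levels)

# One million loose objects:
one_level = avg_files_per_folder(1_000_000, levels=1)   # ~3906 files/folder
two_levels = avg_files_per_folder(1_000_000, levels=2)  # ~15 files/folder

# Inode cost of the two-level layout, at ~4 kB per folder:
folder_overhead_mb = 256 * 256 * 4 / 1024  # 256.0 MB
```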

the move operation should be hopefully a fast atomic operation on most filesystems

are there any you know of where this is not the case?

I didn't check, but I'm sure this is not the case for non-standard filesystems: especially network shares, or, even worse, mounts of things that are not really filesystems, such as an S3 bucket mounted as one. Good network filesystems probably give reasonable guarantees here (though I'm not sure they also provide instantaneous consistency between different machines), but I'm quite sure atomicity is not even possible on a mounted S3 bucket, since IIRC there is no concept of "moving" an object in S3: you have to create a new one and delete the old one.
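
The write-then-rename pattern under discussion can be sketched as follows (hypothetical helper; POSIX semantics assumed). On local POSIX filesystems `os.replace()` is an atomic rename, so readers see either the old state or the complete new file; on S3-backed mounts, as noted above, the rename degrades to copy+delete and the guarantee is lost.

```python
import os
import tempfile

def atomic_publish(data: bytes, dest_path: str) -> None:
    """Hypothetical sketch: write to a temp file, then atomically
    rename it into place. Atomic on POSIX local filesystems only."""
    dest_dir = os.path.dirname(os.path.abspath(dest_path))
    # Create the temp file in the SAME directory as the destination,
    # so the rename cannot cross a filesystem boundary (which would
    # silently degrade to a non-atomic copy).
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        os.replace(tmp_path, dest_path)  # the atomic step
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```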

These are corner cases, but we already know of people who would like to use this on top of e.g. S3 (we even had an issue, #17, to discuss this), so in my opinion it's important to document.