python / cpython

The Python programming language
https://www.python.org

Shelve consistency issues #69628

Open 8d06c1c0-4d22-4786-a6a1-d79c9e97c71a opened 9 years ago

8d06c1c0-4d22-4786-a6a1-d79c9e97c71a commented 9 years ago
BPO 25442
Nosy @bitdancer

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:
```python
assignee = None
closed_at = None
created_at =
labels = ['type-feature', 'docs']
title = 'Shelve consistency issues'
updated_at =
user = 'https://bugs.python.org/YanyanJiang'
```

bugs.python.org fields:
```python
activity =
actor = 'r.david.murray'
assignee = 'docs@python'
closed = False
closed_date = None
closer = None
components = ['Documentation']
creation =
creator = 'Yanyan Jiang'
dependencies = []
files = []
hgrepos = []
issue_num = 25442
keywords = []
message_count = 4.0
messages = ['253188', '253195', '253200', '253215']
nosy_count = 3.0
nosy_names = ['r.david.murray', 'docs@python', 'Yanyan Jiang']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue25442'
versions = ['Python 2.7', 'Python 3.2', 'Python 3.3', 'Python 3.4', 'Python 3.5', 'Python 3.6']
```

8d06c1c0-4d22-4786-a6a1-d79c9e97c71a commented 9 years ago

I am currently working on file system reliability issues. I have a disk driver that can simulate crashed disk states after injected power failures. This disk is fully compatible with the Linux block driver semantics (refer to https://www.kernel.org/doc/Documentation/block/writeback_cache_control.txt), and may create many crash states in which pending blocks are only partially flushed to the disk, which is common behavior for a commodity disk with a write buffer.

Our automated tool confirms that corruption can happen in a crash state after an unclean shutdown (Linux with the default ext4 settings). We also found some discussions on Stack Overflow concerning this issue. I suggest explicitly reminding developers of this behavior.

Suggested documentation enhancement
--------------------------------------

As a minimal database library, shelve does not offer ACID (atomicity, consistency, isolation and durability) guarantees as strong as those of a database such as SQLite. On certain system configurations, a system crash can leave a shelve file corrupted. If you are using shelve to persist precious data such as a user's documents, we suggest the following steps to ensure data is not lost:

  1. Create a copy of the file, say, a temporary file.
  2. Operate on the temporary copy. Closing a shelve db implies that its data is flushed to the disk.
  3. Rename the temporary file to replace the original file. A journaled filesystem takes care to make the rename atomic.
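
The three steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not a tested recipe: `shelf_files` and `update_shelf_via_copy` are hypothetical helper names, and the sketch assumes the dbm backend stores the shelf either in the given path itself or in that path plus an extension. The rename is only atomic as a whole when the backend uses a single file.

```python
import glob
import os
import shelve
import shutil
import tempfile


def shelf_files(base):
    """List the files a dbm backend created for the shelf named `base`.

    Assumption: the backend stores data in `base` itself or in `base`
    plus an extension (e.g. `.db`, or `.dat`/`.dir` for dbm.dumb).
    """
    found = [base] if os.path.isfile(base) else []
    found += [p for p in glob.glob(glob.escape(base) + ".*") if os.path.isfile(p)]
    return found


def update_shelf_via_copy(base, mutate):
    """Run `mutate(shelf)` on a temporary copy, then rename it over the original."""
    base = os.path.abspath(base)
    directory, name = os.path.split(base)
    tmp_base = os.path.join(directory, "tmp-" + name)

    # Step 1: copy the current shelf file(s) to temporary names.
    for path in shelf_files(base):
        shutil.copy2(path, tmp_base + path[len(base):])

    # Step 2: operate on the copy; leaving the `with` block closes the shelf.
    with shelve.open(tmp_base) as shelf:
        mutate(shelf)

    tmp_files = shelf_files(tmp_base)
    for path in tmp_files:
        # Closing alone may not reach stable storage, so fsync explicitly.
        with open(path, "rb+") as f:
            os.fsync(f.fileno())

    # Step 3: rename each temporary file over its original counterpart.
    for path in tmp_files:
        os.replace(path, base + path[len(tmp_base):])


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        base = os.path.join(d, "data")
        with shelve.open(base) as s:
            s["a"] = 1
        update_shelf_via_copy(base, lambda s: s.update(b=2))
        with shelve.open(base) as s:
            print(sorted(s.items()))
```

With a multi-file backend the renames in step 3 happen one at a time, so a crash between them can still leave a mixed state; a stricter version would probably also fsync the containing directory after the rename.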

bitdancer commented 9 years ago

Shelve does not itself implement any database, but it does *use* a database[*]. Any aspect of this must be directed toward the underlying database library used. In particular, it is not part of the shelve API to know anything about any possible underlying file or files, nor is it *necessarily* the case that there is pending data to be flushed on close.

So, if you want to suggest a documentation enhancement, it should make reference to the issue and point the user at the documentation for the underlying database they choose to use for more information.
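
For readers who want to know which underlying database a given shelf actually uses, the standard `dbm.whichdb` function can report it. The snippet below is a small sketch; `underlying_backend` is a hypothetical wrapper name, and the module name it prints depends on what is installed on the system.

```python
import dbm
import os
import shelve
import tempfile


def underlying_backend(base):
    """Return the name of the dbm module that owns the shelf `base`.

    dbm.whichdb returns None if the file cannot be read or does not
    exist, and an empty string if the format cannot be identified.
    """
    return dbm.whichdb(base)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        base = os.path.join(d, "data")
        with shelve.open(base) as s:
            s["a"] = 1
        # Prints a module name such as 'dbm.gnu' or 'dbm.dumb'.
        print(underlying_backend(base))
```

The reported module name tells the user which library's documentation to consult for crash-consistency behavior.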

[*] There is an open issue proposing an sqlite backend for shelve, but no one so far has had the motivation to finish it.

8d06c1c0-4d22-4786-a6a1-d79c9e97c71a commented 9 years ago

Thanks for the reminder. The issue was originally reported with the default setting. We conducted further tests with the other anydbm options (dbhash, dbm, gdbm), and none of them survived crash testing. For the detailed reasoning, please refer to an OSDI '14 research paper: https://www.usenix.org/system/files/conference/osdi14/osdi14-paper-pillai.pdf That paper discusses vulnerabilities in the GDBM implementation, and these lightweight db implementations have similar problems. We have also tested SQLite, and it is much more robust: we have not found an ACID violation yet.

Personally I think it is reasonable to have an SQLite backend, as it is much safer (plus it provides thread safety). I will see what I can do for that.
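
To make the idea concrete, here is a rough sketch of what a shelf-like mapping on top of SQLite could look like. It is not the proof of concept from the tracker and not a proposed API: the `SQLiteShelf` class, its table name, and its schema are all hypothetical, it accepts only string keys, and it makes no attempt at thread safety.

```python
import pickle
import sqlite3
from collections.abc import MutableMapping


class SQLiteShelf(MutableMapping):
    """Hypothetical shelf-like mapping that pickles values into one SQLite table."""

    def __init__(self, path):
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS shelf "
            "(key TEXT PRIMARY KEY, value BLOB NOT NULL)"
        )
        self._conn.commit()

    def __getitem__(self, key):
        row = self._conn.execute(
            "SELECT value FROM shelf WHERE key = ?", (key,)
        ).fetchone()
        if row is None:
            raise KeyError(key)
        return pickle.loads(row[0])

    def __setitem__(self, key, value):
        # `with self._conn` commits on success and rolls back on error.
        with self._conn:
            self._conn.execute(
                "INSERT OR REPLACE INTO shelf (key, value) VALUES (?, ?)",
                (key, pickle.dumps(value)),
            )

    def __delitem__(self, key):
        with self._conn:
            cur = self._conn.execute("DELETE FROM shelf WHERE key = ?", (key,))
        if cur.rowcount == 0:
            raise KeyError(key)

    def __iter__(self):
        rows = self._conn.execute("SELECT key FROM shelf").fetchall()
        return (row[0] for row in rows)

    def __len__(self):
        return self._conn.execute("SELECT COUNT(*) FROM shelf").fetchone()[0]

    def close(self):
        self._conn.close()


if __name__ == "__main__":
    shelf = SQLiteShelf(":memory:")
    shelf["doc"] = {"title": "draft", "pages": 3}
    print(shelf["doc"], len(shelf))
    shelf.close()
```

Each write runs in its own transaction here, so the crash behavior is whatever SQLite provides for a committed transaction; batching several writes into one transaction would be a separate design choice.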

bitdancer commented 9 years ago

Yeah, if we had an sqlite backend I think we'd make it the default if sqlite was available. There's a proof of concept implementation in the open bpo-3783. I'm not sure what remains to be done (other than docs)...I didn't read through the issue and there's a fair bit of discussion.