Open jcea opened 11 years ago
Currently, "mmap.flush()" does a synchronous write to the backing file. The call waits until the data is actually flushed to disk, because internally it calls "msync()" with "MS_SYNC".
But the value of "mmap.flush()" is to synchronize file and memory. You don't need a synchronous write in the general case.
I propose to add an optional keyword parameter with default value "SYNC" (compatibility) but that can be "ASYNC", "INVALIDATE" (can be "SYNC|INVALIDATE" and "ASYNC|INVALIDATE" too).
I am talking about UNIX MMAP. No idea about Windows.
Check "man msync" for useful cases.
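To make the current behavior concrete, here is a minimal sketch of a write followed by "mmap.flush()" as it exists today. The helper name `write_and_flush` and the temporary file are only for illustration, and the keyword shown in the comment is a hypothetical spelling of the proposal, not an existing API.

```python
import mmap
import os
import tempfile


def write_and_flush(payload: bytes) -> bytes:
    """Write payload through a mapping, flush it, and read it back from the file."""
    fd, path = tempfile.mkstemp()
    try:
        # The mapped file must be non-empty; one page is an arbitrary choice.
        os.write(fd, b"\x00" * mmap.PAGESIZE)
        with mmap.mmap(fd, mmap.PAGESIZE) as mm:
            mm[:len(payload)] = payload
            # Blocks until the dirty pages are written (msync(MS_SYNC) on UNIX).
            mm.flush()
            # Hypothetical form of the proposal, NOT available today:
            #   mm.flush(flags=ASYNC)
        with open(path, "rb") as f:
            return f.read(len(payload))
    finally:
        os.close(fd)
        os.unlink(path)
```

Calling `write_and_flush(b"hello")` should return the same bytes; the point of the proposal is only that the `flush()` step would not have to block.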
> I propose to add an optional keyword parameter with default value "SYNC" (compatibility) but that can be "ASYNC", "INVALIDATE" (can be "SYNC|INVALIDATE" and "ASYNC|INVALIDATE" too).
AFAICT it's mostly useless on a modern OS. MS_INVALIDATE is a no-op on systems with a merged VM/buffer cache, i.e. it's not needed for mmap() to reflect write() and vice versa.
So nothing's normally needed to "synchronize file and memory".
As for MS_ASYNC, it actually doesn't do anything at all on recent OSes, e.g. it has been a no-op on Linux for a couple of years, since modified pages are written back as part of the normal writeback process anyway.
The only thing a user might actually need for an mmap object is to make sure data is actually committed to disk, and MS_SYNC covers this.
See e.g. this post by Andrew Morton: http://thread.gmane.org/gmane.linux.kernel/1312660
Depending on a specific OS implementation is not good. Linux is not the only OS out there, and I still have very old machines in production:
"""
# uname -a
Linux colquide.XXXX.es 2.4.37 #4 Fri Dec 12 01:10:45 CET 2008 i686 unknown
"""
I have been hit by the VM/file cache split in the past. Portability is important.
Anyway, the Python "mmap" manual says that "mmap.flush()" is needed to be sure that you are not going to "lose" changes you made in the mmap. On "modern" OSs it is not actually needed, as you say, and the performance hit is important enough for me to investigate and write this enhancement proposal :).
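Until something like this exists in the module, one possible workaround is to call msync() directly through ctypes. This is only a rough sketch under several assumptions: the flag values are the ones from Linux's generic headers (other platforms and some architectures use different numbers), `CDLL(None)` is assumed to expose libc's `msync`, and the helper names `msync` and `demo` are made up for this example.

```python
import ctypes
import mmap
import os
import tempfile

# Assumed values (Linux, asm-generic/mman-common.h); check your platform's headers.
MS_ASYNC = 1
MS_INVALIDATE = 2
MS_SYNC = 4

# On Linux, CDLL(None) resolves symbols of the running process, including libc.
_libc = ctypes.CDLL(None, use_errno=True)
_libc.msync.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int]
_libc.msync.restype = ctypes.c_int


def msync(mm: mmap.mmap, flags: int) -> None:
    """Call msync(2) on the whole mapping with the given flags (hypothetical helper)."""
    # Borrow the mapping's memory to learn its address; requires a writable map.
    buf = (ctypes.c_char * len(mm)).from_buffer(mm)
    try:
        if _libc.msync(ctypes.addressof(buf), len(mm), flags) != 0:
            err = ctypes.get_errno()
            raise OSError(err, os.strerror(err))
    finally:
        del buf  # drop the buffer export so the mmap can be closed later


def demo() -> bytes:
    """Write through a mapping, request an asynchronous sync, read the file back."""
    fd, path = tempfile.mkstemp()
    try:
        os.ftruncate(fd, mmap.PAGESIZE)
        with mmap.mmap(fd, mmap.PAGESIZE) as mm:
            mm[:5] = b"async"
            msync(mm, MS_ASYNC)  # returns without waiting for the disk
        with open(path, "rb") as f:
            return f.read(5)
    finally:
        os.close(fd)
        os.unlink(path)
```

Whether `MS_ASYNC` actually schedules anything depends on the kernel, as discussed above; the sketch only shows that the call itself does not need to go through the blocking "mmap.flush()".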
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
GitHub fields:
```python
assignee = None
closed_at = None
created_at =
labels = ['extension-modules', 'easy', 'type-feature']
title = '"mmap.flush()" is always synchronous, hurting performance'
updated_at =
user = 'https://github.com/jcea'
```
bugs.python.org fields:
```python
activity =
actor = 'josh.r'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Extension Modules']
creation =
creator = 'jcea'
dependencies = []
files = []
hgrepos = []
issue_num = 18816
keywords = ['easy']
message_count = 3.0
messages = ['195941', '195948', '195971']
nosy_count = 3.0
nosy_names = ['jcea', 'neologix', 'josh.r']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue18816'
versions = ['Python 3.4']
```