python / cpython

The Python programming language
https://www.python.org
Other
63.51k stars 30.42k forks source link

release GIL while doing I/O operations in the mmap module #44098

Closed 052306b5-29e9-4fe2-91f8-d1180c7be5a4 closed 1 year ago

052306b5-29e9-4fe2-91f8-d1180c7be5a4 commented 18 years ago
BPO 1572968
Nosy @pitrou, @briancurtin, @jnwatson, @ZackerySpytz, @isidentical
PRs
  • python/cpython#14114
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields: ```python assignee = None closed_at = None created_at = labels = ['extension-modules', '3.9', 'performance'] title = 'release GIL while doing I/O operations in the mmap module' updated_at = user = 'https://bugs.python.org/luks' ``` bugs.python.org fields: ```python activity = actor = 'ZackerySpytz' assignee = 'none' closed = False closed_date = None closer = None components = ['Extension Modules'] creation = creator = 'luks' dependencies = [] files = [] hgrepos = [] issue_num = 1572968 keywords = ['patch'] message_count = 8.0 messages = ['61261', '61262', '84542', '102453', '102465', '102466', '342561', '343989'] nosy_count = 8.0 nosy_names = ['pitrou', 'luks', 'brian.curtin', 'neologix', 'adiroiban', 'jnwatson', 'ZackerySpytz', 'BTaskaya'] pr_nums = ['14114'] priority = 'normal' resolution = None stage = 'patch review' status = 'open' superseder = None type = 'performance' url = 'https://bugs.python.org/issue1572968' versions = ['Python 3.9'] ```

    052306b5-29e9-4fe2-91f8-d1180c7be5a4 commented 18 years ago

    There is a few system I/O calls in the mmap module that can take some time. Would be be possible to put Py_BEGIN_ALLOW_THREADS/Py_END_ALLOW_THREADS around these to release the GIL and allow other Python threads to do their work?

    ef6b2a61-f027-4805-a66a-cde4eee277c3 commented 18 years ago

    Logged In: YES user_id=341410

    Can you come up with the listing of code blocks where it would make sense to allow threads, or even provide a patch?

    pitrou commented 15 years ago

    I don't know mmap myself but patches are welcome.

    79528080-9d85-4d18-8a2a-8b1f07640dd7 commented 14 years ago

    As soon as you're dealing with files (not anonymous mapping), you can get the same type of latency than when using open/read/write... While it's probably not worth the trouble to release the GIL for every operation involving mmaped-files, there are a couple places where it should be considered, for example:

    Modules/mmapmodule.c:new_mmap_object
    #ifdef HAVE_FSTAT
    #  ifdef __VMS
        /* on OpenVMS we must ensure that all bytes are written to the file */
        if (fd != -1) {
                fsync(fd);
        }
    #  endif

    fsync() can take up to a couple seconds, so we should probably allow threads here (this is done only on VMS). Another place worth looking at is when using msync(), like in mmap_object_dealloc():

        if (m_obj->data!=NULL) {
            msync(m_obj->data, m_obj->size, MS_SYNC);
            munmap(m_obj->data, m_obj->size);
        }

    or in mmap_flush_method():

        if (-1 == msync(self->data + offset, size, MS_SYNC)) {
            PyErr_SetFromErrno(mmap_module_error);
            return NULL;
        }

    msycn too can take quite a long time to complete.

    I can write a patch if someone's willing to review it.

    pitrou commented 14 years ago

    This shouldn't really care about VMS here. As for the rest, feel free to propose a patch :) Please note that most non-trivial system calls, such as ftruncate(), mremap(), even mmap() itself, deserve to be enclosed in Py_BEGIN_ALLOW_THREADS/Py_END_ALLOW_THREADS pairs.

    However, you'll have to be careful that the internal state of the mmap object remains consistent, such that using it from several threads doesn't crash the interpreter. (you can even add multi-threaded unit tests if you want to be sure of this)

    pitrou commented 14 years ago

    This shouldn't really care about VMS here.

    Correction: /we/ shouldn't really care about VMS here. And the reason being, of course, that we have neither developers nor known testers under VMS. (we don't even know if the current trunk builds there or not)

    isidentical commented 5 years ago

    Any news? If a patch is not ready, i can work on a patch too.

    fcdc96d4-77c1-4326-9a94-a0ac9b5246bb commented 5 years ago

    I'll add one more system I/O call that's not GIL-wrapped in the mmap module that can take some time: mmap itself.

    mmap on Linux with MAP_POPULATE (0x8000) as the flags can take quite a bit of time. That's the flag that prefaults the memory range. MAP_POPULATE has been around since Linux 2.5.46.

    I know that MAP_POPULATE isn't explicitly supported in the module, but mmap.mmap does take arbitrary flags, so it isn't exactly unsupported either.

    gvanrossum commented 2 years ago

    This seems stuck, and it’s really old (2006!).

    @isidentical Could I interest you in providing that patch you mentioned?

    gvanrossum commented 2 years ago

    Once the Unix PR lands, should we keep this open hoping someone will do it for Windows, or close it?

    hauntsaninja commented 1 year ago

    The Unix PR has landed and I don't intend to make one for Windows (don't have access to a machine or a use case). This has been open since 2006, so I vote we close it.