python / cpython

The Python programming language
https://www.python.org
Other
63.67k stars 30.5k forks source link

shared memory construct to avoid need for serialization between processes #79994

Open applio opened 5 years ago

applio commented 5 years ago
BPO 35813
Nosy @brettcannon, @nascheme, @rhettinger, @terryjreedy, @ronaldoussoren, @pitrou, @osvenskan, @giampaolo, @skrah, @pmp-p, @ambv, @ericsnowcurrently, @1st1, @applio
PRs
  • python/cpython#11664
  • python/cpython#11664
  • python/cpython#11664
  • python/cpython#11816
  • python/cpython#11816
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields: ```python assignee = 'https://github.com/applio' closed_at = None created_at = labels = ['3.8', 'type-feature', 'library'] title = 'shared memory construct to avoid need for serialization between processes' updated_at = user = 'https://github.com/applio' ``` bugs.python.org fields: ```python activity = actor = 'xdegaye' assignee = 'davin' closed = False closed_date = None closer = None components = ['Library (Lib)'] creation = creator = 'davin' dependencies = [] files = [] hgrepos = [] issue_num = 35813 keywords = ['patch', 'patch', 'patch'] message_count = 57.0 messages = ['334278', '334737', '334789', '334797', '334798', '334799', '334805', '334806', '334809', '334817', '334820', '334821', '334822', '334835', '334934', '334945', '334957', '334963', '335030', '335032', '335194', '335195', '335262', '335273', '335282', '335284', '335360', '335362', '335366', '335622', '335660', '335682', '335689', '335694', '335696', '335697', '335700', '335711', '335715', '335719', '335731', '335789', '335797', '336182', '336294', '336298', '336336', '336383', '336386', '336388', '336395', '336400', '336423', '336429', '336430', '338180', '341920'] nosy_count = 15.0 nosy_names = ['brett.cannon', 'nascheme', 'rhettinger', 'terry.reedy', 'ronaldoussoren', 'pitrou', 'osvenskan', 'giampaolo.rodola', 'skrah', 'pmpp', 'lukasz.langa', 'eric.snow', 'Pam.McANulty', 'yselivanov', 'davin'] pr_nums = ['11664', '11664', '11664', '11816', '11816'] priority = 'normal' resolution = None stage = 'patch review' status = 'open' superseder = None type = 'enhancement' url = 'https://bugs.python.org/issue35813' versions = ['Python 3.8'] ```

    applio commented 5 years ago

    A facility for using shared memory would permit direct, zero-copy access to data across distinct processes (especially when created via multiprocessing) without the need for serialization, thus eliminating the primary performance bottleneck in the most common use cases for multiprocessing.

    Currently, multiprocessing communicates data from one process to another by first serializing it (by default via pickle) on the sender's end then de-serializing it on the receiver's end. Because distinct processes possess their own process memory space, no data in memory is common across processes and thus any information to be shared must be communicated over a socket/pipe/other mechanism. Serialization via tools like pickle is convenient especially when supporting processes on physically distinct hardware with potentially different architectures (which multiprocessing does also support). Such serialization is wasteful and potentially unnecessary when multiple multiprocessing.Process instances are running on the same machine. The cost of this serialization is believed to be a non-trivial drag on performance when using multiprocessing on multi-core and/or SMP machines.

    While not a new concept (System V Shared Memory has been around for quite some time), the proliferation of support for shared memory segments on modern operating systems (Windows, Linux, *BSDs, and more) provides a means for exposing a consistent interface and api to a shared memory construct usable across platforms despite technical differences in the underlying implementation details of POSIX shared memory versus Native Shared Memory (Windows).

    For further reading/reference: Tools such as the posix_ipc module have provided fairly mature apis around POSIX shared memory and seen use in other projects. The "shared-array", "shared_ndarray", and "sharedmem-numpy" packages all have interesting implementations for exposing NumPy arrays via shared memory segments. PostgreSQL has a consistent internal API for offering shared memory across Windows/Unix platforms based on System V, enabling use on NetBSD/OpenBSD before those platforms supported POSIX shared memory.

    At least initially, objects which support the buffer protocol can be most readily shared across processes via shared memory. From a design standpoint, the use of a Manager instance is likely recommended to enforce access rules in different processes via proxy objects as well as cleanup of shared memory segments once an object is no longer referenced. The documentation around multiprocessing's existing sharedctypes submodule (which uses a single memory segment through the heap submodule with its own memory management implementation to "malloc" space for allowed ctypes and then "free" that space when no longer used, recycling it for use again from the shared memory segment) will need to be updated to avoid confusion over concepts.

    Ultimately, the primary motivation is to provide a path for better parallel execution performance by eliminating the need to transmit data between distinct processes on a single system (not for use in distributed memory architectures). Secondary use cases have been suggested including a means for sharing data across concurrent Python interactive shells, potential use with subinterpreters, and other traditional uses for shared memory since the first introduction of System V Shared Memory onwards.

    applio commented 5 years ago

    New changeset e5ef45b8f519a9be9965590e1a0a587ff584c180 by Davin Potts in branch 'master': bpo-35813: Added shared_memory submodule of multiprocessing. (bpo-11664) https://github.com/python/cpython/commit/e5ef45b8f519a9be9965590e1a0a587ff584c180

    ronaldoussoren commented 5 years ago

    Was this merged too soon? This is new functionality without any docs and tests.

    I've also left review comments on the pull request.

    pitrou commented 5 years ago

    Yes, I think this was merged too soon. Davin, the convention is that we only merge completed work, not some work-in-progress.

    You already did this once in https://bugs.python.org/issue28053 . You still haven't completed that piece of work.

    I'm going to ask for this to be reverted.

    ronaldoussoren commented 5 years ago

    FWIW I agree with reverting this pull request, the work is clearly not finished yet.

    pitrou commented 5 years ago

    And to make things clear, I'm not saying anything about the functionality. Easier shared memory + multiprocessing is definitely an interesting endeavour (and a non-trivial one).

    terryjreedy commented 5 years ago

    I think there is also a license problem. posixshmem.c contains "Copyright 2012 Philip Semanchuk, 2018-2019 Davin Potts"

    Ronald commented "The only other files with a copyright attribute are parser, optparse and platform. I'd prefer to avoid adding new copyright attributes to stdlib modules."

    (turtle.py has a problematic copyright notice in the docstring.)

    I think we definitely should not add new copyright notices.

    Copyright notices are redundant: all contributors retain copyright in their contributions.

    Copyright notices are deceptive: 1. since they are rare, they imply that there is something special about a particular module and the listed authors; 2. even if a module were special, the notice becomes obsolete as soon as anyone else contributes to the module.

    Copyright notices are not needed: contributors authorize PSF to distribute the collective work under a rather liberal license. If anyone want to make a use of Python code not covered by that license, and wants to bypass PSF, they would have to look at git log and and git blame to find the relevant contributors.

    In this case, part of the work is attributed to Philip Semanchuk as a current copyright owner. According to https://bugs.python.org/user2567, he has not signed the contributor agreement, so his work should not have been merged until he has. Even if he had, he would have to specifically agree to his work being contributed. Sorry to be a grinch, but aside from anything else, I think this should be reverted until the legal question is clarified.

    Some of this might need discussion on pydev.

    applio commented 5 years ago

    This work is the result of \~1.5 years of development effort, much of it accomplished at the last two core dev sprints. The code behind it has been stable since September 2018 and tested as an independently installable package by multiple people.

    I was encouraged by Lukasz, Yury, and others to check in this code early, not waiting for tests and docs, in order to both solicit more feedback and provide for broader testing. I understand that doing such a thing is not at all a novelty. Thankfully it is doing that -- I hope that feedback remains constructive and supportive.

    There are some tests to be found in a branch (enh-tests-shmem) of github.com/applio/cpython which I think should become more comprehensive before inclusion. Temporarily deferring and not including them as part of the first alpha should reduce the complexity of that release.

    Regarding the BSD license on the C code being adopted, my conversations with Brett and subsequently Van have not raised concerns, far from it -- there is a process which is being followed to the letter. If there are other reasons to object to the thoughtful adoption of code licensed like this one, that deserves a decoupled and larger discussion first.

    ronaldoussoren commented 5 years ago

    David,

    I don't agree with merging at this point.

    The lack of documentation makes it hard to ask for more feedback and broader testing, because people won't know how to use the library and provide constructive feedback (that's why my review comments on GitHub are mostly nitpicking).

    Furthermore, distributing this as a package on PyPI is much better way to solicit further testing and feedback, because then it can be used right now instead of having to wait for a release. Releasing on PyPI is feasible for this new functionality because it is a standalone library.

    It would be better to revert this changeset until it is much closer to finished and there has been time for better code review.

    BTW. My comment about copyright on the pull request was about the __copyright__ attribute of the C extension, that is rarely used in the stdlib and is IMHO something to avoid.

    ambv commented 5 years ago

    @Davin, in what time can you fill in the missing tests and documentation? If this is something you finish do before alpha2, I'm inclined to leave the change in.

    As it stands, I missed the controversy yesterday as I was busy making my first release. So the merge *got released* in alpha1. I would prefer to fix the missing pieces forward instead of reverting and re-submitting which will only thrash blame and history at this point.

    FTR, I do agree with Antoine, Ronald and others that in the future such big changes should be as close to their ready state at merge time.

    applio commented 5 years ago

    @lukasz.langa: Missing tests and documentation will be in by alpha2.

    ronaldoussoren commented 5 years ago

    Please also address my review comments.

    pitrou commented 5 years ago

    Also, I will want to review this as well. Since Alpha 2 is scheduled for 2019-02-24, and this is a sizable piece of work, I would prefer if a full PR was available, say, before 2019-02-15.

    ericsnowcurrently commented 5 years ago

    FTR, I've had a number of extensive conversations with Davin (over the last year and a half) about this feature. His choices seemed reasonable (caveat: I'm not an expert on multiprocessing) and the feature seemed desirable. I did not review his PR.

    3da0707d-f262-40c9-bcc7-95cefbde6746 commented 5 years ago

    Hi all, I'm the author of posix_ipc on which some of this code is based. I'd be happy to sign a contributor agreement in order to erase any concerns on that front.

    terryjreedy commented 5 years ago

    I would prefer that we be consistent. In any case, I think you should be added to Misc/ACKS in the PR.

    giampaolo commented 5 years ago

    Unit-tests at https://bugs.python.org/issue35917.

    giampaolo commented 5 years ago

    Also, for completeness (since discussion is getting split), please see my proposal to move SharedMemoryManager and SharedMemoryServer into multiprocessing.managers namespace and rename shared_memory.py to _shared_memory.py: https://mail.python.org/pipermail/python-dev/2019-February/156235.html Also, it appears ShareableList should be register()ed against SharedMemoryManager.

    nascheme commented 5 years ago

    I didn't finish reviewing completely yet but here are some comments.

    I think the random filename generation can be pulled out of _posixshmem. Make it require that a filename is passed into it. That moves a fair bit of complexity out of C and into Python. In the Python module, I suggest that you could use secrets.token_hex() to build the filename. Put the loop that retries because of name collisions in the Python module as well.

    If you like, I can try to make a patch that does the above.

    When looking at at how the Python code would handle a name collision, I see this code:

    + switch (errno) { + case EACCES: + PyErr_Format(pPermissionsException, + "No permission to %s this segment", + (flags & O_TRUNC) ? "truncate" : "access" + ); + break; + + case EEXIST: + PyErr_SetString(pExistentialException, + "Shared memory with the specified name already exists"); + break; + + case ENOENT: + PyErr_SetString(pExistentialException, + "No shared memory exists with the specified name"); + break; + + case EINVAL: + PyErr_SetString(PyExc_ValueError, "Invalid parameter(s)"); + break; + + case EMFILE: + PyErr_SetString(PyExc_OSError, + "This process already has the maximum number of files open"); + break; + + case ENFILE: + PyErr_SetString(PyExc_OSError, + "The system limit on the total number of open files has been reached"); + break; + + case ENAMETOOLONG: + PyErr_SetString(PyExc_ValueError, + "The name is too long"); + break; + + default: + PyErr_SetFromErrno(PyExc_OSError); + break; + }

    I think it might be cleaner just to make it:

            PyErr_SetFromErrno(PyExc_OSError);

    Then, if you need a customized exception or message, you can re-raise inside the Python code. To me, it seems simpler and more direct to just preserve the errno and always raise OSError. Changing things to ValueError means that callers need to look at the message text to differentiate between some errno values.

    Is it the case that _posixshmem started life as a module that would be used directly and not something hidden by another layer of abstraction? If so, having these customized exceptions and having it do the filename generation itself makes sense. However, it is an internal implementation detail of shared_memory.py, I think it should be simplified to do only what is needed. E.g. a thin layer of the system calls.

    pitrou commented 5 years ago

    (also, OSError will automatically convert to more specific subclasses for some errnos, see PEP-3151:

    >>> raise OSError(errno.ENOENT, "foobar")                                                                                                                              
    Traceback (most recent call last):
      File "<ipython-input-3-e08f9c9a179c>", line 1, in <module>
        raise OSError(errno.ENOENT, "foobar")
    FileNotFoundError: [Errno 2] foobar

    )

    applio commented 5 years ago

    Docs and tests are now available in a new PR. I have stayed focused on getting these docs and tests to everyone without delay but that means I have not yet had an opportunity to respond to the helpful comments, thoughtful questions, and threads that have popped up in the last few days. I will follow up with all comments as quickly as possible starting in the morning.

    There are two topics in particular that I hope will trigger a wider discussion: the api around the SharedMemory class and the inclusion-worthiness of the shareable_wrap function.

    Regarding the api of SharedMemory, the docs explain that not all of the current input parameters are supportable/enforceable across platforms. I believe we want an api that is relevant across all platforms but at the same time we do not want to unnecessarily suppress/hide functionality that would be useful on some platforms -- there needs to be a balance between these motivations but where do we strike that balance?

    Regarding the inclusion-worthiness of the shareable_wrap function, I deliberately did not include it in the docs but its docstring in the code explains its purpose. If included, it would drastically simplify working with NumPy arrays; please see the code example in the docs demonstrating the use of NumPy arrays without the aid of the shareable_wrap function. I have received feedback from others using this function also worth discussing.

    Thank you to everyone who has already looked at the code and shared helpful thoughts -- please have a look at the tests and docs.

    applio commented 5 years ago

    @giampaolo.rodola: Your patch from 3 days ago in bpo-35917 included additional tests around the SharedMemoryManager which are now causing test failures in my new PR. This is my fault because I altered SharedMemoryManager to no longer support functionality from SyncManager that I thought could be confusing to include. I am just now discovering this and am not immediately sure if simply removing the SharedMemoryManager-relevant lines from your patch is the right solution but I wanted to mention this thought right away.

    Thank you for discovering that SyncManager was being overlooked in the tests and the nice patch in bpo-35917.

    applio commented 5 years ago

    @terry.reedy and @ronaldoussoren: I have asked Van again to provide comments here clarifying the topics of (1) copyright notices and (2) requiring the BSD-licensed-work's author to sign a contributor agreement.

    Specifically regarding the appearance of __copyright__, I added my agreement to your comments on python/cpython#56025 on this.

    giampaolo commented 5 years ago

    I submitted an initial review / comments in the PR. I think this is good for a first iteration in order to understand what APIs to expose publicly (also, keep in mind I may not have a full picture of how this is intended to be used precisely).

    As for the doc, I find it a bit too verbose. If the namespace change is accepted consider doing the following:

    Just an idea. Writing good doc is not easy and it requires actually putting hands on it.

    Hope this helps.

    rhettinger commented 5 years ago

    FWIW, I read the draft docs and found them to be at the right level throughout. I definitely wouldn't want anything more terse.

    applio commented 5 years ago

    @giampaolo.rodola: It definitely helps.

    Conceptually, SyncManager provides "distributed shared memory" where lists, dicts, etc. are held in memory by one process but may be accessed remotely from another via a Proxy Object. Mutating a dict from one process requires sending a message to some other process to request the change be made.

    In contrast, SharedMemoryManager provides non-distributed shared memory where a special region of memory is held by the OS kernel (not a process) and made directly addressable to many processes simultaneously. Modifying any data in this special region of memory requires zero process-to-process communication; any of the processes may modify the data directly.

    In a speed contest, the SharedMemoryManager wins in every use case -- and it is not a close race. There are other advantages and disadvantages to each, but speed is the key differentiator.

    Thinking ahead to the future of SharedMemoryManager, there is the potential for a POSIX shared memory based semaphore. The performance of this semaphore across processes should drastically outperform SyncManager's semaphore. It might be something we will want to support in the future. SharedMemoryManager needs a synchronization mechanism now (in support of common use cases) to coordinate across processes, which is why I initially thought SharedMemoryManager should expose the Lock, Semaphore, Event, Barrier, etc. powered by distributed shared memory. I am no longer sure this is the right choice for three reasons: (1) it unnecessarily complicates and confuses the separation of what is powered by fast SystemV-style shared memory and what is powered by slow distributed shared memory, (2) it would be a very simple example in the docs to show how to add our existing Lock or Semaphore to SharedMemoryManager via register(), (3) if we one day implement POSIX shared memory semaphores (and equivalent where POSIX is not supported), we will have the burden of an existing lock/semaphore creation methods and apis with behavioral differences.

    I propose that it would be clearer but no less usable if we drop these registered object types (created via calls to register()) from SharedMemoryManager. It is one line of code for a user to add "Lock" to SharedMemoryManager, which I think we can demonstrate well with a simple example.

    pitrou commented 5 years ago

    Davin:

    This is my fault because I altered SharedMemoryManager to no longer support functionality from SyncManager that I thought could be confusing to include. I am just now discovering this and am not immediately sure if simply removing the SharedMemoryManager-relevant lines from your patch is the right solution but I wanted to mention this thought right away.

    If SharedMemoryManager subclasses SyncManager then I *think* it should obey the SyncManager contract. Regardless of the shared memory facility, the user may also want to "shared" regular proxies between processes.

    (to be honest, I don't think the multiprocessing Manager facility is used a lot currently...)

    applio commented 5 years ago

    @Antoine: SharedMemoryManager does not subclass SyncManager but it did previously. This is the source of the confusion. SharedMemoryManager subclasses BaseManager which does not provide Value, Array, list, dict, etc.

    Agreed that the manager facility does not appear to see that much use in existing code.

    When working with shared memory, I expect SharedMemoryManager to be much more popular.

    pitrou commented 5 years ago

    SharedMemoryManager does not subclass SyncManager but it did previously.

    Ah, right. Then it's ok to change the tests IMO.

    applio commented 5 years ago

    These questions (originally asked in comments on python/cpython#56025) seemed more appropriate to discuss here: Why should the user want to use SharedMemory directly? Why not just go through the manager? Also, perhaps a naive question: don't you always need a start()ed manager in order for the processes to communicate? Doesn't SharedMemoryServer has to be involved?

    I think it helps to discuss the last question first. A SharedMemoryManager is *not* needed for two processes to share information across a shared memory block, nor is a SharedMemoryServer required. The docs have examples demonstrating this but here is another meant to showcase exactly this:

        Start up a Python shell and do the following:
            >>> from multiprocessing import shared_memory
            >>> shm = shared_memory.SharedMemory(name=None, size=10)
            >>> shm.buf[:5] = b'Feb15'
            >>> shm.name  # Note this name and use it in the next steps
            'psm_26792_26631'
    
        Start up a second Python shell in a new window and do the following:
            >>> from multiprocessing import shared_memory
            >>> also_shm = shared_memory.SharedMemory(name='psm_26792_26631')  # Use that same name
            >>> bytes(also_shm.buf[:5])
            b'Feb15'
    If also_shm.buf is further modified in the second shell, those
    changes will be visible on shm.buf in the first shell.  The same
    is true of the reverse.

    The key point is that there is no sending of messages between the processes at all. In stark contrast, SyncManager offers and supports objects held in "distributed shared memory" where messages must be sent from one process to another to access or manipulate data; those objects held in "distributed shared memory" *must* have a SyncManager+Server to enable their use. That is not needed at all for SharedMemory because access to and manipulation of the data is performed directly without the cost-delay of messaging.

    This begs a new question, "so what is the SharedMemoryManager used for then?" The docs answer: To assist with the life-cycle management of shared memory especially across distinct processes, a BaseManager subclass, SharedMemoryManager, is also provided. Because shared memory blocks are not "owned" by a single process, they are not destroyed/freed when a process exits. A SharedMemoryManager is used to ensure the free-ing of a shared memory block when it is no longer needed.

    New SharedMemory instances may be created via a SharedMemoryManager (in which case their birth-to-death life-cycle is being managed) or they may be created directly as seen in the above example. Returning to the first question, "Why should the user want to use SharedMemory directly?", there are more use cases than these two:

    1. In process 1, a shared memory block is created by calling SharedMemoryManager.SharedMemory(). In process 2, we need to attach to that existing shared memory block and can do so by referring to its name. This is accomplished as in the above example by simply calling SharedMemory(name='uniquename'). We do not want to attach to it via a second SharedMemoryManager because only one manager should oversee the life-cycle of a single shared memory block.
    2. Sometimes direct management of the life-cycle of a shared memory block is desirable. For example, on systems supporting POSIX shared memory, it is a feature that shared memory blocks outlive processes. Some services choose to speed a service restart by preserving state data in shared memory, saving the newly restarted service from rebuilding it. The SharedMemoryManager provides one life-cycle strategy but can not cover all scenarios so the option to directly manage it is important.
    applio commented 5 years ago

    Regarding the API of the SharedMemory class, its flags, mode, and read_only parameters are not universally enforced or simply not implemented on all platforms that offer POSIX Shared Memory or Windows Named Shared Memory. A simplified API for the SharedMemory class that behaves consistently across all platforms would avoid confusion for users. For users who have specific need of flags/mode/read_only controls on a platform that they know does indeed respect that control, they should still have a mechanism to leverage those controls.

    I propose a simpler, consistent-across-platforms API like: SharedMemory(name=None, create=False, size=0)

    *name* and *size* retain their purpose though the former now defaults to None.
    *create* is set to False to indicate no new shared memory block is to be
    created because we only wish to attach to an already existing shared memory
    block.  When set to True, a new shared memory block will be created unless
    one already exists with the supplied unique name, in which case that block
    will be attached to and used.
    
    Example of attaching to an already existing shared memory block:
        SharedMemory(name='uniquename')
    Example of creating a new shared memory block where any new name will do:
        SharedMemory(create=True, size=128)
    Example of creating/attaching a shared memory block with a specific name:
        SharedMemory(name='specialsnowflake', create=True, size=4096)

    Even with its simplified API, SharedMemory will continue to be powered by PosixSharedMemory on systems where "POSIX Shared Memory" is implemented and powered by NamedSharedMemory on Windows systems. The API for PosixSharedMemory will remain essentially unchanged from its current form: PosixSharedMemory(name=None, flags=None, mode=0o600, size=0, read_only=False) The API for NamedSharedMemory will be updated to no longer attempt to mirror its POSIX counterpart: NamedSharedMemory(name=None, create=False, size=0, read_only=False)

    To be clear: the inconsistencies motivating this proposed API change is *not* only arising from differences between Windows and POSIX-supporting systems. For example, among systems implementing POSIX shared memory, the mode flag (which promises control over whether user/group/others can read/write to a shared memory block) is often but not always ignored; it differs from one OS to the next.

    Alternatives/variations to this proposed API change:

    giampaolo commented 5 years ago

    Because shared memory blocks are not "owned" by a single process, they are not destroyed/freed when a process exits.

    1) it seems SharedMemory.close() does not destroy the memory region (I'm able to re-attach to it via name). If I'm not mistaken only the manager can do that. As such it probably makes sense to have an unlink() or destroy() methods. Maybe SharedMemory should support the with statement.

    2) it seems SharedMemory is only supposed to handle memoryviews. I suggest to turn SharedMemory.buf in a read-onl property:

    >>> sm = shared_memory.SharedMemory(None, size=23)
    >>> sm.buf = [1,2,3]
    >>> sm.close()
      File "/home/giampaolo/svn/cpython-shm/Lib/multiprocessing/shared_memory.py", line 166, in close
        self.buf.release()
    AttributeError: 'list' object has no attribute 'release'

    3) it seems "size" kwarg cannot be zero (current default). Suggestion: either choose default != 0 or make it a mandatory arg:

    >>> shared_memory.SharedMemory(name=None)
    ValueError: cannot mmap an empty file
    >>> shared_memory.SharedMemory(name=None, size=0)
    ValueError: cannot mmap an empty file
    >>> shared_memory.SharedMemory(name=None, size=1)
    PosixSharedMemory('psm_32161_20838', size=1)
    >>> 

    4) I wonder if we should have multiprocessing.active_memory_children() or something, similar to multiprocessing.active_children() so that one can do:

    @atexit.register
    def cleanup():
        for x in multiprocessing.active_memory_children():
            x.close()  # or unlink/destroy

    I guess that would be useful for people not using a manager. Also, I'm not sure if active_memory_children() can return meaningful results with a brand new process (I suppose it can't).

    giampaolo commented 5 years ago

    Because shared memory blocks are not "owned" by a single process... [...] I propose a simpler, consistent-across-platforms API like: SharedMemory(name=None, create=False, size=0)

    Maybe something like this instead?

          SharedMemory(name=None, attach_if_exists=False, size=0)

    The use case I'm thinking about is 2 distinct processes/apps which agree on a common fixed name.

    giampaolo commented 5 years ago

    Hopefully my last iteration: =)

    1) As for SharedMemoryManager, I believe it should live in multiprocessing.managers, not shared_memory.py. It's a subclass of Manager and behaves like a manager (start(), shutdown(), get_server(), etc.), so IMO that's its natural place.

    2) Same for SharedMemoryServer (which is a subclass of multiprocessing.managers.Server). SharedMemoryTracker appears to be just a support class for it, so IMO it should either be private or not documented.

    3) ShareableList name is kinda inconsistent with other classes (they all have a "Shared" prefix). I'd call it SharedList instead.

    4) Ideally the (public) APIs I have in mind are:

    multiprocessing.managers.SharedMemoryManager
    multiprocessing.managers._SharedMemoryTracker
    multiprocessing.managers.SharedMemoryServer  (not documented)
    multiprocessing.shared_memory.SharedMemory
    multiprocessing.shared_memory.SharedList 
    multiprocessing.shared_memory.WindowsNamedSharedMemory  (maybe)
    multiprocessing.shared_memory.PosixSharedMemory  (maybe)

    AFAIU there are 2 distinct use-cases at play:

    4) I have some reservations about SharedMemory's "flags" and "mode" args.

    Hope this helps.

    applio commented 5 years ago

    @giampaolo:

    1) it seems SharedMemory.close() does not destroy the memory region (I'm able to re-attach to it via name). If I'm not mistaken only the manager can do that.

    Correct, close() does not and should not destroy the memory region because other processes may still be using it. Only a call to unlink() triggers the destruction of the memory region and so unlink() should only be called once across all the processes with access to that shared memory block.

    The unlink() method is available on the SharedMemory class. No manager is required. This is also captured in the docs.

    2) I suggest to turn SharedMemory.buf in a read-onl property

    Good idea! I will make this change today, updating python/cpython#56025.

    3) it seems "size" kwarg cannot be zero (current default)

    From the docs: When attaching to an existing shared memory block, set to 0 (which is the default).

    This permits attaching to an existing shared memory block by name without needing to also already know its size.

    4) I wonder if we should have multiprocessing.active_memory_children() or something

    I also think this would be helpful but...

    I'm not sure if active_memory_children() can return meaningful results with a brand new process (I suppose it can't).

    You are right. As an aside, I think it interesting that in the implementation of "System V Shared Memory", its specification called for something like a system-wide registry where all still-allocated shared memory blocks were listed. Despite the substantial influence System V Shared Memory had on the more modern implementations of "POSIX Shared Memory" and Windows' "Named Shared Memory", neither chose to make it part of their specification.

    By encouraging the use of SharedMemoryManager to track and ensure cleanup, we are providing a reliable and cross-platform supportable best practice. If something more low-level is needed by a user, they can choose to manage cleanup themselves. This seems to parallel how we might encourage, "when opening a file, always use a with statement", yet users can still choose to call open() and later close() when they wish.

    applio commented 5 years ago

    @giampaolo:

    Maybe something like this instead? SharedMemory(name=None, attach_if_exists=False, size=0)

    I think this misses the use case when wanting to ensure we only attach to an existing shared memory block and if it does not exist, we should raise an exception because we can not continue. (If the shared memory block should already be there but it is not, this means something bad happened earlier and we might not know how to recover here.)

    I believe the two dominant use cases to address are: 1) I want to create a shared memory block (either with or without a pre-conceived name). 2) I want to attach to an existing shared memory block by its unique name.

    applio commented 5 years ago

    @giampaolo:

    1) As for SharedMemoryManager, I believe it should live in multiprocessing.managers, not shared_memory.py.

    I am happy to see it live in multiprocessing.managers so long as we can provide a clean way of handling what happens on a platform where we can not support shared memory blocks. (We have implementations for PosixSharedMemory and NamedSharedMemory which together cover Windows, Linux, MacOS, the *BSDs, and possibly others but that does not cover everything.)

    @Neil has already raised this question of what do we want the behavior to be on these unsupported platforms on import? If everything dependent upon shared memory blocks remains inside shared_memory.py, then we could raise a ModuleNotFoundError or ImportError or similar when attempting to import shared_memory. If we move SharedMemoryManager to live in multiprocessing.managers, we need to decide how to handle (and communicate to the user appropriately) its potential absence. So far, I am unable to find a good example of another module where they have chosen to split up such code rather than keeping it all bottled up inside a single module, but perhaps I have missed something?

    2) Same for SharedMemoryServer (which is a subclass of multiprocessing.managers.Server).

    Same thing as above. If we decide how to handle the unsupported platforms on import, we can re-organize appropriately.

    3) ShareableList name is kinda inconsistent with other classes (they all have a "Shared" prefix). I'd call it SharedList instead.

    Oooh, interesting. I am happy to see a name change here. To share how I came up with its current name: I had thought to deliberately break the naming pattern here to make it stand out. The others, SharedMemory, SharedMemoryManager, and SharedMemoryServer, are all focused on the shared memory block itself which is something of a more primitive concept (like accessing SharedMemory.buf as a memoryview) compared to working with something like a list (a less primitive, more widely familiar concept). Likewise, I thought a dict backed by shared memory might be called a ShareableDict and other things like a NumPy array backed by shared memory might be called a ShareableNDArray or similar. I was hoping to find a different pattern for the names of these objects-backed-by-shared-memory-blocks, but I am uncertain I found the best name.

    4) I have some reservations about SharedMemory's "flags" and "mode" args.

    It sounds like you are agreeing with what I advocated in msg335660 (up above). Great!

    giampaolo commented 5 years ago

    The unlink() method is available on the SharedMemory class. No manager is required. This is also captured in the docs.

    I missed that, sorry.

    > 3) it seems "size" kwarg cannot be zero (current default) From the docs: When attaching to an existing shared memory block, set to 0

    Mmm... this makes the API a bit weird because it's a arg which is optional only for existing memory object, not new ones. I see that internally *size* is passed to ftruncate(). Can't you just avoid calling ftruncate() if size is not passed (None)? Also, what happens if you alter the size of an existing object with a smaller value? Is the memory region overwritten?

        a = shared_memory.SharedMemory(None, size=10)
        b = shared_memory.SharedMemory(a.name, size=4)

    > Maybe something like this instead? > SharedMemory(name=None, attach_if_exists=False, size=0) I think this misses the use case when wanting to ensure we only attach to an existing shared memory block and if it does not exist, we should raise an exception because we can not continue.

    It appears this is already covered:

        >>> SharedMemory(name="non-existent")
        Traceback (most recent call last):
        ...
        _posixshmem.ExistentialError: No shared memory exists with the specified name

    I believe the two dominant use cases to address are: 1) I want to create a shared memory block (either with or without a pre-conceived name). 2) I want to attach to an existing shared memory block by its unique name.

    Don't you also want to "create if it doesn't exist, else attach" as a single, atomic operation? It seems to me the only way to achieve that would be this, but it's racy:

    try:
        m = shared_memory.SharedMemory("foo", size=10)  # create
    except ExistentialError:
        m = shared_memory.SharedMemory("foo")  # attach (race condition)

    Note that here I'm entering into an unknown territory, because I'm not sure if there are or should be sync primitives to "wait for another memory to join me" etc. Just mentioning it for the sake of mental lucubration. =)

    If we move SharedMemoryManager to live in multiprocessing.managers, we need to decide how to handle (and communicate to the user appropriately) its potential absence.

    How about:

    # managers.py
    try:
        from .shared_memory import SharedMemory, SharedList 
    except ImportError:
        pass
    else:
        class SharedMemoryManager: ...
        class SharedMemoryServer: ...
        del SharedMemory, SharedList

    > 4) I have some reservations about SharedMemory's "flags" and "mode" args. It sounds like you are agreeing with what I advocated in msg335660 (up above). Great!

    If you mean drop those 2 args entirely then it probably there's no point in exposing WindowsNamedSharedMemory and PosixSharedMemory . If you mean to keep them then it seems "read_only=True" would conflict with "flags=O_RDWR". The fact that the Windows counterpart uses different constants makes me think this bitwise stuff should probably be handled internally, and exposed via friendly named-args like "read_only" / "attach_if_exists" or similar (but it's still not clear what those named args should be exactly).

    giampaolo commented 5 years ago

    As for *flags* argument "man shm_open" mentions 5 possible flags:

    As such I think "flags" should not be exposed. As for Windows, *read_only is currently exposed but ignored, and no other flags are contemplated. It seems you can implement *read_only by passing FILE_MAP_READ or FILE_MAP_WRITE to OpenFileMappingW() here: https://github.com/python/cpython/blob/1e5341eb171d7d2b988880020cfcc0a64021326d/Lib/multiprocessing/shared_memory.py#L98 Not sure how to reliably check if a file mapping already exists, but possibly POSIX and Windows should behave the same in this regard (and some test cases to exercise all the possible combinations would be good to have).

    applio commented 5 years ago

    @giampaolo:

    Also, what happens if you alter the size of an existing object with a smaller value? Is the memory region overwritten?

    Attaching to an existing shared memory block with a size=N which is smaller than its allocated size (say it was created with size=M and N\<M) will succeed in giving you access to at least the first N bytes of that shared memory block. The shared memory block allocated size will still be M bytes. That means future attempts to attach to it with size=M will continue to work as well. One motivation for why this is supported is, I believe, to deliberately limit access to part of the shared memory in some parts of a user's code, avoiding potential coding mistakes by design; for example, say the first 100KB contains unchanging reference data that many processes need to access but the second 100KB contains rapidly changing data that only a few processes should ever access or manipulate. I expect the most common use is to simply attach to the whole of a shared memory block, but I would not want to unnecessarily limit other established use cases.

    This behavior needed to be captured in the docs but I see it has not been! I have now added to the description of the size parameter and it should show up in python/cpython#56025 shortly.

    Can't you just avoid calling ftruncate() if size is not passed (None)?

    It looks like it does skip calling ftruncate() if size is 0.  From posixshmem.c:
            if (size) {
                DPRINTF("calling ftruncate, fd = %d, size = %ld\n", self->fd, size);
                if (-1 == ftruncate(self->fd, (off_t)size)) {

    > I think this misses the ... It appears this is already covered:

    Sorry for any confusion; I was interpreting your proposed parameter name, attach_if_exists, in the following way:

    Don't you also want to "create if it doesn't exist, else attach" as a single, atomic operation?

    Yes, I do! This was part of my description for the parameter named "create" in msg335660: When set to True, a new shared memory block will be created unless one already exists with the supplied unique name, in which case that block will be attached to and used.

    I'm not sure if there are or should be sync primitives to "wait for another memory to join me" etc.

    In the case of shared memory, I do not think so. I think such signaling between processes, when needed, can be accomplished by our existing signaling mechanisms (like, via the Proxy Objects for Event or Semaphore).

    giampaolo commented 5 years ago

    It looks like it does skip calling ftruncate() if size is 0. From posixshmem.c:

    Let me rephrase: are we forced to specify a value (aka call ftruncate()) on create ? If we are as I think, could size have a reasonable default value instead of 0? Basically I'm wondering if we can relieve the common user from thinking about what size to use, mostly because it's sort of a low level detail. Could it perhaps default to mmap.PAGESIZE?

    nascheme commented 5 years ago

    Some thoughts on this API.

    I think we need the "create with exclusive behavior" option, even though we don't know how to implement it on Windows right now. To me, there are two cases when calling SharedMemory:

    1) You want to create a new shared memory object. In this case, if your chosen name conflicts with an existing name, you should get an error. If you don't, you could be sharing data between unrelated processes. Chaos will ensue.

    2) You want to attach to an existing shared memory object. You have a name, passed to you by some means. In this case, you don't want to create a new object if it doesn't exist. You should get an error.

    I thought there might be a 3rd case where you have "co-equal" processes and any one of them could create and the others would attach. That dosen't seem safe though, if you can't ensure a unique name is used. You might attach to some unrelated shared memory object due to a name conflict. So, I think the API should not support this case.

    To support 1 & 2, we could just have 'create'. When true, it would act like O_CREX. When false, you would get an error if the name doesn't already exist.

    Regarding 'size', I think it is a bit weird how it currently works. Maybe 'size' should only be valid if you are creating a new shared memory object. If you are attaching to an existing one, size would be found from the existing object (if that's possible on all platforms). SharedMemory could grow a "resize()" method that lets you enlarge the underlying memory object. I don't know if that's useful in practice so maybe better to leave it out and keep the API simple.

    Should 'size' be a property that always does fstat() to find the size of the underlying file? Or, should it be the size passed in (for create=True) or the size of the file when the mmap is created. I'm not sure but maybe it is better to not always do the fstat().

    The 'read_only' keyword doesn't make sense if you care creating a new object and need to call ftruncate() on it. I think the OS will not allow that. Maybe 'read_only' should not be allowed if you are creating. Or, you should have a method that takes a read-write object and makes it read-only.

    On Linux at least, shm_open() is just a thin layer over open(). So the shared memory file is really just a regular file that is stored in /var/run/shm. If you realize that, it seems SharedMemory() is much like tempfile.NamedTemporaryFile(). E.g. if you create a NamedTemporaryFile under /var/run/shm it will behave like the file descriptor created by SharedMemory(). SharedMemory() objects have the difference that they don't clean themselves up. Also, they don't set the close-on-fork flag on the file descriptor. Maybe they should? It seems unclear to me how you should avoid cluttering /var/run/shm with shared memory objects that people forget to cleanup. I guess the plan is that people need to call unlink() at the right time. That seems error prone though.

    applio commented 5 years ago

    I think we need the "create with exclusive behavior" option, even though we don't know how to implement it on Windows right now.

    A fix to avoid the potential race condition on Windows is now part of python/cpython#56025.

    To support 1 & 2, we could just have 'create'. When true, it would act like O_CREX. When false, you would get an error if the name doesn't already exist.

    I am good with this and now it can be supported.

    a 3rd case where you have "co-equal" processes and any one of them could create and the others would attach.

    There are some practical use cases motivating this. Rather than debate the merits of those use cases, given the concern raised, perhaps we should forego supporting this 3rd case for now.

    Regarding 'size', I think it is a bit weird how it currently works. Maybe 'size' should only be valid if you are creating a new shared memory object.

    This would avoid potential confusion in the details of how attempts to resize do/don't work on different platforms. I would prefer to not need to explain that on MacOS, requesting a smaller size is disallowed. This defers such issues until considering a "resize()" method as you suggest.

    I like this.

    Should 'size' be a property that always does fstat() to find the size of the underlying file?

    The potential exists for non-Python code to attach to these same shared memory blocks and alter their size via ftruncate() (only on certain Unix platforms).

    We could choose to not support such "external" changes and let size be a fixed value from the time of instantiation. But I would like to believe we can be more effective and safely use fstat() behind our reporting of 'size'.

    It seems unclear to me how you should avoid cluttering /var/run/shm with shared memory objects that people forget to cleanup.

    This is the primary purpose of the SharedMemoryManager. Admittedly, we will not convince everyone to use it when they should, just like we are not able to convince everyone to use NamedTemporaryFile for their temp files.

    To update the proposed change to the API:

    Remaining question: do PosixSharedMemory and WindowsNamedSharedMemory mirror this simplified API or do we expose the added functionality each offers, permitting informed users to use things like 'mode' when they know it is enforced on a particular platform?

    applio commented 5 years ago

    The simpler API is now implemented in python/cpython#56025 as discussed previously. Notably:

    • We go with this simpler API: SharedMemory(name=None, create=False, size=0)
    • 'size' is ignored when create=False
    • create=True acts like O_CREX and create=False only attaches to existing shared memory blocks

    As part of this change, the PosixSharedMemory and WindowsNamedSharedMemory classes are no more; they have been consolidated into the SharedMemory class with a single, simpler, consistent-across-platforms API.

    On the SharedMemory class, 'size' is now stored by the __init__ and does not use fstat() as part of its property.

    Also, SharedMemoryManager (and its close friends) has been relocated to the multiprocessing.managers submodule, matching the organization @giampaolo outlined previously: multiprocessing.managers.SharedMemoryManager multiprocessing.managers._SharedMemoryTracker multiprocessing.managers.SharedMemoryServer (not documented) multiprocessing.shared_memory.SharedMemory multiprocessing.shared_memory.SharedList multiprocessing.shared_memory.WindowsNamedSharedMemory (REMOVED) multiprocessing.shared_memory.PosixSharedMemory (REMOVED)

    I believe this addresses all of the significant discussion topics in a way that brings together all of the excellent points being made. Apologies if I have missed something -- I did not think so but I will go back through all of the discussions tomorrow to double-check.

    giampaolo commented 5 years ago

    Code looks much better now. I'm still not convinced "SharedMemory(name=None, create=False, size=0)" is the best API. How are you supposed to "create or attach" atomically? You can do that with O_EXCL but as it stands this is not togglable via the API.

    Also, could you address my comment about size? https://bugs.python.org/issue35813#msg335731

    giampaolo commented 5 years ago

    Also, there is no way to delete/unwrap memory without using an existing SharedMemory instance, which is something we may not have on startup. Perhaps we should have a "shared_memory.unlink(name)" function similar to os.unlink() which simply calls C shm_unlink().

    applio commented 5 years ago

    Code looks much better now. I'm still not convinced "SharedMemory(name=None, create=False, size=0)" is the best API. How are you supposed to "create or attach" atomically?

    We are consciously choosing to not support an atomic "create or attach". This significantly simplifies the API and avoids the valid concerns raised around user confusion relating to that behavior (including the use of different specified 'size' values in a race) but does not preclude our potentially introducing this as a feature in the future.

    This simpler API still supports a "try: create; except: attach" which is not atomic but effectively covers the primary use cases for "create or attach". Combined with a SyncManager.Lock, users can already achieve an atomic "create or attach" using this simpler API.

    Also, could you address my comment about size? https://bugs.python.org/issue35813#msg335731 > Let me rephrase: are we forced to specify a value (aka call > ftruncate()) on create ? If we are as I think, could size have a > reasonable default value instead of 0? Basically I'm wondering if we > can relieve the common user from thinking about what size to use, > mostly because it's sort of a low level detail. Could it perhaps > default to mmap.PAGESIZE?

    Apologies for not responding to your question already, Giampaolo.

    For the same reasons that (in C) malloc does not provide a default size, I do not think we should attempt to provide a default here. Not all platforms allocate shared memory blocks in chunks of mmap.PAGESIZE, thus on some platforms we would unnecessarily over-allocate no matter what default size we might choose. I do not think we should expect users to know what mmap.PAGESIZE is on their system. I think it is important that if a user requests a new allocation of memory, that they first consider how much memory will be needed. When attaching to an existing shared memory block, its size is already defined.

    I think this even fits with CPython's over-allocation strategies behind things like list, where an empty list triggers no malloc at all. We will not allocate memory until the user tells us how much to allocate.

    Also, there is no way to delete/unwrap memory without using an existing SharedMemory instance, which is something we may not have on startup. Perhaps we should have a "shared_memory.unlink(name)" function similar to os.unlink() which simply calls C shm_unlink().

    It is not really possible to offer this on non-POSIX platforms so I think we should not attempt to offer a public "shared_memory.unlink(name)". It is possible to invoke "shm_unlink" with the name of a shared memory block (for those who really need it) on platforms with POSIX Shared Memory support via: shared_memory._posixshmem.shm_unlink('name')

    giampaolo commented 5 years ago

    We are consciously choosing to not support an atomic "create or attach". This significantly simplifies the API and avoids the valid concerns raised around user confusion relating to that behavior (including the use of different specified 'size' values in a race) but does not preclude our potentially introducing this as a feature in the future.

    I understand that because of *size* we cannot solve the race condition issue unless the user uses some sort of synchronization mechanism. FWIW I bumped into this lib: http://semanchuk.com/philip/sysv_ipc/ ...which provides two separate APIs to "create" and "attach":

    >> SharedMemory("name", IPC_CREX) >> attach("name")

    At this point I'm agnostic about the API, which is probably just a matter of personal taste (e.g. one may prefer a separate SharedMemory.attach() classmethod or a *mode* argument accepting "x" and "a"). I see that that lib use shmat() on attach and shmdt() on detach. I'm not sure if that makes a difference, just mentioning it because your implementation doesn't do that on close() and perhaps it should.

    Combined with a SyncManager.Lock, users can already achieve an atomic "create or attach" using this simpler API.

    That assumes a single app/process which spawns a child (the "worker"). In that case SyncManager.Lock/others is indeed compatible with SharedMemory and can be used to implement non-racy "create or attach" and also to synchronize memory access on read and write. But AFAICT there are two uses cases for this functionality, and there currently is no mechanism to do any that if you have two unrelated apps/processes relying on a common shared memory *name and *size. I'm taking a peek at "man smh_overview" which says:

    \<\<Typically, processes must synchronize their access to a shared memory object, using, for example, POSIX semaphores. System V shared memory (shmget(2), shmop(2), etc.) is an older shared memory API. POSIX shared memory provides a simpler, and better designed interface; on the other hand POSIX shared memory is somewhat less widely available (especially on older systems) than System V shared memory.>>

    That would translate into a new Semaphore(name=None, create=False) class which (possibly?) would also provide better performances compared to SyncManager.Semaphore. Not sure if we can do the same on Windows though.

    Extra 1: apparently there are also POSIX msgget(), msgrcv() and msgsnd() syscalls which could be used to implement a System-V message Queue similar to SyncManager.Queue later on.

    Extra 2: given the 2 distinct use-cases I wonder if the low-level component (shared_memory.py) really belongs to multiprocessing module. Perhaps this should be provided as a separate "sharedmemory" module with multiprocessing.managers.SharedMemoryMemory being the high-level interface.

    3da0707d-f262-40c9-bcc7-95cefbde6746 commented 5 years ago

    On Feb 23, 2019, at 10:40 AM, Giampaolo Rodola' \report@bugs.python.org\ wrote:

    Giampaolo Rodola' \g.rodola@gmail.com\ added the comment:

    > We are consciously choosing to not support an atomic "create or attach". This significantly simplifies the API and avoids the valid concerns raised around user confusion relating to that behavior (including the use of different specified 'size' values in a race) but does not preclude our potentially introducing this as a feature in the future.

    I understand that because of *size* we cannot solve the race condition issue unless the user uses some sort of synchronization mechanism. FWIW I bumped into this lib: http://semanchuk.com/philip/sysv_ipc/ ...which provides two separate APIs to "create" and "attach":

    >>> SharedMemory("name", IPC_CREX) >>> attach("name")

    At this point I'm agnostic about the API, which is probably just a matter of personal taste (e.g. one may prefer a separate SharedMemory.attach() classmethod or a *mode* argument accepting "x" and "a"). I see that that lib use shmat() on attach and shmdt() on detach. I'm not sure if that makes a difference, just mentioning it because your implementation doesn't do that on close() and perhaps it should.

    attach() and detach() are particular to SysV IPC which is different from the POSIX IPC that’s being used here. There’s no need for attach() and detach() with POSIX shared memory.

    POSIX IPC is generally simpler than SysV IPC, in part because it was developed after SysV IPC so the developers had the benefit of experience with the older API.

    Side note: I’m the author of the sysv_ipc package you found, as well as the posix_ipc package.

    applio commented 5 years ago

    FWIW I bumped into this lib: http://semanchuk.com/philip/sysv_ipc/

    The author of that lib, Philip Semanchuk, is one of the people participating in this effort -- he has posted above in msg334934 here on b.p.o. and has helped review the PR in python/cpython#56025.

    He is also the author of the posix_ipc package which was the original basis for our POSIX Shared Memory implementation here.

    The decision to base our Unix platform support upon POSIX and not SystemV libraries came after considerable research and there are important differences between the two. To oversimplify: POSIX Shared Memory support has now been available for some time on Linux, *BSD, MacOS, and others and is something of a successor to the SystemV.

    That assumes a single app/process which spawns a child (the "worker").

    Not true. A manager started by one process can be connected to by another process that is not a child. This is covered in the docs here: https://docs.python.org/3/library/multiprocessing.html#using-a-remote-manager That child can then request that shared memory blocks it creates be remotely tracked and managed by that remote process's manager. While I would not expect this to be a common use case, this is a feature of BaseManager that we inherit into SharedMemoryManager.

    The SyncManager.Lock can be used as part of this as well. Thus, two unrelated apps/processes *can* coordinate their management of shared memory blocks through the SharedMemoryManager.

    That would translate into a new Semaphore(name=None, create=False) class which (possibly?) would also provide better performances compared to SyncManager.Semaphore

    Right! You might have noticed that Philip has such a semaphore construct in his posix_ipc lib.

    I opted to not attempt to add this feature as part of this effort to both (1) keep focused on the core needs to work with shared memory, and (2) to take more time in the future to work out how to get cross-platform support for the semaphore right (as you point out, there are complications to work through).

    Extra 1: apparently there are also POSIX msgget(), msgrcv() and msgsnd() syscalls which could be used to implement a System-V message Queue similar to SyncManager.Queue later on.

    Right! This is also something Philip has in his posix_ipc lib. This should be part of the roadmap for what we do next with SharedMemory. This one may be complicated by the fact that not all platforms that implement POSIX Shared Memory chose to also implement these functions in the same way. We will need time to work out what we can or can not reasonably do here.

    Extra 2: given the 2 distinct use-cases I wonder if the low-level component (shared_memory.py) really belongs to multiprocessing module

    Given what I wrote above about how multiprocessing.managers does enable these use cases and the existing "distributed shared memory" support in multiprocessing, I think it logically belongs in multiprocessing. I suggest that "shm_open" and "shm_unlink" are our low-level tools, which appropriately are in _posixshmem, but SharedMemory and the rest are high-level tools; SharedMemoryManager will not be able to cover all life-cycle management use cases thus SharedMemory will be needed by many and in contrast, "shm_open" and "shm_unlink" will be needed only by those wishing to do something wacky. (Note: I am not trying to make "wacky" sound like a bad thing because wacky can be very cool sometimes.)

    Philip's ears should now be burning, I mentioned him so many times in this post. Ah! He beat me to it while I was writing this. Awesome!

    We would not be where we are with SharedMemory without his efforts over many years with his posix_ipc lib.