python / cpython

The Python programming language
https://www.python.org
Other
62.87k stars 30.12k forks source link

Unfixable potential race conditions on file descriptors in CPython #121544

Open colesbury opened 3 months ago

colesbury commented 3 months ago

Python allows multiple threads to concurrently access file descriptors through files and sockets. Race conditions at the Python level can lead to unexpected behavior, even with the global interpreter lock.

Thread sanitizer reports these races in some of our tests that exercise this behavior. We cannot fix these potential race conditions without introducing potential deadlocks.

For example, consider:

import threading

secrets = open("secrets.txt", "w");

def thread1():
    secrets.write("a secret")

def thread2():
    secrets.close()

def thread3():
    with open("log.txt", "w") as log:
        log.write("log message")

threading.Thread(target=thread1).start()
threading.Thread(target=thread2).start()
threading.Thread(target=thread3).start()

If you are particularly unlucky, then the file descriptor for secrets may be closed by thread2 and reused as the file descriptor for log just before thread1 writes to it. In other words, thread1 may write "a secret" to a completely different file or socket than it intended.

This can happen, even with the GIL, because the GIL is released around write() and close(). In other words, the following can happen:

  1. The secrets.write() call releases the GIL, but before it actually calls the C write() function on the file descriptor....
  2. The secrets.close() call closes the file descriptor
  3. The open("log.txt", "w") re-uses the same file descriptor number
  4. The secrets.write() call now write() on the wrong file descriptor

Note that we must release the GIL before calling write() or close() because these functions potentially block.

serhiy-storchaka commented 3 months ago

What concrete parts of the stdlib or tests have this race condition?

There were some issues in multiprocessing, I tried to avoid introducing such kind of race condition.

colesbury commented 3 months ago

@serhiy-storchaka I don't think the stdlib has internal races on file descriptors. We have some in tests, but it may take me a bit to find them because we still have other race conditions reported by TSan, and it's easy to get them mixed up.

test_socket.testClose - race between recv() and connect() reported by TSan. This seems benign to me -- unlike my original example, neither call creates or closes file descriptors -- but I'm not entirely sure.