python / cpython

The Python programming language
https://www.python.org

os.writev() behaviour unpredictable when hitting file system full (28) #99691

Open FredericHemmer opened 1 year ago

FredericHemmer commented 1 year ago

Bug report

os.writev() behaves unpredictably when writing to a file system that becomes full.

For example:

try:
    fd = os.open(file, os.O_WRONLY | os.O_CREAT | os.O_APPEND)
    result = os.writev(fd, iovtable[0:MAXIOV])
except OSError as e:
    ...
    # At this point result might hold some number of bytes written (not necessarily corresponding to iov lengths), or might not be defined at all.

Your environment

centos 7. Python 3.9 (but probably the same on 3.11)

This can easily be worked around by specifying in the documentation that os.writev() returns the actual number of bytes written in case of success, but that the contents of the return value are unpredictable in case of errors.

ronaldoussoren commented 1 year ago

Your example code fragment is not entirely clear on one point: the location of the comment.

With the provided indentation (in the "except" block), "result" will never be set if os.writev raises. After the block (e.g. with the comment dedented one level), the result variable will exist if data was written and will not be set if an exception was caught and ignored.

The os.writev function is a thin wrapper around writev(2), and the contract for that function is that it will return the number of bytes actually written and will return -1 if no bytes could be written. In other words, partial writes should not result in an error code, and hence not in a Python exception from os.writev.
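To illustrate the contract described above, here is a minimal sketch (not taken from the original report) that writes two buffers to a scratch file. The temporary file and the buffer contents are made up for illustration. The point is only that the result variable is bound when the call returns, and is never assigned when os.writev raises.

```python
import os
import tempfile

buffers = [b"hello ", b"world\n"]  # illustrative data, not from the report

# Hypothetical scratch file standing in for the reporter's target file.
fd, path = tempfile.mkstemp()
try:
    try:
        written = os.writev(fd, buffers)
    except OSError as exc:
        # os.writev raised, so the assignment above never happened.
        written = None
        print(f"writev failed: {exc}")
    else:
        # The call returned: this may be a full or a partial write.
        expected = sum(len(b) for b in buffers)
        print(f"wrote {written} of {expected} bytes")
finally:
    os.close(fd)
    os.unlink(path)
```

Using the else clause keeps the code that reads the result separate from the error path, which avoids the ambiguity in the original fragment.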

FredericHemmer commented 1 year ago

Dear Ronald,

Thank you for your prompt answer. You are right of course; this is the problem with trying to extract a simple example from more complex code. The problem rather lies in writev(2), whose documentation states that “The data transfers performed by readv() and writev() are atomic”, which does not seem to be the case.

There is indeed no problem with Python per se. However, the documentation states:

This is not completely the case for os.writev(fd, buffers, /), to which bytes-like objects have to be passed, as opposed to an explicit *iov, iovcnt construct.

My suggestion was perhaps to make it clearer in the documentation that the semantics of os.writev() are identical to those of its writev(2) counterpart, and that partial writes are indeed possible (and do not necessarily fall on buffer boundaries).

In any case, your answer was extremely useful and has allowed me to work around the unexpected behaviour of writev(2).
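The workaround itself is not shown in the thread. One possible shape for it, as a hedged sketch: a retry loop that resumes after a partial write, even when the write stopped in the middle of a buffer. The helper name writev_all is hypothetical (it is not part of the os module), and the sketch assumes the buffers are bytes or bytearray objects and that their count stays below the platform's per-call limit.

```python
import errno
import os

def writev_all(fd, buffers):
    """Write every byte of `buffers` to `fd`, retrying after partial writes.

    Hypothetical helper for illustration; assumes bytes/bytearray buffers.
    """
    # memoryview lets us slice off already-written bytes without copying.
    pending = [memoryview(b) for b in buffers if len(b)]
    total = 0
    while pending:
        # Raises OSError (e.g. ENOSPC) if nothing could be written.
        written = os.writev(fd, pending)
        if written == 0:
            # Guard against looping forever if no progress is made.
            raise OSError(errno.EIO, "writev made no progress")
        total += written
        # Drop the buffers that were written completely.
        while pending and written >= len(pending[0]):
            written -= len(pending[0])
            pending.pop(0)
        # A partial write may stop in the middle of a buffer.
        if written:
            pending[0] = pending[0][written:]
    return total
```

If the file system fills up partway through, the exception propagates from the retry, and the bytes written by earlier iterations stay in the file. A caller that needs that count would have to track it separately.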

Thanks!

Frédéric

-- Frédéric Hemmer, CERN, Experimental Physics Department, LHCb Experiment, CH-1211 Geneva 23, Switzerland



ronaldoussoren commented 1 year ago

The statement that data transfers performed by readv and writev are atomic means that calls to these system calls are serialised by the kernel: you can't end up with a file where blocks from two threads are interleaved. You'll always get all blocks from one call to writev and then the blocks from another call (assuming there are no errors that result in partial writes).

ronaldoussoren commented 1 year ago

Adding the docs label (again...) because the description of what the function does can be clearer than it is now. I don't have a proposal for a clearer description though.

FredericHemmer commented 1 year ago

I can perhaps suggest the following:

os.writev(fd, buffers, /)

Write the contents of buffers to file descriptor fd. buffers must be a sequence of bytes-like objects. Buffers are processed in array order. Entire contents of the first buffer is written before proceeding to the second, and so on. The last buffer may be partially written (e.g. when the file system becomes full or when operating system limits are reached).

Returns the total number of bytes actually written.

The operating system may set a limit (sysconf() value 'SC_IOV_MAX') on the number of buffers that can be used, or even a limit on the maximum number of bytes that can be written in one call (such as 2147479552 on Linux).
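As a rough illustration of the buffer-count limit mentioned in the proposed text, the sketch below queries the limit and splits a list of buffers into batches that respect it. The helper names iov_max and batches are hypothetical, and the fallback value of 16 is a conservative assumption chosen for this sketch, not a value taken from the thread.

```python
import os

def iov_max(default=16):
    """Return the platform's IOV_MAX, or `default` if it cannot be queried.

    The default is an assumption made for this sketch.
    """
    try:
        limit = os.sysconf("SC_IOV_MAX")
    except (AttributeError, ValueError, OSError):
        # sysconf is unavailable or does not know this name.
        return default
    # sysconf may report -1 when the limit is indeterminate.
    return limit if limit > 0 else default

def batches(buffers, size):
    """Yield consecutive slices of `buffers` with at most `size` entries."""
    for start in range(0, len(buffers), size):
        yield buffers[start:start + size]
```

A caller could then loop over batches(buffers, iov_max()) and pass each batch to os.writev. Each of those calls can still return a partial count, so the return value needs to be checked per batch.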