python / cpython

The Python programming language
https://www.python.org
Other
62.46k stars 29.98k forks source link

ftplib doesn't check close status after sending file #54411

Open 11ecd6cf-9a95-40f1-8f6c-6275193baf6b opened 13 years ago

11ecd6cf-9a95-40f1-8f6c-6275193baf6b commented 13 years ago
BPO 10202
Nosy @loewis, @gpshead, @pitrou, @giampaolo, @ethanfurman, @MaxwellDupre
PRs
  • python/cpython#30747
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields: ```python assignee = None closed_at = None created_at = labels = ['type-bug', 'library', '3.9', '3.10', '3.11'] title = "ftplib doesn't check close status after sending file" updated_at = user = 'https://bugs.python.org/nagle' ``` bugs.python.org fields: ```python activity = actor = 'mikecmcleod' assignee = 'none' closed = False closed_date = None closer = None components = ['Library (Lib)'] creation = creator = 'nagle' dependencies = [] files = [] hgrepos = [] issue_num = 10202 keywords = ['patch'] message_count = 10.0 messages = ['119638', '119652', '119658', '119669', '119685', '119720', '119788', '406207', '409157', '410644'] nosy_count = 8.0 nosy_names = ['loewis', 'gregory.p.smith', 'exarkun', 'nagle', 'pitrou', 'giampaolo.rodola', 'ethan.furman', 'mikecmcleod'] pr_nums = ['30747'] priority = 'normal' resolution = None stage = 'patch review' status = 'open' superseder = None type = 'behavior' url = 'https://bugs.python.org/issue10202' versions = ['Python 3.9', 'Python 3.10', 'Python 3.11'] ```

    11ecd6cf-9a95-40f1-8f6c-6275193baf6b commented 13 years ago

    "ftplib" doesn't check the status on socket close after writing. This can lead to silently truncated files when sending files with "ftplib".

    A report of truncated files on comp.lang.python led me to check the source code.

    The "ftplib" module does sending by calling sock_sendall in "socketmodule.c". That does an OS-level "send", and once everything has been sent, returns.

    But an OS-level socket send returns when the data is queued for sending, not when it is delivered. Only when the socket is closed, and the close status checked, do you know if the data was delivered. There's a final TCP close handshake that occurs when close has been called at both ends, and only when it completes successfully do you know that the data has been delivered.

    At the socket level, this is performed by "shutdown" (which closes the connection and returns the proper network status information), or by "close".

    Look at sock_close in "socketmodule.c". Note that it ignores the return status on close, always returns None, and never raises an exception. As the Linux manual page for "close" says: "Not checking the return value of close() is a common but nevertheless serious programming error. It is quite possible that errors on a previous write(2) operation are first reported at the final close(). Not checking the return value when closing the file may lead to silent loss of data."

    "ftplib", in "storlines" and "storbinary", calls "close" without checking the status or calling "shutdown" first. So if the other end disconnects after all data has been queued locally but not sent, received and acknowledged, the sender will never know.

    pitrou commented 13 years ago

    It sounds more like an issue in socket.close(), right?

    61337411-43fc-4a9c-b8d5-4060aede66d0 commented 13 years ago

    I don't quite understand the problem. How exactly do you manage to lose data? The ftp server should send a 226 status code to indicate success over the control connection, and presumably it will do so only after receiving the proper TCP shutdown from its operating system. So regardless of any error handling that ftplib or .close() may fail to perform, I doubt that the problem can actually arise.

    As for error checking in close(): I think this is a minor issue for this discussion. Usually, close() will succeed *without sending all data to the remote system. So looking at the close() status will *not tell you whether all data has been delivered. You would have to use the SO_LINGER socket option for that.

    Even with SO_LINGER turned on, it's pretty pointless to check the close status. While we would be able to tell whether the data has arrived at the remote system, we can't know whether the ftp server has actually read them out of the socket, and written them to disk. So we absolutely need the ftp protocol status to determine the outcome of a STOR (say).

    11ecd6cf-9a95-40f1-8f6c-6275193baf6b commented 13 years ago

    Proper behavior for ftplib when sending is to send all desired data, then call "sock.shutdown(socket.SHUT_RDWR)". This indicates that no more data will be sent, and blocks until the receiver has acknowledged all their data.

    "socketmodule.c" handles this right. "shutdown" is called on the socket, and the return value is checked. If the return value is negative, an error handler is returned. Compare the handling in "close".

    FTP send is one of the few situations where this matters, because FTP uses the close of the data connection to indicate EOF.

    giampaolo commented 13 years ago

    Proper behavior for ftplib when sending is to send all desired data, then call "sock.shutdown(socket.SHUT_RDWR)". This indicates that no more data will be sent, and blocks until the receiver has acknowledged all their data.

    I'm not sure about this. Such a requirement should be respected by both peers and AFAICR there's no FTP-related RFC which explicitly states that shutdown() should be used.

    http://www.rfc-archive.org/getrfc.php?rfc=4217 chapter 12.4 should reflect the same use case we're talking about: a client sending a file (STOR). In the example only close() is used, which perhaps should lead this discussion to question whether socket's close() method is implemented properly, as your linux man quote suggests.

    11ecd6cf-9a95-40f1-8f6c-6275193baf6b commented 13 years ago

    Re: "Such a requirement should be respected by both peers and AFAICR there's no FTP-related RFC which explicitly states that shutdown() should be used.".

    That's because the FTP spec (RFC 959) talks about TCP in reference to the TCP specification, not the Berkeley Sockets implementation. "shutdown" is a Berkeley Sockets interface feature, and isn't formally standardized. Which is why it isn't mentioned in RFCs.

    The issue here is that an FTP data connection send is one of the few places where 1) the sender is indicating an EOF by closing the connection, and 2) the sender cares about complete delivery to the receiver. (HTTP servers don't care if the client disconnects before transfer is complete.)

    The TCP spec (the original RFC 793) says "A TCP will reliably deliver all buffers SENT before the connection was CLOSED so a user who expects no data in return need only wait to hear the connection was CLOSED successfully to know that all his data was received at the destination TCP." That's the way it was supposed to work when the FTP spec was written. In other words, "close" is supposed to wait for completion of the TCP close handshake.

    Today, it's common to close connections to abandon them, in which case there's no need to wait for the close handshake to complete. This resulted in a split at the socket level - there's "close", "SO_LINGER" to affect the wait for the close handshake, and an optional timeout. Then there's "shutdown" for when you explicitly care about what's happening during close.

    Python's socket implementation is written with "close" as a "don't care" operation. Since many Python closes happen implicitly, when a reference count goes to 0, this is appropriate; raising an exception during deallocation generally surprises the user. So it's implicit in the Python interface that, if you're using connection close to signal EOF and care about delivery, you should use "shutdown".

    So I'd put a shutdown call in "ftplib" at the end of each data connection send. That will block at the end of the transfer and raise an exception (I think that's the way the socket library is coded) if the close handshake doesn't finish cleanly.

    61337411-43fc-4a9c-b8d5-4060aede66d0 commented 13 years ago

    Proper behavior for ftplib when sending is to send all desired data, then call "sock.shutdown(socket.SHUT_RDWR)". This indicates that no more data will be sent, and blocks until the receiver has acknowledged all their data.

    I think you misunderstand. Calling shutdown does *not* block until the receiver has acknowledged all data. It just put a FIN packet into the send queue.

    FTP send is one of the few situations where this matters, because FTP uses the close of the data connection to indicate EOF.

    Not only. It also uses the server response on the control connection to indicate that all data has been received. Relying on successful shutdown is both insufficient and unnecessary.

    44bd1bb3-c92e-4eb8-8701-1901a18bceda commented 2 years ago

    I would like to help on this issue. I understand the arguments here but it has been a lone time since this was raised and there does not seem to be any further issues discussed or support for this issue.

    gpshead commented 2 years ago

    [pasting in my investigation that I replied with on a mailing list:]

    The possible problem does appear real but it likely not frequently observed and is something most people reading the code wouldn't see as it's rather esoteric:

    https://blog.netherlabs.nl/articles/2009/01/18/the-ultimate-so_linger-page-or-why-is-my-tcp-not-reliable via https://stackoverflow.com/questions/8874021/close-socket-directly-after-send-unsafe describes the scenario.

    So does this apply in Python?

    Apparently. Our socket module close() is simply calling close as expected. https://github.com/python/cpython/blob/main/Modules/socketmodule.c#L3132 -> https://github.com/python/cpython/blob/main/Modules/socketmodule.c#L443

    Our ftplib code that implicitly calls close on the data connection when exiting the with self.transfercmd(cmd, rest) as conn: context manager https://github.com/python/cpython/blob/main/Lib/ftplib.py#L498`

    Triggers a close() without a prior shutdown(socket.SHUT_RDWR) + read() -> EOF or timeout dance.

    In most protocols this isn't a big deal. ftp being so old as to rely solely on the TCP connection itself is the outlier.

    How often does this happen in practice? I don't know. I haven't heard of it happening, but how many people honestly use ftplib to send data between operating systems where a socket close triggering a TCP RST rather than FIN on the sender might be more likely in the past 20 years? I suspect not many.

    The code can be improved to prevent it. But I don't believe it'll be feasible to construct a real world not-mock-filled regression test that would fail without it.

    Potential solution: Do the shutdown+recv EOF dance as the final step inside of the storbinary and storlines with self.transfercmd as conn blocks.

    Technically speaking socket.socket.shutdown() is conditionally compiled but I don't know if any platforms lacking the socket shutdown API exist anymore (my guess is that conditional compilation for shutdown is a leftover from the 1990s). If so, a "if hasattr(conn, 'shutdown'):" conditional before the added logic would suffice.

    Looking at various ftp client source code (netkit ftp & netbsd ftp - both bsd derivatives) as well as a modern gonuts golang one I do not see them explicitly doing the shutdown dance. (I didn't dive in to figure out if it is hidden within their flclose() or conn.Close() implementations as that'd be surprising)

    44bd1bb3-c92e-4eb8-8701-1901a18bceda commented 2 years ago

    Working.. should be able to create pull request soon. Note part of suggestions include using SIOCOUTQ, but this does not have an equivalent for windows. And as Murphy's law goes this is likely to be where the problem is!