Closed sergiusz-n closed 5 months ago
Can you please test with this PR: https://github.com/openssl/openssl/pull/23723
I think it will fix the discrepancy.
was the issue corrected with the patch?
@nhorman thanks for the quick response. the code changes seem to be promising. it will take a couple of days to verify it in practice. I'll get back to you once I am done.
There's also a breaking change between 3.1 and 3.2 with KTLS and nonblocking sockets: SSL_write fails with SSL_ERROR_SSL and errno EAGAIN. Version 3.1 didn't have this behaviour; it looks like it returned SSL_ERROR_WANT_WRITE directly.
Sidenote: The captcha for registering an account with github is insane. I don't want to spend 5 minutes clicking on dart-boards.
was the issue corrected with the patch?
@nhorman
the patch has been tested against the EPIPE failure scenario.
Although SSL_get_error() now returns SSL_ERROR_SYSCALL with the patch applied, a tiny discrepancy remains:
- SSL_write(): the error value retrieved via ERR_peek_error() is equal to 0, and the system error is available in errno only;
- SSL_sendfile(): ERR_peek_error() is non-zero (0x80000020), and ERR_GET_REASON(ERR_peek_error()) == errno (EPIPE = 0x20).
The question is as follows: in order to have common error handling in the application, what should be the reliable source for the error reason: errno or ERR_GET_REASON(...)?
Please advise.
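One possible application-side answer to the question above, as a minimal sketch only: take the reason from the error queue when it holds a system error, and fall back to the saved errno otherwise. The `sketch_*` names are hypothetical, and the two masks are local mirrors of the OpenSSL 3.x system-error layout (a flag in bit 31, the errno value in the low 31 bits); real code would call ERR_peek_error(), ERR_GET_LIB() and ERR_GET_REASON() rather than redefine them.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>

/* Assumption: local mirrors of the OpenSSL 3.x system-error encoding.
 * Real applications should use the macros from <openssl/err.h>. */
#define SKETCH_SYSTEM_FLAG ((unsigned long)INT_MAX + 1UL) /* 0x80000000 */
#define SKETCH_SYSTEM_MASK ((unsigned long)INT_MAX)       /* 0x7fffffff */

/* Non-zero when the packed error code carries a system (errno) error. */
int sketch_is_system_error(unsigned long code)
{
    return (code & SKETCH_SYSTEM_FLAG) != 0;
}

/*
 * peeked:      value the caller obtained from ERR_peek_error()
 * saved_errno: errno captured immediately after the failed I/O call
 *
 * Returns the system error reason from whichever source has it.
 */
int sketch_system_reason(unsigned long peeked, int saved_errno)
{
    assert(saved_errno >= 0); /* errno values are non-negative */

    if (peeked != 0 && sketch_is_system_error(peeked))
        return (int)(peeked & SKETCH_SYSTEM_MASK);
    return saved_errno; /* queue empty or not a system error */
}
```

With this helper, the SSL_sendfile() case (0x80000020 on the queue) and the SSL_write() case (empty queue, errno set) both yield 0x20, so the caller can treat them uniformly. Whether the fallback to errno is appropriate for non-system queue entries is a design choice left to the application.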
I have just grep'd throughout the code and found that the pattern:
ERR_raise_data(ERR_LIB_SYS, get_last_socket_error(),
"calling some_socket_system_function()");
is widely used in crypto/bio/*.c files for most socket operations such as ioctl(), getsockopt(), accept(), connect() etc., but never for write/read.
Was this done intentionally, or was it an omission?
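To make the pattern in question concrete, here is a minimal sketch of how a failed socket write could be classified as retriable or fatal, with the fatal case recorded the way ERR_raise_data(ERR_LIB_SYS, ...) records it for other socket calls. Everything here is hypothetical: the `sketch_*` names, the single-slot stand-in for the error queue, and the reduced set of retriable errno values are assumptions, not OpenSSL's implementation.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical outcome of a socket write attempt. */
enum sketch_write_outcome {
    SK_WRITE_OK,
    SK_WRITE_RETRY, /* non-fatal errno: caller should retry later */
    SK_WRITE_FATAL  /* fatal errno: system error has been recorded */
};

/* Hypothetical stand-in for the error queue: last recorded system error. */
static int sketch_last_raised = 0;

int sketch_last_raised_error(void)
{
    return sketch_last_raised;
}

/* Assumed, reduced set of retriable errno values. */
int sketch_is_retryable(int err)
{
    return err == EAGAIN || err == EWOULDBLOCK || err == EINTR;
}

/* ret: result of write()/send(); saved_errno: errno captured right after it. */
enum sketch_write_outcome sketch_after_write(long ret, int saved_errno)
{
    if (ret >= 0)
        return SK_WRITE_OK;

    assert(saved_errno != 0); /* a failed call is expected to set errno */

    if (sketch_is_retryable(saved_errno))
        return SK_WRITE_RETRY;

    /* In OpenSSL this is where ERR_raise_data(ERR_LIB_SYS, ...) would go. */
    sketch_last_raised = saved_errno;
    return SK_WRITE_FATAL;
}
```

Under this scheme an EPIPE during a write would leave a system error behind for the caller to inspect, while EAGAIN would not.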
That's odd. I made no changes to SSL_write, but you noted in your initial post that on an SSL_write error SSL_get_error returns SSL_ERROR_SYSCALL, which is determined based on a call to ERR_peek_error(), so it's not making much sense to me that it's now not returning an error at all.
@nhorman let me rephrase it and try to put it clearly to avoid possible confusion
Considering the same broken pipe failure scenario, the behaviour is as follows.
openssl-3.2.1 baseline:
- SSL_write(): SSL_get_error() == SSL_ERROR_SYSCALL, ERR_peek_error() == 0, and errno == EPIPE;
- SSL_sendfile(): SSL_get_error() == SSL_ERROR_SSL, ERR_GET_REASON(ERR_peek_error()) == EPIPE, and errno == EPIPE;
openssl-3.2.1 with patch applied:
- SSL_write(): SSL_get_error() == SSL_ERROR_SYSCALL, ERR_peek_error() == 0, and errno == EPIPE;
- SSL_sendfile() returns < 0, SSL_get_error() == SSL_ERROR_SYSCALL, ERR_GET_REASON(ERR_peek_error()) == EPIPE, and errno == EPIPE.
Thus, the concern is that in the case of SSL_write() the error is not set and ERR_peek_error() returns 0, so the reason can be retrieved from errno only. When it comes to SSL_sendfile(), regardless of what SSL_get_error() returns, the error is set, and the reason is available via ERR_GET_REASON(ERR_peek_error()) and matches the errno value.
If an error occurs during ioctl(), getsockopt(), accept(), connect() system calls, the reason is set to errno, which is not the case for SSL_read()/SSL_write(), where ERR_peek_error() is equal to zero.
I would expect SSL_sendfile() to resemble SSL_write() behaviour in that regard, which is why I reported that a "tiny discrepancy remained". I hope it is now stated clearly.
OK, thank you for the clarification. So ERR_peek_error in the SSL_write case returns 0 for you in both cases, regardless of the patch status. That makes more sense.
Though, it's still confusing. SSL_get_error calls ossl_ssl_get_error to determine the error return code, and SSL_ERROR_SYSCALL is returned if and only if the following conditions are met: 1) the passed-in check_err parameter is 1 (which it is); 2) ERR_peek_error does not return zero; 3) the ERR_GET_LIB value of the value returned from ERR_peek_error is ERR_LIB_SYS.
That suggests to me that the error stack is being cleared between the time you call SSL_get_error and the time you call ERR_peek_error.
do you happen to have a small reproducer I can use to try this here?
@nhorman
SSL_ERROR_SYSCALL is returned if and only if the following conditions are met:
- the passed in check_err parameter is 1 (which it is)
- ERR_peek_error does not return zero
- The ERR_GET_LIB value of the value returned from ERR_peek_error is ERR_LIB_SYS
This does not seem to be a true statement: there are other possibilities for ossl_ssl_get_error() to return SSL_ERROR_SYSCALL. What I believe is happening is that when SSL_write() encounters the EPIPE error, ERR_raise_data() is not invoked, ERR_peek_error() is therefore zero, and SSL_ERROR_SYSCALL comes out of the very last return of this function: ssl_lib.c:4703.
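The two routes to SSL_ERROR_SYSCALL discussed here can be modelled as follows. This is a simplified sketch of the decision order as described in this thread, not the actual ossl_ssl_get_error() source: the enum, the function name and the flattened parameters are all hypothetical, and several real branches (shutdown handling, other WANT_* states) are left out.

```c
#include <assert.h>

/* Hypothetical, reduced set of result codes. */
enum sketch_ssl_error {
    SK_ERROR_NONE,
    SK_ERROR_SSL,
    SK_ERROR_WANT_READ,
    SK_ERROR_WANT_WRITE,
    SK_ERROR_SYSCALL
};

/*
 * ret:              return value of the I/O call
 * queue_nonempty:   stands for ERR_peek_error() != 0
 * queue_lib_is_sys: stands for ERR_GET_LIB(ERR_peek_error()) == ERR_LIB_SYS
 * want_read/write:  stand for the connection's retry state
 */
enum sketch_ssl_error sketch_get_error(int ret, int queue_nonempty,
                                       int queue_lib_is_sys,
                                       int want_read, int want_write)
{
    /* The library field is only meaningful when the queue has an entry. */
    assert(!queue_lib_is_sys || queue_nonempty);

    if (ret > 0)
        return SK_ERROR_NONE;

    /* Route 1: an error is queued; its library decides the result. */
    if (queue_nonempty)
        return queue_lib_is_sys ? SK_ERROR_SYSCALL : SK_ERROR_SSL;

    if (want_read)
        return SK_ERROR_WANT_READ;
    if (want_write)
        return SK_ERROR_WANT_WRITE;

    /* Route 2: nothing queued and no retry pending; final fall-through. */
    return SK_ERROR_SYSCALL;
}
```

In this model the SSL_write() EPIPE case takes route 2 (empty queue), while a patched SSL_sendfile() takes route 1, which would explain why both report SSL_ERROR_SYSCALL but only one leaves a queue entry.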
As I mentioned, when a negative return value is handled during a socket write operation, only a check against non-fatal errno values is made, and no error is raised.
Please refer to the int sock_write(BIO *b, const char *in, int inl) implementation.
You may compare this with how a socket connect() returning -1 is handled here (i.e. the error is raised unless errno is non-fatal and retriable).
@nhorman
I have come up with an artificial failure snippet based on the sslecho client/server sample from the demos folder in the repo.
Please find it here.
Thank you, I see what's happening now. Let me see what I can do here.
updated the PR to take care of this
After upgrading from OpenSSL 1.0.x in an application implementing an HTTPS server, we attempted to substitute SSL_write() calls with SSL_sendfile(). Skipping the details, the transition went smoothly; however, some discrepancy was noticed in a few failure scenarios. For instance, in the scenario where the peer drops the TCP connection during a write operation:
- SSL_write() ends up with SSL_get_error() == SSL_ERROR_SYSCALL and errno == EPIPE;
- SSL_sendfile() ends up with SSL_get_error() == SSL_ERROR_SSL and errno == EPIPE.
Would it be better for SSL_sendfile to raise the SSL_ERROR_SYSCALL error instead, considering that such failures mostly come out of the sendfile() system call? It was confusing (to say the least) to see error:0A000114:SSL routines::uninitialized for a session where previous writes were successful. The error handling is quite poor here in ssl_lib.c: only EAGAIN/EBUSY/EINTR seem to be differentiated. Please advise.
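For readers who want to see the underlying system-level failure without TLS involved, the following sketch reproduces an EPIPE on a plain stream socket whose peer has gone away. It is an assumption-laden stand-in for the scenario above: it uses a local AF_UNIX socketpair rather than a TCP connection, relies on MSG_NOSIGNAL (available on Linux) to suppress SIGPIPE, and the function name is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Sends one byte to a peer that has already closed its end.
 * Returns the errno observed on failure, 0 if the send unexpectedly
 * succeeded, or -1 if the socket pair could not be created.
 */
int sketch_errno_after_peer_close(void)
{
    int fds[2];
    ssize_t n;
    int saved;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0)
        return -1;
    assert(fds[0] >= 0 && fds[1] >= 0);

    close(fds[1]); /* the "peer" drops the connection */

    /* MSG_NOSIGNAL: report the failure via errno instead of SIGPIPE. */
    n = send(fds[0], "x", 1, MSG_NOSIGNAL);
    saved = (n < 0) ? errno : 0;

    close(fds[0]);
    return saved;
}
```

On Linux this is expected to return EPIPE, the same errno value both SSL_write() and SSL_sendfile() surface in the reports above.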