Closed mumbleskates closed 3 years ago
Thanks for pointing this out... (after pondering for a bit) I think I agree it should handle the case of the server not returning the right amount of bytes.
Small thing: I'm not sure about the check on type(data) is not bytes
... a well behaving HTTP client shouldn't return undefined as the body, e.g. if there are no bytes in the body, it would be the empty byte string b'', rather than undefined (looking at https://www.python-httpx.org/api/#response it does seem to suggest content
is always bytes
)?
Also, I think I'm fairly pro sticking with the lambda as-is, pro not using another exception, and also pro being quite explicit about when the error codes are returned. So something like:
try:
data = get_range(i_ofst, i_ofst + i_amt - 1)
except Exception:
return SQLITE_IOERR_SHORT_READ
if len(data) != amt:
return SQLITE_IOERR_SHORT_READ
memmove(p_out, data, i_amt)
return SQLITE_OK
I'm not sure what kinds of HTTP client you anticipate being used here. If the contract is that the content
attribute can only ever contain a bytes object, never an interpretation of the response's encoding, then that's probably fine. It seems sensible and is very inexpensive to at least check.
As for the lambda, get_range
is the only lambda in the entire VFS. Additionally seems to stand to reason that its responsibility is to actually fetch the byte range from the remote web service. If it gets a different number of bytes, clearly something went wrong, just as much as it has if it returns a 403 or a 5XX error. Leaking concerns about whether or not get_range
actually worked or whether it has produced a totally different result because of remote behavior you need to think about seems bad. Wrong-sized data is an exceptional case, so just raise an exception.
So anyway, feel free to take whatever option you wanted with most of this but the library is not safe to use as-is.
Understood. Will merge in + make changes
Thanks! Looks like it's undergone significant changes now anyway
do not trust the s3 service to return the exact right amount of data. we do not even require that the url have a secure scheme, so we definitely should not be implicitly trusting that the correct amount of data came back from the remote service before copying the presumed amount of memory.