Actually, the exception sounds to me as if the header went in fine, but the content was missing. Maybe this means that the data (which is an in-memory file-like object for simple_upload) needs a seek(0) before retrying. I can see this happening if, in the first call, all the data was sent but the request failed to complete after that (as opposed to an error setting up the connection, which is probably more common).
--- a/gcsfs/core.py
+++ b/gcsfs/core.py
@@ -421,6 +421,8 @@ class GCSFileSystem(asyn.AsyncFileSystem):
         self, method, path, *args, headers=None, json=None, data=None, **kwargs
     ):
         await self._set_session()
+        if hasattr(data, "seek"):
+            data.seek(0)
         async with self.session.request(
             method=method,
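For what it's worth, a minimal illustration (plain Python, independent of gcsfs) of the failure mode described above: once an in-memory buffer has been read to the end, re-reading it yields nothing until the position is rewound.

import io

buf = io.BytesIO(b"payload bytes")
first = buf.read()   # b"payload bytes" - what the first attempt sends
second = buf.read()  # b"" - a retry that re-reads the exhausted buffer sends an empty body
buf.seek(0)          # rewind to the start
third = buf.read()   # b"payload bytes" again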
Welp, now I feel stupid for misinterpreting the error message 🤦 Thanks for the reply @martindurant, and the suggested code change!
I'm a bit at the edge of my knowledge and understanding here, so I apologize if this doesn't make sense, but I am left wondering: what would cause a request to fail if the data was all sent, and thus the request has effectively finished? Perhaps there is some other root cause that should be fixed to prevent this situation from occurring in the first place? (Sadly I don't have any debug logs from the gcsfs library itself when I encountered these errors, and I have been unable to reproduce them on demand ☹️) Furthermore, could we be introducing a problem by resending the data if it was already fully received (and stored)?
Lastly, I also noticed that simple_upload wraps the data into an UnclosableBytesIO instance:
import io

class UnclosableBytesIO(io.BytesIO):
    """Prevent closing BytesIO to avoid errors during retries."""

    def close(self):
        """Reset stream position for next retry."""
        self.seek(0)
Which seems to suggest to me that seek(0) should already be called on the data when a retry occurs. So either close() is not called when this is assumed to happen, or the seek(0) is not the solution for this issue (or, of course, I am missing something else here).
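To make that concrete, here is a tiny sketch (my own illustration, assuming the retry simply re-reads the same object) of what the overridden close() is meant to provide, using the UnclosableBytesIO class quoted above:

buf = UnclosableBytesIO(b"payload")  # class quoted above
first = buf.read()    # b"payload" - consumed by the first attempt
buf.close()           # does not actually close; rewinds to position 0 instead
second = buf.read()   # b"payload" again - but only if close() really gets called before the retry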
I cannot say why the situation occurs, but it doesn't surprise me that something can happen even after the data is sent, but before a success response comes back. Without the response, we can assume that the data isn't stored.
The unclosable thing was created for the case where the initial connection fails. The asyncio request function closes the input file-like anyway in this case, but we still want to read from it in the retry. seek(0) seems like a reasonable thing to do in any case. Having to pass a file-like in the first place is strange (to me): apparently it makes for a more responsive event loop, as asyncio can send the data in chunks.
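As a rough sketch of that pattern (my own illustration, not gcsfs's actual call path; the URL is hypothetical): aiohttp accepts a file-like object as the request body and streams it in chunks, which is presumably the motivation for wrapping the bytes in a BytesIO in the first place.

import asyncio
import io

import aiohttp

async def upload(url: str, payload: bytes) -> int:
    # Passing a file-like object (rather than the raw bytes) lets aiohttp
    # stream the body in chunks, keeping the event loop responsive.
    body = io.BytesIO(payload)
    async with aiohttp.ClientSession() as session:
        async with session.post(url, data=body) as resp:
            return resp.status

# asyncio.run(upload("https://example.com/upload", b"some bytes"))  # hypothetical endpoint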
--- a/gcsfs/core.py
+++ b/gcsfs/core.py
@@ -421,6 +421,8 @@ class GCSFileSystem(asyn.AsyncFileSystem):
         self, method, path, *args, headers=None, json=None, data=None, **kwargs
     ):
         await self._set_session()
+        if hasattr(data, "seek"):
+            data.seek(0)
         async with self.session.request(
             method=method,
If this is indeed all that's required to fix this issue, do you want to make a PR for this @martindurant , or would you prefer if I tried to do so?
Please do make a PR
NOTE: This issue is eerily similar to #290, which was fixed at the time by @nbren12 in #380. I have the creeping suspicion that the same issue might have arisen again, though I have been unable to suss it out just yet.

Problem description
Every now and then, a request stemming from simple_upload() will fail on its first attempt, for some retriable reason, but then the retry will receive a non-retriable HTTP 400 error: Invalid multipart request with 0 mime parts.

EDIT: It seems I have misinterpreted the error message, and the issue is instead that the data stream is empty on the retry. I will keep the remainder of the original post as-is for historical accuracy.

As the error message suggests, it appears that the Content-Type header (as required by the docs) is missing (or empty?) on the retry, though it was evidently present on the initial attempt.

Investigation
In my use case, the call originates from using the xarray.Dataset.to_zarr() function from xarray (which, interestingly, seems to be the use case for more people reporting similar issues; I suspect because the hard failure on the retry causes the upload to fail halfway through, leading to a corrupted Zarr store; but I digress). I have included the Traceback below, starting from the call to to_zarr, for the sake of completeness.

The more relevant bit starts when we enter fsspec code with a call to fsspec.mapping.setitems(), which takes a dict of {path_as_str: data_as_bytes} as its only argument. This means it's a fairly basic call without potentially weird kwargs bleeding through and impacting the behaviour of fsspec/gcsfs, so we can pretty much disregard anything related to xarray/zarr as a root cause.
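For reference, a stripped-down sketch of the kind of call that ends up being made (my own example, shown against fsspec's in-memory filesystem so it runs without credentials; with gcsfs installed, the URL could instead be a gs://bucket/path, bucket name hypothetical):

import fsspec

# FSMap.setitems takes a plain dict of {key: bytes} and writes each value
# under the mapper's root path.
mapper = fsspec.get_mapper("memory://demo-store")
mapper.setitems({
    "group/.zarray": b"{}",      # placeholder metadata bytes
    "group/0.0": b"\x00" * 16,   # placeholder chunk bytes
})
print(mapper["group/0.0"])       # read back through the same mapper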
The most relevant code starts when we enter gcsfs with a call to _pipe_file() with only a path and a data parameter, meaning the rest of the arguments are the defaults (most notably content_type="application/octet-stream"). This leads to a call to simple_upload(), where the core of the request is created. The content_type parameter is used to generate a part of the body of the POST request, while the actual header of the call is set to headers={"Content-Type": 'multipart/related; boundary="==0=="'}.
This leads to a call of _request() (via _call), where we see the first attempt fail for some unknown but retriable reason, and the retry_request decorator from gcsfs/retry.py supposedly performs the exact same _request call again; however, this time it fails with the non-retriable gcsfs.retry.HttpError: Invalid multipart request with 0 mime parts., 400, suggesting that the retried _request did not have the same headers. However, I have failed to spot where this change might occur.

Traceback:
Environment