Closed: vanguard737 closed this issue 1 year ago
Hmm, interesting. I did some light triage because this caught my eye.
BLUF: I believe the issue is that pypicloud, via smart_open's `transport_params`, is trying to set GCS `Blob` properties that do not have setters.
```python
# from https://github.com/stevearc/pypicloud/blob/master/pypicloud/storage/gcs.py starting at line 207
with _open(
    self.get_uri(package),
    "wb",
    compression="disable",
    transport_params={
        "client": self.bucket.client,
        "blob_properties": {
            "metadata": metadata,
            "acl": self.object_acl,
            "storage_class": self.storage_class,
        },
    },
) as fp:
    for chunk in stream_file(datastream):
        fp.write(chunk)  # multipart upload
```
After reviewing the following, the problem appears to be straightforward: pypicloud is passing `"acl"`, `"metadata"`, and `"storage_class"`, and these are (ultimately) passed to `setattr`:
```python
if blob_properties:
    for k, v in blob_properties.items():
        setattr(self._blob, k, v)
```
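To illustrate the failure mode, here is a minimal toy sketch (`FakeBlob` is a stand-in I made up, not the real `google.cloud.storage.Blob`) showing why a `setattr` loop like the one above raises when a property has no setter:

```python
# Toy sketch, not the real google-cloud-storage Blob: `acl` stands in for a
# read-only property (defined with no setter), while `storage_class` stands
# in for a settable one.
class FakeBlob:
    def __init__(self):
        self._storage_class = None

    @property
    def storage_class(self):
        return self._storage_class

    @storage_class.setter
    def storage_class(self, value):  # has a setter: setattr succeeds
        self._storage_class = value

    @property
    def acl(self):  # no setter defined: setattr raises AttributeError
        return "acl-object"


blob = FakeBlob()
blob_properties = {"storage_class": "COLDLINE", "acl": "project-private"}
errors = {}
for k, v in blob_properties.items():
    try:
        setattr(blob, k, v)  # same pattern as the loop quoted above
    except AttributeError as exc:
        errors[k] = type(exc).__name__

print(errors)  # {'acl': 'AttributeError'}
```

Only the property with a defined setter survives the loop; the read-only one blows up with `AttributeError`.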
Reviewing the docs for GCS at https://cloud.google.com/python/docs/reference/storage/latest/google.cloud.storage.blob.Blob#google_cloud_storage_blob_Blob_storage_class, what jumps out to me:
I'm just a passer-by whose attention was caught; I don't have the bandwidth or a use case (I don't use the GCS backend at this time) to implement the changes. From my very brief perusal of the SDK docs, though, it seems the needed updates would be relatively minor (in fact, I suspect the correct usage is what was in v1.3.8, pre-smart_open, but I haven't checked).
OK, I was interested to see what 1.3.8 had, and yep, it's 90% of the way there:
```python
def upload(self, package, datastream):
    """Upload the package to GCS"""
    metadata = {"name": package.name, "version": package.version}
    metadata.update(package.get_metadata())
    blob = self._get_gcs_blob(package)
    blob.metadata = metadata
    blob.upload_from_file(datastream, predefined_acl=self.object_acl)
    if self.storage_class is not None:
        blob.update_storage_class(self.storage_class)
```
So, recommended changes:
P.S. I did some more looking at the ACL, and it appears that using the argument name `predefined_acl` instead of `acl` might be all that's needed?
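As a side note, one could sanity-check up front which attributes would even accept `setattr`. A hedged sketch using Python's property introspection (`BlobLike` and `has_setter` are hypothetical names I'm introducing, not part of the real SDK):

```python
# Hedged sketch: introspect whether a class attribute is a property with a
# setter before trying setattr.  BlobLike is a hypothetical stand-in for
# google.cloud.storage.Blob, not the real class.
class BlobLike:
    metadata = None  # plain attribute: setattr always works

    @property
    def acl(self):  # property without a setter
        return "acl-object"


def has_setter(cls, name):
    attr = getattr(cls, name, None)
    if isinstance(attr, property):
        return attr.fset is not None  # settable only if fset is defined
    return True  # not a property: plain setattr will work


print(has_setter(BlobLike, "acl"))       # False
print(has_setter(BlobLike, "metadata"))  # True
```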
Thanks @nivintw for the investigation! The root cause you mentioned sounds plausible. It'll unfortunately be a while before I can try to fix this. I'm on the road for the next couple weeks, but I'll get to it when I can.
cc @ddelange
Thanks for the cc, and sorry that this slipped through. I think this needs upstream PRs. The SDK options for setting these properties each have a catch:

- `setattr(bucket, "storage_class", storage_class)` before uploading (docs; this behaviour is local only, ref)
- `blob.update_storage_class`, which will create a new blob and copy over the contents
- `blob.create_resumable_upload_session`, which (as the only function of the bunch!) hard-codes `predefined_acl=None`
- `blob.upload_from_file(datastream, predefined_acl=self.object_acl)` won't work, because for that one to perform a multipart upload, the `size` kwarg needs to be not None (and we don't know the total size of `datastream` at this point)

Thanks @ddelange for jumping on this so quickly!
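On the unknown-`size` point above: one hedged workaround sketch (not what pypicloud or smart_open actually does) is to spool the stream into a temporary file first, so the total size is known before upload. `spool_stream` and `datastream` here are hypothetical names for illustration:

```python
# Hedged workaround sketch: copy a stream of unknown length into a spooled
# temp file, so an upload API that requires a known total size up front
# (e.g. a size= kwarg) can be satisfied.
import io
import tempfile


def spool_stream(datastream, chunk_size=64 * 1024):
    """Return (file_object, total_size) for a stream of unknown length."""
    # Small payloads stay in memory; large ones spill to disk.
    tmp = tempfile.SpooledTemporaryFile(max_size=8 * 1024 * 1024)
    total = 0
    while True:
        chunk = datastream.read(chunk_size)
        if not chunk:
            break
        tmp.write(chunk)
        total += len(chunk)
    tmp.seek(0)  # rewind so the caller can read from the start
    return tmp, total


src = io.BytesIO(b"x" * 100_000)  # stand-in for the incoming datastream
fp, size = spool_stream(src)
print(size)  # 100000
```

The trade-off is buffering the whole package locally before the upload starts, which may or may not be acceptable for large packages.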
Hi 👋

Quick update: we're looking at a major version bump of smart_open, from a PR opened 2 days after mine: https://github.com/RaRe-Technologies/smart_open/pull/729

As it changes the internals significantly, my original gcs PR became unnecessary for this issue. Also, instead of updating our mocks to account for the new smart_open internals, I switched to fake-gcs-server, analogous to switching to azurite in #304.

And that is working like a charm (we can now actually assert the ACL being returned on an uploaded blob, instead of mocking it, which is what let this issue slip through in the first place). I've tested the fix here using the unreleased smart_open, and CI is green :)

So now we wait for smart_open to release 7.0.0! 🎉
Hi,

I'm running `pypicloud` within Google App Engine, using `uWSGI` and with GCS as the storage backend. On `pypicloud` 1.3.8 and earlier, this works fine. However, after upgrading to `pypicloud` 1.3.9 or later, I can no longer upload packages (either via `poetry publish --repository <my_repo_name>` or via the web UI). I observed the errors below in `/var/log/pypicloud.log`.

Given that this behavior manifested in 1.3.9, and that the errors are coming up from `smart_open`, PR #304 seems possibly relevant. In the meantime, my workaround was just to roll back to 1.3.8.

Thanks, and let me know if you need any further debugging info - C.J.