googleapis / google-cloud-node

Google Cloud Client Library for Node.js
https://cloud.google.com/nodejs
Apache License 2.0

Lingering buckets #968

Closed callmehiphop closed 8 years ago

callmehiphop commented 8 years ago

Occasionally I'll create a bucket for testing purposes, and when I go to delete it I get an error stating that the bucket can't be deleted because it contains files. If I attempt to list the files within it, there aren't any. I've also seen this occur without modifying any of the ACLs.

I'm pretty sure this is an upstream error; however, I'm opening this issue to verify that it isn't our client and to raise visibility.

dhermes commented 8 years ago

I believe it's because the Cloud Storage API is eventually consistent.

stephenplusplus commented 8 years ago

Related: https://github.com/GoogleCloudPlatform/gcloud-node/pull/965#discussion_r45912325

callmehiphop commented 8 years ago

@dhermes would you mind elaborating on that?

dhermes commented 8 years ago

Let's say objects A, B and C are in a bucket.

If you delete A you'd expect that getting the list of objects in the bucket would return B and C. However, if you make the requests in rapid succession, the list of objects may be all of A, B and C.

If you wait long enough (i.e. eventually, i.e. eventual consistency), you'll get the "correct" answer, which is just B and C.
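The A, B, C scenario above can be sketched with a toy in-memory model. This is only an illustration under stated assumptions: `LaggyBucket`, its `lagTicks` parameter, and the logical clock are hypothetical and are not part of Cloud Storage or any client library. The deletion takes effect right away, but the listing index catches up a couple of ticks later.

```javascript
// Toy model of an eventually consistent listing (hypothetical, not the GCS API).
class LaggyBucket {
  constructor(names, lagTicks) {
    this.objects = new Set(names); // authoritative state
    this.index = new Set(names);   // what list() reads from
    this.pending = [];             // deletions not yet applied to the index
    this.lagTicks = lagTicks;
    this.now = 0;                  // logical clock
  }

  // The object is gone immediately, but the index learns about it
  // only after `lagTicks` ticks of the logical clock.
  delete(name) {
    this.objects.delete(name);
    this.pending.push({ name, visibleAt: this.now + this.lagTicks });
  }

  // Advance the clock and apply every deletion that is now due.
  tick(n = 1) {
    this.now += n;
    this.pending = this.pending.filter(({ name, visibleAt }) => {
      if (visibleAt > this.now) return true;
      this.index.delete(name);
      return false;
    });
  }

  list() {
    return [...this.index].sort();
  }
}

const laggy = new LaggyBucket(['A', 'B', 'C'], 2);
laggy.delete('A');
const staleListing = laggy.list(); // read immediately after the delete
laggy.tick(2);
const freshListing = laggy.list(); // read after the index has caught up
console.log(staleListing, freshListing);
```

The first listing still contains A; the second contains only B and C.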

callmehiphop commented 8 years ago

Gotcha, that would explain it. How long does it generally take for everything to become eventually consistent? I've run into a couple of scenarios where I wasn't able to delete buckets for several days.

dhermes commented 8 years ago

I'm not a GCS expert, just observing things we've run into with gcloud-python system tests with flaky failures.

Several days is way outside the scope of what I was talking about. I meant on the order of seconds.

stephenplusplus commented 8 years ago

If something goes wrong during our system tests, we might end up with lingering buckets or files. The failure isn't limited to deleting them immediately (for example, in the same process our tests run in); the delete is also denied later from the Dev Console UI. The only way I've found around it is with gsutil: https://github.com/GoogleCloudPlatform/gcloud-node/pull/965#discussion_r45912979

I'm unsure how the CLI tool makes a request that the API honors that I can't replicate as a user in the Dev Console or a service account with our client library. It would be great to get some insight from the Storage team.

// @jgeewax

mziccard commented 8 years ago

Quoting from the docs:

DELETE Bucket operations can also be affected by the eventual consistency of list operations. DELETE Bucket operations are affected only when you delete all of the objects in a bucket and then immediately try to delete the bucket. In this case, the list of objects in the bucket might not immediately reflect the fact that the objects have been deleted and so the delete bucket operation fails. Delete operations on buckets that are already empty are strongly consistent: that is, if you delete an empty bucket and get a success response, any subsequent attempt to access the bucket will fail.

In the gcloud-java integration tests we put the bucket delete inside a loop that first lists and deletes the files, then tries to delete the bucket; if that fails, we loop again. We set a timeout to end the loop, so this is not guaranteed to work 100% of the time, but it should reduce the number of errors.
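A minimal sketch of that list, delete, retry loop follows. It is not the gcloud-java code. The `bucket` argument is a hypothetical interface (`listFiles`, `deleteFile`, `deleteBucket`, each returning a promise) rather than the gcloud-node API, and `timeoutMs` and `delayMs` are made-up option names.

```javascript
// Hypothetical bucket interface (an assumption, not a real client API):
//   listFiles(): Promise<string[]>
//   deleteFile(name): Promise<void>
//   deleteBucket(): Promise<void>, rejecting while the bucket looks non-empty
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function deleteBucketWithRetry(bucket, { timeoutMs = 30000, delayMs = 1000 } = {}) {
  const deadline = Date.now() + timeoutMs;
  let lastError;
  do {
    // Delete whatever the (possibly stale) listing currently shows.
    const names = await bucket.listFiles();
    await Promise.all(names.map((name) => bucket.deleteFile(name)));
    try {
      await bucket.deleteBucket();
      return true;
    } catch (err) {
      lastError = err; // most likely "bucket not empty"; wait and try again
    }
    await sleep(delayMs);
  } while (Date.now() < deadline);
  // The timeout ends the loop, so success is not guaranteed.
  throw lastError;
}
```

A real implementation would probably also need to tolerate "not found" errors from `deleteFile`, since a stale listing can return names that are already gone.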

stephenplusplus commented 8 years ago

Thanks for that! I stumbled on that blurb while researching this issue and put it into effect just a couple minutes ago.

jgeewax commented 8 years ago

/cc @rdayal. Apparently gsutil can do things that the UI can't? That's weird, right?

stephenplusplus commented 8 years ago

Here is a recreation of the failed delete process through the UI:

[Four screenshots of the failed delete attempt in the UI, taken 2015-12-01 between 8:40:46 am and 8:41:16 am]

And the bucket's metadata:

{
  kind: 'storage#bucket',
  id: 'gcloud-test-bucket-temp-02b799c0-9303-11e5-ae1a-fdbce319d2fc',
  selfLink: 'https://www.googleapis.com/storage/v1/b/gcloud-test-bucket-temp-02b799c0-9303-11e5-ae1a-fdbce319d2fc',
  projectNumber: '1046198160504',
  name: 'gcloud-test-bucket-temp-02b799c0-9303-11e5-ae1a-fdbce319d2fc',
  timeCreated: '2015-11-24T23:35:49.288Z',
  updated: '2015-11-24T23:35:49.288Z',
  metageneration: '1',
  owner: { entity: 'project-owners-1046198160504' },
  location: 'US',
  versioning: { enabled: true },
  storageClass: 'STANDARD',
  etag: 'CAE='
}

stephenplusplus commented 8 years ago

@rdayal is there more data we can provide that would make this easier to test? Have you heard of this before?

jgeewax commented 8 years ago

/cc @Capstan: Nathan, any idea what's going on here? Seems that the gsutil tool can do some things the UI won't do?

rdayal commented 8 years ago

I haven't heard of this before, but I'm not an expert on the storage API. I do know that gsutil can do many things that the UI cannot. I've added Travis and Dan to comment.

@thobrla @dlorenc

Capstan commented 8 years ago

@jgeewax Yes. Eventual consistency will cause issues like this; there are long-tail issues with index consistency that we are actively pursuing. That said, it shouldn't be forever. Are you also deleting all the archived objects in this versioned bucket?

Capstan commented 8 years ago

And, no, apparently the storage browser doesn't handle versioned objects. :(

stephenplusplus commented 8 years ago

Thanks for taking a look! Hidden versioned objects never even occurred to me; that makes a lot of sense. I've put together a PR to delete all of the past versions of the files from each bucket we create before attempting to delete the bucket: #1147
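Why archived versions keep a bucket "non-empty" can be shown with a toy model. This is an illustrative sketch only: `VersionedBucket` and `emptyAndDelete` are hypothetical names and do not reflect the code in the PR or the gcloud-node API. The model assumes that, with versioning enabled, deleting an object without naming a generation only archives it, and that a default listing hides archived versions.

```javascript
// Toy model of a bucket with versioning enabled (hypothetical, not a client API).
class VersionedBucket {
  constructor() {
    this.versions = []; // entries of { name, generation, live }
    this.nextGeneration = 1;
    this.deleted = false;
  }

  upload(name) {
    // A new upload archives the current live version of the same name.
    for (const v of this.versions) if (v.name === name) v.live = false;
    this.versions.push({ name, generation: this.nextGeneration++, live: true });
  }

  // Without a generation, the live version is only archived.
  // With a generation, that specific version is removed for good.
  deleteObject(name, generation) {
    if (generation === undefined) {
      for (const v of this.versions) if (v.name === name) v.live = false;
    } else {
      this.versions = this.versions.filter(
        (v) => !(v.name === name && v.generation === generation)
      );
    }
  }

  // The default listing hides archived versions; { versions: true } shows all.
  list({ versions = false } = {}) {
    return this.versions
      .filter((v) => versions || v.live)
      .map(({ name, generation }) => ({ name, generation }));
  }

  deleteBucket() {
    if (this.versions.length > 0) throw new Error('bucket is not empty');
    this.deleted = true;
  }
}

// Remove every generation, archived ones included, then delete the bucket.
function emptyAndDelete(bucket) {
  for (const { name, generation } of bucket.list({ versions: true })) {
    bucket.deleteObject(name, generation);
  }
  bucket.deleteBucket();
}

const versioned = new VersionedBucket();
versioned.upload('report.txt');
versioned.upload('report.txt');       // first generation becomes archived
versioned.deleteObject('report.txt'); // live version becomes archived too

const visibleFiles = versioned.list();                       // looks empty
const hiddenCount = versioned.list({ versions: true }).length;
let firstDeleteFailed = false;
try {
  versioned.deleteBucket();
} catch (err) {
  firstDeleteFailed = true; // archived versions still count as contents
}
emptyAndDelete(versioned);
console.log(visibleFiles, hiddenCount, firstDeleteFailed, versioned.deleted);
```

In this model the plain listing is empty while two archived generations remain, which matches the symptom described at the top of the thread: the bucket reports that it contains files, yet listing shows none.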