googleapis / google-cloud-ruby

Google Cloud Client Library for Ruby
https://googleapis.github.io/google-cloud-ruby/
Apache License 2.0
1.36k stars 550 forks source link

Expose objects.rewrite in File #1196

Closed quartzmo closed 7 years ago

quartzmo commented 7 years ago

As a follow-up to #1195, which adds File#rotate based on objects.rewrite, I would like to expose the full power of objects.rewrite in File. It may make the most sense to replace the implementation of File#copy with something that uses objects.rewrite instead of objects.copy, since on the server objects.copy seems to just be a single-request version of objects.rewrite under the hood. If so, File#copy could be expanded to accept a block to monitor and even control the multiple-request behavior used now in File#rotate. It could also accept options for rotating keys, just like File#rotate.

Comments/discussion welcome.

blowmage commented 7 years ago

I agree that rewrite looks to be a better copy. From a Ruby API perspective, I think it makes a lot of sense to keep the existing API of File#copy and add the new parameters and swap out the backend implementation.

danoscarmike commented 7 years ago

Ping on this. If implemented will it amount to a breaking change? If so, please apply Status: Release Blocking label.

bjwatson commented 7 years ago

Also, please bump this up to Priority: P1 if this requires a breaking change. If it can be done later in a non-breaking manner, then let's leave this alone for now.

blowmage commented 7 years ago

I don't think this would be a breaking change.

blowmage commented 7 years ago

In the same way the Ruby API for Storage doesn't expose either of the object.update and object.patch API calls, I don't see a reason to change the Ruby API to take advantage of object.rewrite where it is using object.copy today. The Ruby API can take advantage of object.rewrite without exposing the fact.

quartzmo commented 7 years ago

Per the request in #1328, a new implementation of copy should also support Changing Object Storage Classes at the same time that an object is copied to a new location.

The existing File#storage_class= method immediately changes the storage class on the service (a rewrite operation), which means that rewriting to change the storage class and then rewriting to move to a new location requires two rewrite operations (doubling the I/O charges, I believe.)

I see two design approaches to resolve this:

  1. Change the implementation of File#storage_class= to store the new state locally until File#copy is called:
require "google/cloud/storage"

storage = Google::Cloud::Storage.new

bucket = storage.bucket "my-bucket" # API call

file = bucket.file "path/to/my-file.ext" # API call
file.storage_class = :nearline # Proposed: local change only (currently: I/O charge)
file.copy "path/to/destination/my-file.ext" # I/O charge

or

  1. Leave File#storage_class= as is, and add a storage_class option to File#copy:
require "google/cloud/storage"

storage = Google::Cloud::Storage.new

bucket = storage.bucket "my-bucket" # API call

file = bucket.file "path/to/my-file.ext" # API call
file.copy "path/to/destination/my-file.ext", storage_class: :nearline # I/O charge

@blowmage @hiroshi Any opinion on these options?

blowmage commented 7 years ago

☑️ for option 2.

blowmage commented 7 years ago
  1. yield a file object that you can make changes to as part of the copy:
require "google/cloud/storage"

storage = Google::Cloud::Storage.new

bucket = storage.bucket "my-bucket" # API call

file = bucket.file "path/to/my-file.ext" # API call
file.copy "path/to/destination/my-file.ext" do |f|
  f.storage_class = :nearline # I/O charge
  f.metadata["player"] = "Bob" # metadata change
  f.metadata["score"] = "10"
end
quartzmo commented 7 years ago

@blowmage I think 3. is the best solution, because Objects.rewrite is a POST that accepts the entire Object representation, including metadata. Do you agree? Any concerns?

Capstan commented 7 years ago

As a warning, running storage.objects.copy in a way that changes locations or storage classes runs the risk of failing. The only reason we didn't remove it is because we'd have had to change all client libraries out there and go through a whole deprecation period. GCS prefers that google-cloud-x libraries (re)implement their own copy exposure via rewrite.

blowmage commented 7 years ago

@Capstan perfect, that was our plan.

blowmage commented 7 years ago

@quartzmo Let's go with option 3 then.

quartzmo commented 7 years ago

@Capstan Thanks for confirming our hunch on this! @blowmage Will do option 3.