googleapis / java-storage

allow specifying the content length for resumable uploads #2511

Open benjaminp opened 1 month ago

benjaminp commented 1 month ago

Is your feature request related to a problem? Please describe.

The resumable upload API allows the final length of the object, if known, to be specified in an HTTP header. However, there is no way to pass this value into the Java GCS API for resumable uploads.

Describe the solution you'd like

A Storage.writer method overload that takes the final content length.
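
For context, here is a minimal sketch of the header in question, built with the JDK's java.net.http client rather than this library. It assumes the JSON API resumable-upload initiation endpoint and the X-Upload-Content-Length header as I recall them from the GCS documentation; the class name, method name, and parameters are hypothetical, and the request is only constructed, never sent.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

public class ResumableInitSketch {

    // Builds (but does not send) a request that would start a resumable upload
    // session. Endpoint and header name are assumptions based on the GCS JSON
    // API documentation; all identifiers here are illustrative only.
    static HttpRequest buildInitRequest(
            String bucket, String objectName, long contentLength, String accessToken) {
        String url = "https://storage.googleapis.com/upload/storage/v1/b/"
                + URLEncoder.encode(bucket, StandardCharsets.UTF_8)
                + "/o?uploadType=resumable&name="
                + URLEncoder.encode(objectName, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder(URI.create(url))
                .header("Authorization", "Bearer " + accessToken)
                // Declares the total number of bytes the session will upload.
                .header("X-Upload-Content-Length", Long.toString(contentLength))
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = buildInitRequest("my-bucket", "my-object", 1024L, "TOKEN");
        System.out.println(request.uri());
        System.out.println(request.headers().map());
    }
}
```

The request is that Storage.writer expose an equivalent of contentLength, so callers do not have to drop down to raw HTTP like this.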

BenWhitehead commented 1 month ago

In general, we recommend using checksums, rather than the number of bytes, to validate that all intended bytes reach GCS. A checksum such as crc32c ensures that the bytes GCS receives match the bytes you intended to send and arrive in the correct sequence.

For an example of how the crc32c precondition is provided to an upload, you can take a look at one of our integration tests that verifies correct checksum plumbing and handling: https://github.com/googleapis/java-storage/blob/50ac93b6b61806911737e389253739436dfb515c/google-cloud-storage/src/test/java/com/google/cloud/storage/it/ITObjectChecksumSupportTest.java#L203-L204

To compute a crc32c checksum, you can use Guava's crc32c HashFunction: https://guava.dev/releases/33.1.0-jre/api/docs/com/google/common/hash/Hashing.html#crc32c()
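
As a rough illustration of the checksum step, the sketch below computes a crc32c and encodes it as base64 of the 4-byte big-endian value, which is the form I understand GCS to expect. To stay dependency-free it uses the JDK's java.util.zip.CRC32C (Java 9+) instead of Guava; Guava's Hashing.crc32c() should produce the same 32-bit value. The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.CRC32C;

public class Crc32cExample {

    // Returns the CRC32C of the data, encoded as base64 of the 4-byte
    // big-endian checksum value (assumed to be the format GCS expects).
    static String base64Crc32c(byte[] data) {
        CRC32C crc = new CRC32C();
        crc.update(data, 0, data.length);
        int value = (int) crc.getValue();
        // ByteBuffer writes ints in big-endian order by default.
        byte[] bigEndian = ByteBuffer.allocate(4).putInt(value).array();
        return Base64.getEncoder().encodeToString(bigEndian);
    }

    public static void main(String[] args) {
        byte[] content = "123456789".getBytes(StandardCharsets.US_ASCII);
        // The standard CRC-32C check value for "123456789" is 0xE3069283.
        System.out.println(base64Crc32c(content)); // prints "4waSgw=="
    }
}
```

The resulting string is what would be set on the BlobInfo (for example via BlobInfo.Builder#setCrc32c) together with a crc32c match option on the write, as in the linked integration test; treat those call names as a pointer to check against the test rather than a verified recipe.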

benjaminp commented 1 month ago

I agree that a checksum of the content is preferable when available. However, sometimes you are proxying a file from a client that gives you the size up front but not a checksum, in which case having GCS verify the size is better than nothing.