
[BUG] HTTP 416 when trying to decrypt via client when raw file is a multiple of 4M bytes with Encryption Protocol V2 #41709

Closed · StephanK95 closed this issue 4 weeks ago

StephanK95 commented 2 months ago

**Describe the bug** We are facing issues when trying to decrypt data when the raw file is exactly a multiple of 4 MiB (4,194,304 bytes). For example, a file of size 33,554,432 bytes encrypts successfully, but when we try to download and decrypt it we receive a 416 Invalid Range response. The bug is reproducible with synthetic files of that size. The problem appears to stem from a wrong calculation of the adjusted blob range in the class EncryptedBlobRange when using V2 encryption.

```java
// Calculate offsetAdjustment.
// Get the start of the encryption region for the original offset
long regionNumber = originalRange.getOffset() / GCM_ENCRYPTION_REGION_LENGTH;
long regionStartOffset = regionNumber
    * (NONCE_LENGTH + GCM_ENCRYPTION_REGION_LENGTH + TAG_LENGTH);

// This is the plaintext original offset minus the beginning of the containing encryption region also in plaintext.
// It is effectively the amount of extra plaintext we grabbed. This is necessary because the nonces and tags
// are stored in the data, which skews our counting.
this.amountPlaintextToSkip = originalRange.getOffset() - (regionNumber * GCM_ENCRYPTION_REGION_LENGTH);

if (originalRange.getCount() != null) {
    // Get the end of the encryption region for the end of the original range
    regionNumber = (originalRange.getOffset() + originalRange.getCount() - 1)
        / GCM_ENCRYPTION_REGION_LENGTH;
    // Read: Get the starting offset for the last encryption region as above and add the length of a region
    // to get the end offset for the region
    long regionEndOffset = (regionNumber + 1)
        * (NONCE_LENGTH + GCM_ENCRYPTION_REGION_LENGTH + TAG_LENGTH);
    // adjusted download count is the difference in the end and start of the range.
    this.adjustedDownloadCount = regionEndOffset - regionStartOffset;
}

// Offset adjustment is difference in two starting values
this.offsetAdjustment = (int) (originalRange.getOffset() - regionStartOffset);
```

For example (a 33,554,432-byte file):

Original range: bytes=33554432-33554656

- regionNumber = 8
- regionStartOffset = 33554656
- amountPlaintextToSkip = 0
- adjustedDownloadCount = 4194332
- offsetAdjustment = -224 (it seems wrong that this value can be negative)
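For reference, this arithmetic can be checked standalone. Below is a minimal sketch (a hypothetical demo class, not SDK code) using the V2 constants implied by the numbers above: a 4,194,304-byte GCM region, 12-byte nonce, and 16-byte tag, so that region + nonce + tag = 4,194,332:

```java
// Hypothetical demo of the range math quoted above; not SDK code.
public class RangeMathDemo {
    static final long REGION = 4L * 1024 * 1024; // GCM_ENCRYPTION_REGION_LENGTH
    static final long NONCE = 12L;               // NONCE_LENGTH
    static final long TAG = 16L;                 // TAG_LENGTH

    public static void main(String[] args) {
        long offset = 33554432L; // start of the original plaintext range
        long count = 225L;       // bytes=33554432-33554656, inclusive

        long regionNumber = offset / REGION;                                // 8
        long regionStartOffset = regionNumber * (NONCE + REGION + TAG);     // 33554656
        long endRegion = (offset + count - 1) / REGION;                     // 8
        long regionEndOffset = (endRegion + 1) * (NONCE + REGION + TAG);    // 37748988
        long adjustedDownloadCount = regionEndOffset - regionStartOffset;   // 4194332
        long offsetAdjustment = offset - regionStartOffset;                 // -224

        // toBlobRange() then requests offset - offsetAdjustment = 33554656.
        // But an 8-region (33,554,432-byte) plaintext encrypts to exactly
        // 8 * 4194332 = 33,554,656 bytes, so that start offset is already one
        // past the last valid byte of the blob: hence the 416.
        System.out.println("requested start = " + (offset - offsetAdjustment));
        System.out.println("offsetAdjustment = " + offsetAdjustment);
    }
}
```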

With EncryptedBlobRange.toBlobRange()

```java
BlobRange toBlobRange() {
    return new BlobRange(this.originalRange.getOffset() - this.offsetAdjustment, this.adjustedDownloadCount);
}
```

this yields the adjusted range bytes=33554656-37748987, which is then set as the Range header.

This triggers the exception below.

**Exception or Stack Trace**

```
com.azure.storage.blob.models.BlobStorageException: Status code 416, "<?xml version="1.0" encoding="utf-8"?><Error><Code>InvalidRange</Code><Message>The range specified is invalid for the current size of the resource.</Message></Error>"
```

**To Reproduce** Upload and encrypt a file of size 33,554,432 bytes, then try to decrypt it using an EncryptedBlobClient.

**Code Snippet** See snippet above.

**Expected behavior** The file should be decryptable.


github-actions[bot] commented 2 months ago

@ibrahimrabab @ibrandes @kyleknap @seanmcc-msft

github-actions[bot] commented 2 months ago

Thank you for your feedback. Tagging and routing to the team member best able to assist.

joshfree commented 2 months ago

@alzimmermsft could you assist the Storage team as you have cycles?

ibrandes commented 2 months ago

Hey @StephanK95,

We tried to reproduce this on our end with a V2 client but didn't experience the issue. Here's what we did:

```java
File file = File.createTempFile(CoreUtils.randomUuid().toString(), ".txt");
file.deleteOnExit();
Files.write(file.toPath(), getRandomByteArray(33554432));

bec.uploadFromFile(file.toPath().toString());

ByteArrayOutputStream outStream = new ByteArrayOutputStream();
bec.downloadStream(outStream);
```

Could you share a code snippet from your end of the upload/download calls and the client construction so we can try to reproduce it more accurately?

Stefan-R-Acc commented 2 months ago

Hi @ibrandes,

we use client-side encryption (link), so you have to use an EncryptedBlobClient to reproduce it.

```java
EncryptedBlobClient encryptedBlobClient = new EncryptedBlobClientBuilder(encryptionVersion)
    .key(akek, KeyWrapAlgorithm.RSA1_5.toString())
    .blobClient(getBlobClient(containerName, blobName, azureStorageEndpoint))
    .buildEncryptedBlobClient();
```

where encryptionVersion is com.azure.storage.blob.specialized.cryptography.EncryptionVersion.V2 and akek is of type com.azure.core.cryptography.AsyncKeyEncryptionKey.

In our project we use a Key Vault to wrap/unwrap the encryption key, but to reproduce this you can just use a no-op key, something like:

```java
import com.azure.core.cryptography.AsyncKeyEncryptionKey;
import reactor.core.publisher.Mono;

public static AsyncKeyEncryptionKey getLocalAsyncKeyEncryptionKeyForTest(String keyId) {
    return new AsyncKeyEncryptionKey() {
        @Override
        public Mono<String> getKeyId() {
            return Mono.just("/keys/" + keyId);
        }

        // Used during the encryption flow; a no-op that returns the key as-is.
        @Override
        public Mono<byte[]> wrapKey(String algorithm, byte[] bytes) {
            return Mono.fromCallable(() -> bytes);
        }

        // Used during the decryption flow; a no-op that returns the key as-is.
        @Override
        public Mono<byte[]> unwrapKey(String algorithm, byte[] bytes) {
            return Mono.fromCallable(() -> bytes);
        }
    };
}
```

We reproduced it against a real Azure Storage account; we haven't tried it with Azurite.
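For completeness, combining the two snippets, an end-to-end reproduction could look like the following sketch (getBlobClient is our helper from the snippet above; the container/blob names, file paths, and key id are illustrative):

```java
AsyncKeyEncryptionKey akek = getLocalAsyncKeyEncryptionKeyForTest("test-key");

EncryptedBlobClient client = new EncryptedBlobClientBuilder(EncryptionVersion.V2)
    .key(akek, KeyWrapAlgorithm.RSA1_5.toString())
    .blobClient(getBlobClient("test-container", "test-blob", azureStorageEndpoint))
    .buildEncryptedBlobClient();

// The plaintext size must be an exact multiple of 4 MiB to hit the bug.
client.uploadFromFile("/tmp/input-33554432-bytes.bin", true);

// Fails with BlobStorageException, status code 416 (InvalidRange).
client.downloadToFile("/tmp/output.bin", true);
```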

ibrandes commented 2 months ago

Hi @Stefan-R-Acc,

Thanks for the code snippet. We were able to reproduce the issue and are working on a fix. We will update you when we have more information.

ibrandes commented 2 months ago

Hi @Stefan-R-Acc,

Thanks for your patience. We are still working on a fix, but as a workaround you can use EncryptedBlobClient.downloadStream(OutputStream stream) for now. The issue is specific to EncryptedBlobClient.downloadToFile and how the range headers are calculated for partitioned downloads; since EncryptedBlobClient.downloadStream doesn't use partitioned downloads, you shouldn't run into the issue when the raw file is exactly a multiple of 4 MiB. The code sample from my initial reply shows how to use this method:

```java
Files.write(file.toPath(), getRandomByteArray(33554432));

client.uploadFromFile(file.toPath().toString());

ByteArrayOutputStream outStream = new ByteArrayOutputStream();
client.downloadStream(outStream);
```
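If the decrypted bytes still need to end up in a local file, the same workaround can stream straight to disk instead of into a ByteArrayOutputStream; a sketch using java.nio.file.Files and java.io streams (the target path is illustrative):

```java
// Stream the decrypted blob directly to a file. downloadStream does not use
// partitioned downloads, so it sidesteps the range calculation bug in
// downloadToFile. The target path is illustrative.
try (OutputStream out = new BufferedOutputStream(
        Files.newOutputStream(Paths.get("/tmp/decrypted.bin")))) {
    client.downloadStream(out);
}
```
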
Stefan-R-Acc commented 1 month ago

Hi @ibrandes ,

Thanks for your reply. Yes, I checked it with downloadStream(...), and it does allow decrypting the blob into an OutputStream, for example when downloading to a file. I think I can build a workaround in our code for the 4 MiB-multiple cases until your bugfix is finished. Unfortunately, with the workaround we lose the ability to chain streams together: we need to download the blob to server storage first and then upload it to the target blob, which is quite resource-intensive when working with blob sizes in the gigabyte range.

So hopefully the bugfix will be in place soon. Is there any information about a planned timeline so far?

ibrandes commented 1 month ago

Hi @Stefan-R-Acc,

We don't have an exact timeline yet, but we are prioritizing this bugfix. It probably won't make it into next week's release; however, it should be implemented within the next month.

ibrandes commented 4 weeks ago

Hi @Stefan-R-Acc,

Just wanted to let you know that we released a patch containing the fix for this issue. Here are the links:

- https://central.sonatype.com/artifact/com.azure/azure-storage-blob/12.28.1
- https://central.sonatype.com/artifact/com.azure/azure-storage-blob-cryptography/12.27.1
- https://central.sonatype.com/artifact/com.azure/azure-storage-common/12.27.1
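For anyone picking up the patch through Maven, the dependency bump would look something like this (assuming azure-storage-blob-cryptography is declared directly in your pom.xml; adjust to your build setup):

```xml
<dependency>
    <groupId>com.azure</groupId>
    <artifactId>azure-storage-blob-cryptography</artifactId>
    <!-- patch release containing the fix for this issue -->
    <version>12.27.1</version>
</dependency>
```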

Stefan-R-Acc commented 2 weeks ago

Thanks @ibrandes,

we will try it out :-)