Azure / azure-storage-java

Microsoft Azure Storage Library for Java
https://docs.microsoft.com/en-us/java/api/overview/azure/storage
MIT License

The specified blob or block content is invalid - Azure blob #496

Closed Rameshkubendran closed 4 years ago

Rameshkubendran commented 5 years ago

Which service(blob, file, queue, table) does this issue concern?

Azure Blob

Which version of the SDK was used?

Version: azure-storage 7.0.0. Language: Java 8

Please note that if your issue is with v11, we are recommending customers either move back to v8 or move to v12 (currently in preview) if at all possible. Hopefully this resolves your issue, but if there is some reason why moving away from v11 is not possible at this time, please do continue to ask your question and we will do our best to support you. The README for this SDK has been updated to point to more information on why we have made this decision.

What problem was encountered?

We are getting the exception below when reading a blob, but it happens only for a few documents.

Caused by: com.microsoft.azure.storage.StorageException: The specified blob or block content is invalid.
    at com.microsoft.azure.storage.StorageException.translateException(StorageException.java:87)
    at com.microsoft.azure.storage.core.StorageRequest.materializeException(StorageRequest.java:315)
    at com.microsoft.azure.storage.core.ExecutionEngine.executeWithRetry(ExecutionEngine.java:185)
    at com.microsoft.azure.storage.blob.CloudBlockBlob.uploadBlockInternal(CloudBlockBlob.java:1097)
    at com.microsoft.azure.storage.blob.CloudBlockBlob.uploadBlock(CloudBlockBlob.java:1069)

Have you found a mitigation/solution?

I suspected this issue might occur when block ID lengths differ, so I made sure that all block IDs have the same length. I am still getting the error. The block ID logic is:

    String encodedId = Base64.getEncoder().encodeToString(String.format("%05d", blockNum).getBytes(Charset.forName("UTF-8")));

It always returns 8 characters.

Example: encoded block ID: MDAwNzc=, encoded length: 8
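
For readers following along, the block ID scheme described in this comment can be reproduced as a small standalone program. This is only a sketch of the logic quoted above: the class name `BlockIdExample` and the method name `encodeBlockId` are made up for illustration and are not part of the SDK.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper class; not part of azure-storage-java.
class BlockIdExample {

    // Zero-pads the block number to 5 digits, then Base64-encodes its UTF-8 bytes.
    static String encodeBlockId(int blockNum) {
        String raw = String.format("%05d", blockNum);
        return Base64.getEncoder().encodeToString(raw.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // Print the encoded ID and its length for a few sample block numbers.
        for (int blockNum : new int[] {0, 77, 99999}) {
            String id = encodeBlockId(blockNum);
            System.out.println(blockNum + " -> " + id + " (length " + id.length() + ")");
        }
    }
}
```

Block number 77 yields `MDAwNzc=`, matching the example in the comment. Note that the padded string stays at 5 characters only for block numbers from 0 to 99999; larger values produce a longer raw string.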

jaschrep-msft commented 4 years ago

Hi @Rameshkubendran, thank you for reporting this issue. My understanding is that any issues with block ID length would show up while you are still writing the blob, before you get a chance to read it.

Could you please share more information on how to reproduce this issue? How are you creating these specific documents and then how are you attempting to read them?

Rameshkubendran commented 4 years ago

Thanks for your response, Jaschrep. Yes, I can reproduce this issue when I rewrite an existing blob that was originally uploaded in a single shot; for such a blob the block ID length is 60 characters. We are now trying to upload a block with an 8-character block ID, which causes this issue. We recently changed our code to upload large files in blocks/chunks.

Initially I was not aware that Azure internally stores a single-shot upload as a block with a 60-character block ID. To resolve this issue, I am going to ensure that the block ID length is 60 characters. But now I am wondering how to resolve this issue for documents that were already uploaded with 8-character block IDs, without deleting those blobs.

Please advise on this.

jaschrep-msft commented 4 years ago

Regarding your concerns on block ID length, the REST documentation for Put Block states: "Prior to encoding, the string must be less than or equal to 64 bytes in size." So you should be fine with encoding a 60 character string.
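
As an illustration of the limit quoted above, a client could check the raw (pre-encoding) ID size before Base64-encoding it. The following is a minimal sketch, not SDK code: `BlockIdLimit`, `MAX_RAW_BYTES`, and `encodeChecked` are hypothetical names, and the only fact taken from the thread is the 64-byte limit.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper class; not part of azure-storage-java.
class BlockIdLimit {

    // Limit on the raw ID size, as quoted from the Put Block documentation above.
    static final int MAX_RAW_BYTES = 64;

    // Base64-encodes rawId, rejecting IDs whose UTF-8 form exceeds the limit.
    static String encodeChecked(String rawId) {
        byte[] bytes = rawId.getBytes(StandardCharsets.UTF_8);
        if (bytes.length > MAX_RAW_BYTES) {
            throw new IllegalArgumentException(
                    "Raw block ID is " + bytes.length + " bytes; limit is " + MAX_RAW_BYTES);
        }
        return Base64.getEncoder().encodeToString(bytes);
    }

    public static void main(String[] args) {
        // A 60-character ASCII ID (60 bytes in UTF-8) is within the limit.
        String sixty = new String(new char[60]).replace('\0', 'a');
        System.out.println("Encoded length: " + encodeChecked(sixty).length());
    }
}
```

The check is done on bytes rather than characters because the quoted limit is expressed in bytes, and non-ASCII characters take more than one byte in UTF-8.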

I'm confused as to what issue you are encountering. Your issue appears to be about failed reads, but the discussion seems to be all about writes. I'd like to better understand what is actually failing with the reads before working on a solution.

Rameshkubendran commented 4 years ago

I suspect this issue happens while rewriting a blob: we are writing a block (with an 8-character block ID) to a blob that already exists with a block ID encoded from a 60-character string.

Step 1: Upload a block whose block ID is a 60-character string.
Step 2: Change the block ID length to an 8-character string.
Step 3: Re-upload the blob.
Step 4: You get the error as soon as you try to upload the first block.

I am now investigating whether this issue occurs only when the uncommitted block ID lengths differ, or regardless of whether the blocks are committed or uncommitted.

jaschrep-msft commented 4 years ago

So your issue is with writes, and not with reads as initially stated. Thank you for clearing that up.

The service does not allow you to change a blockID size for a block blob once established. You must delete this blob if you wish to change the blockID size.
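
The constraint described here can be guarded against on the client side by comparing a new block ID's length with the IDs the blob already has before uploading. The sketch below assumes the caller has already obtained the existing block IDs as plain strings (how they are retrieved is not shown); `BlockIdLengthCheck` and `matchesExistingLength` are hypothetical names, and the check only mirrors the rule stated in this comment rather than reproducing the service's validation.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper class; not part of azure-storage-java.
class BlockIdLengthCheck {

    // Returns true if newId has the same length as every existing ID.
    // An empty list (a blob with no blocks yet) accepts any length.
    static boolean matchesExistingLength(List<String> existingIds, String newId) {
        for (String id : existingIds) {
            if (id.length() != newId.length()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Assumed sample data: two 8-character IDs already present on the blob.
        List<String> existing = Arrays.asList("MDAwMDA=", "MDAwMDE=");

        System.out.println(matchesExistingLength(existing, "MDAwNzc="));      // same length
        System.out.println(matchesExistingLength(existing, "MDAwMDAwMDA3Nw==")); // different length
        System.out.println(matchesExistingLength(Collections.<String>emptyList(), "MDAwNzc="));
    }
}
```

If the check fails, the options discussed in this thread are to generate IDs of the existing length or to delete the blob before re-uploading with the new ID length.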

rickle-msft commented 4 years ago

@Rameshkubendran I am going to close this issue as I believe it has been confirmed to be a constraint coming from service behavior. If you need further support, please feel free to continue commenting here or open a new issue.