Azure / azure-sdk-for-js

This repository is for active development of the Azure SDK for JavaScript (NodeJS & Browser). For consumers of the SDK we recommend visiting our public developer docs at https://docs.microsoft.com/javascript/azure/ or our versioned developer docs at https://azure.github.io/azure-sdk-for-js.
MIT License
2.03k stars 1.19k forks source link

blobclient.upload() method works fine in version 12.18.0 but same code breaks in 12.23.0 of lib @azure/storage-blob #30138

Open shashikumarpandey opened 2 months ago

shashikumarpandey commented 2 months ago

Describe the bug upload of json having special German characters works fine in library version 12.18.0 but remove few characters from last in lib version 12.23.0 version.

To Reproduce Steps to reproduce the behavior:

  1. Not working version of code
    async upload(tenantId: string, text: string, path: string): Promise<void> {
    const credential: AzureBlobCredential = await this.blobClientHelper.getCredential(tenantId);
    const blobConnectionString = await this.blobClientHelper.getConnectionString(credential);
    const blockBlobClient: BlockBlobClient = await this.blobClientHelper.getBobClient(
      blobConnectionString,
      credential.container_name,
      path
    );
    text = "german character containing special characters"; // use below highlighted data in bold and Italic
    await blockBlobClient.upload(text, text.length); // **this code works in 12.18.0 but not in 12.23.0**
    }

    _**'{\"plant\":\"1050\",\"routing\":\"100007245514\",\"routingType\":\"SHOP_ORDER\",\"version\":\"SYS001\",\"currentVersion\":true,\"status\":\"RELEASABLE\",\"description\":\"000000000002120998\",\"relaxedFlow\":false,\"entryRoutingStepId\":\"0100\",\"quantityValidation\":true,\"automaticGoodsReceipt\":false,\"routingSteps\":[{\"stepId\":\"0100\",\"sequence\":10,\"description\":\"Mat. Bereitstellung\",\"entry\":true,\"routingOperation\":{\"stepType\":\"NORMAL\",\"operationActivity\":{\"operationActivity\":\"100007245514-0-0100\",\"version\":\"SYS001\"},\"weighRelevant\":false,\"automaticGoodsReceipt\":false,\"customValues\":[]},\"workCenter\":{\"workCenter\":\"1900\"},\"reportingStep\":\"0100\",\"controlKey\":{\"controlKey\":\"ZO21\"},\"lastReportingStep\":false,\"rework\":false,\"queueDecisionType\":\"COMPLETING_OPERATOR\",\"erpSequence\":\"0\",\"nextStepList\":[\"0200\"],\"routingStepComponentList\":[{\"bomComponent\":{\"bom\":{\"bom\":\"100007245514-000000000002120998-1-1\",\"version\":\"SYS001\",\"bomType\":\"SHOPORDERBOM\",\"plant\":\"1050\"},\"material\":{\"plant\":\"1050\",\"material\":\"000000000000481023\",\"version\":\"SYS001\"},\"sequence\":10},\"sequence\":1,\"quantity\":40},{\"bomComponent\":{\"bom\":{\"bom\":\"100007245514-000000000002120998-1-1\",\"version\":\"SYS001\",\"bomType\":\"SHOPORDERBOM\",\"plant\":\"1050\"},\"material\":{\"plant\":\"1050\",\"material\":\"000000000002240856\",\"version\":\"ERP002\"},\"sequence\":20},\"sequence\":2,\"quantity\":27}]},{\"stepId\":\"0200\",\"sequence\":20,\"description\":\"Wkz voreinstellen, überprüfen\",\"entry\":false,\"routingOperation\":{\"stepType\":\"NORMAL\",\"operationActivity\":{\"operationActivity\":\"100007245514-0-0200\",\"version\":\"SYS001\"},\"weighRelevant\":false,\"automaticGoodsReceipt\":false,\"customValues\":[]},\"workCenter\":{\"workCenter\":\"1650\"},\"reportingStep\":\"0200\",\"controlKey\":{\"controlKey\":\"ZO21\"},\"lastReportingStep\":false,\"rework\":false,\"queueDecisionType\":\"COMPLETING_OPERATOR\",\"erpSequence\":\"0\",\"nextStepList\":[\"0400\"],\"routingStepComponentList\":[]},{\"stepId\":\"0400\",\"sequence\":30,\"description\":\"NC-Fr.: 1ter Step\",\"entry\":false,\"routingOperation\":{\"stepType\":\"NORMAL\",\"operationActivity\":{\"operationActivity\":\"100007245514-0-0400\",\"version\":\"SYS001\"},\"weighRelevant\":false,\"automaticGoodsReceipt\":false,\"customValues\":[]},\"workCenter\":{\"workCenter\":\"1594\"},\"reportingStep\":\"0400\",\"controlKey\":{\"controlKey\":\"ZO21\"},\"lastReportingStep\":false,\"rework\":false,\"queueDecisionType\":\"COMPLETING_OPERATOR\",\"erpSequence\":\"0\",\"nextStepList\":[\"0500\"],\"routingStepComponentList\":[]},{\"stepId\":\"0500\",\"sequence\":40,\"description\":\"NC-Fr.: 2ter Step\",\"entry\":false,\"routingOperation\":{\"stepType\":\"NORMAL\",\"operationActivity\":{\"operationActivity\":\"100007245514-0-0500\",\"version\":\"SYS001\"},\"weighRelevant\":false,\"automaticGoodsReceipt\":false,\"customValues\":[]},\"workCenter\":{\"workCenter\":\"1594\"},\"reportingStep\":\"0500\",\"controlKey\":{\"controlKey\":\"ZO21\"},\"lastReportingStep\":false,\"rework\":false,\"queueDecisionType\":\"COMPLETING_OPERATOR\",\"erpSequence\":\"0\",\"nextStepList\":[\"9000\"],\"routingStepComponentList\":[]},{\"stepId\":\"9000\",\"sequence\":50,\"description\":\"Abschluß Datenbank\",\"entry\":false,\"routingOperation\":{\"stepType\":\"NORMAL\",\"operationActivity\":{\"operationActivity\":\"100007245514-0-9000\",\"version\":\"SYS001\"},\"weighRelevant\":false,\"automaticGoodsReceipt\":false,\"customValues\":[]},\"workCenter\":{\"workCenter\":\"1717\"},\"reportingStep\":\"9000\",\"controlKey\":{\"controlKey\":\"ZO18\"},\"lastReportingStep\":true,\"rework\":false,\"queueDecisionType\":\"COMPLETING_OPERATOR\",\"erpSequence\":\"0\",\"routingStepComponentList\":[]}],\"customValues\":[],\"routingOperationGroups\":[{\"routingOperationGroup\":\"100007245514-0-0100\",\"description\":\"Mat. Bereitstellung\",\"routingOperationGroupSteps\":[{\"routingStep\":{\"stepId\":\"0100\",\"sequence\":10,\"description\":\"Mat. Bereitstellung\",\"entry\":true,\"routingOperation\":{\"stepType\":\"NORMAL\",\"operationActivity\":{\"operationActivity\":\"100007245514-0-0100\",\"version\":\"SYS001\"},\"weighRelevant\":false,\"automaticGoodsReceipt\":false,\"customValues\":[]},\"workCenter\":{\"workCenter\":\"1900\"},\"reportingStep\":\"0100\",\"controlKey\":{\"controlKey\":\"ZO21\"},\"lastReportingStep\":false,\"rework\":false,\"queueDecisionType\":\"COMPLETING_OPERATOR\",\"erpSequence\":\"0\",\"nextStepList\":[\"0200\"],\"routingStepComponentList\":[{\"bomComponent\":{\"bom\":{\"bom\":\"100007245514-000000000002120998-1-1\",\"version\":\"SYS001\",\"bomType\":\"SHOPORDERBOM\",\"plant\":\"1050\"},\"material\":{\"plant\":\"1050\",\"material\":\"000000000000481023\",\"version\":\"SYS001\"},\"sequence\":10},\"sequence\":1,\"quantity\":40},{\"bomComponent\":{\"bom\":{\"bom\":\"100007245514-000000000002120998-1-1\",\"version\":\"SYS001\",\"bomType\":\"SHOPORDERBOM\",\"plant\":\"1050\"},\"material\":{\"plant\":\"1050\",\"material\":\"000000000002240856\",\"version\":\"ERP002\"},\"sequence\":20},\"sequence\":2,\"quantity\":27}]}}]},{\"routingOperationGroup\":\"100007245514-0-0200\",\"description\":\"Wkz voreinstellen, überprüfen\",\"routingOperationGroupSteps\":[{\"routingStep\":{\"stepId\":\"0200\",\"sequence\":20,\"description\":\"Wkz voreinstellen, überprüfen\",\"entry\":false,\"routingOperation\":{\"stepType\":\"NORMAL\",\"operationActivity\":{\"operationActivity\":\"100007245514-0-0200\",\"version\":\"SYS001\"},\"weighRelevant\":false,\"automaticGoodsReceipt\":false,\"customValues\":[]},\"workCenter\":{\"workCenter\":\"1650\"},\"reportingStep\":\"0200\",\"controlKey\":{\"controlKey\":\"ZO21\"},\"lastReportingStep\":false,\"rework\":false,\"queueDecisionType\":\"COMPLETING_OPERATOR\",\"erpSequence\":\"0\",\"nextStepList\":[\"0400\"],\"routingStepComponentList\":[]}}]},{\"routingOperationGroup\":\"100007245514-0-0400\",\"description\":\"NC-Fr.: 1ter Step\",\"routingOperationGroupSteps\":[{\"routingStep\":{\"stepId\":\"0400\",\"sequence\":30,\"description\":\"NC-Fr.: 1ter Step\",\"entry\":false,\"routingOperation\":{\"stepType\":\"NORMAL\",\"operationActivity\":{\"operationActivity\":\"100007245514-0-0400\",\"version\":\"SYS001\"},\"weighRelevant\":false,\"automaticGoodsReceipt\":false,\"customValues\":[]},\"workCenter\":{\"workCenter\":\"1594\"},\"reportingStep\":\"0400\",\"controlKey\":{\"controlKey\":\"ZO21\"},\"lastReportingStep\":false,\"rework\":false,\"queueDecisionType\":\"COMPLETING_OPERATOR\",\"erpSequence\":\"0\",\"nextStepList\":[\"0500\"],\"routingStepComponentList\":[]}}]},{\"routingOperationGroup\":\"100007245514-0-0500\",\"description\":\"NC-Fr.: 2ter Step\",\"routingOperationGroupSteps\":[{\"routingStep\":{\"stepId\":\"0500\",\"sequence\":40,\"description\":\"NC-Fr.: 2ter Step\",\"entry\":false,\"routingOperation\":{\"stepType\":\"NORMAL\",\"operationActivity\":{\"operationActivity\":\"100007245514-0-0500\",\"version\":\"SYS001\"},\"weighRelevant\":false,\"automaticGoodsReceipt\":false,\"customValues\":[]},\"workCenter\":{\"workCenter\":\"1594\"},\"reportingStep\":\"0500\",\"controlKey\":{\"controlKey\":\"ZO21\"},\"lastReportingStep\":false,\"rework\":false,\"queueDecisionType\":\"COMPLETING_OPERATOR\",\"erpSequence\":\"0\",\"nextStepList\":[\"9000\"],\"routingStepComponentList\":[]}}]},{\"routingOperationGroup\":\"100007245514-0-9000\",\"description\":\"Abschluß Datenbank\",\"routingOperationGroupSteps\":[{\"routingStep\":{\"stepId\":\"9000\",\"sequence\":50,\"description\":\"Abschluß Datenbank\",\"entry\":false,\"routingOperation\":{\"stepType\":\"NORMAL\",\"operationActivity\":{\"operationActivity\":\"100007245514-0-9000\",\"version\":\"SYS001\"},\"weighRelevant\":false,\"automaticGoodsReceipt\":false,\"customValues\":[]},\"workCenter\":{\"workCenter\":\"1717\"},\"reportingStep\":\"9000\",\"controlKey\":{\"controlKey\":\"ZO18\"},\"lastReportingStep\":true,\"rework\":false,\"queueDecisionType\":\"COMPLETINGOPERATOR\",\"erpSequence\":\"0\",\"routingStepComponentList\":[]}}]}],\"modifiedDateTime\":\"2024-06-06T15:24:07.123Z\",\"createdDateTime\":\"2024-06-06T14:32:58.017Z\",\"tenantId\":\"2479be7b-eac8-4d50-aef9-64e32f1bcc27\"}'**

Expected behavior It should upload the json without corrupting the data with text length in byte with utf8 encoding it works in 12.23.0 version(see below code). but our issue is more towards how it was working in 12.18.0 version and why all of a sudden it is not working with plain text.length. Since it involves a huge amount of data that are impacted, we need this info.

  const lengthInBytes = Buffer.byteLength(text, 'utf8');
  console.log(`Length in bytes: ${lengthInBytes}`); 

Additional context Since this behavior is now changed it has impacted millions of record in our DB . also if you can explain with while passing text.length as a parameter in upload method . how it was handling it's text encoding . this will help us in understanding whether our old data is also corrupted or only data created between today and the day we upgrade our library are corrupted. please check the details for more information official Microsoft Support request number: 2406130030005435

xirzec commented 2 months ago

Looking into this, the method documentation does clearly state the length parameter is expected to be in bytes: https://learn.microsoft.com/en-us/javascript/api/@azure/storage-blob/blockblobclient?view=azure-node-latest#@azure-storage-blob-blockblobclient-upload

This has not changed since 12.18.0, but brings up an interesting question of why the issue was not previously seen. The main difference between the two versions in this issue is that 12.18.0 uses a package called @azure/core-http to make HTTP requests. This package in turn depends on node-fetch, which performs the actual call.

Looking at the source for node-fetch, we can see that they always calculate the Content-Length header and set it, as long as they are not given a stream:

https://github.com/node-fetch/node-fetch/blob/8b3320d2a7c07bce4afc6b2bf6c3bbddda85b01f/src/request.js#L250

Which means that node-fetch was overriding the Content-Length header set by the BlockBlobClient to be the correct byte length, even though the wrong value was passed in.

12.23.0 on the other hand uses @azure/core-rest-pipeline which only sets the Content-Length header when it is not already set by the client: https://github.com/Azure/azure-sdk-for-js/blob/b0b7d0f3b1d1c6763397be30b6cf94aca1d7dfcb/sdk/core/core-rest-pipeline/src/nodeHttpClient.ts#L102

This is arguably more correct but allowed your invalid length parameter to pass through to the underlying transport, presumably causing the Storage service to ignore bytes sent after the provided length.

One fix here would be to have BlockBlobClient check when given a string input if the length is correct or not and either throw an Error or silently correct the value. @EmmaZhu do you have a preference?

EmmaZhu commented 2 months ago

I can reproduce the issue with uploading string like: "Wkz voreinstellen, überprüfen".

As the change is in dependencies for all SDKs, we may need to have a general fix for all storage SDKs.

xirzec commented 2 months ago

@EmmaZhu to clarify, you believe that we should handle the case of invalid Content-Length for determinate size body values (e.g. strings, ArrayBuffers, and so on) being passed by the user to operations which have body length as a parameter across all Storage packages?

Do you think we should continue to silently correct the mistake as before or should we begin to throw so that developers know they are passing an incorrect value?

EmmaZhu commented 2 months ago

Hi @xirzec ,

The issue only happens with OAuth token credentials.

We have following code in the StorageSharedKeyCredentialPolicy to set correct body length value to Content-Length header:


if (
      request.body &&
      (typeof request.body === "string" || (request.body as Buffer) !== undefined) &&
      request.body.length > 0
    ) {
      request.headers.set(HeaderConstants.CONTENT_LENGTH, Buffer.byteLength(request.body));
    }

I think it might be better to keep behavior consistent between different types of credentials.

xirzec commented 2 months ago

Oh interesting. I didn't realize the shared key credential had this behavior inside of it. I'm assuming it was something to do with being able to sign the request body correctly?

At any rate, I agree we should do this in a place not tied to credential type, perhaps we could simply pull it out into its own policy?

timotheeguerin commented 1 month ago

Also got into this issue here https://github.com/microsoft/typespec/pull/3790, migrated to using Buffer.from and uploadData which seems to resolve the issue Note that the doc does show to use text.length to pass to this parameter https://learn.microsoft.com/en-us/azure/storage/blobs/storage-blob-upload-javascript#upload-a-block-blob-from-a-string

Also not really sure why we even need to pass the content length and is not something the SDK can figure out from the text we pass