Azure / azure-sdk-for-js

This repository is for active development of the Azure SDK for JavaScript (NodeJS & Browser). For consumers of the SDK we recommend visiting our public developer docs at https://docs.microsoft.com/javascript/azure/ or our versioned developer docs at https://azure.github.io/azure-sdk-for-js.
MIT License
2.07k stars 1.19k forks source link

Crashing await on upload with metadata #25732

Closed jurijsk closed 1 year ago

jurijsk commented 1 year ago

Describe the bug Node execution crashes on await of blob.upload() at the wrong time and the wrong place. If you have 2 blobs to upload and one of them had metadata with '-' in the key. Process crashes on await on any of the promises, sending you into pretty lengthy debugging you never planned. Error: : RestError: The metadata specified is invalid. It has characters that are not permitted.

To Reproduce Steps to reproduce the behavior:

  1. Run the following code in Azure Function in azure env. (node: ~18, runtime: ~4, see packages below)
import { AzureFunction, Context, HttpRequest } from "@azure/functions"
import { BlobServiceClient } from "@azure/storage-blob";
const blob: AzureFunction = async function (context: Context, request: HttpRequest) {
    context.log("line 4: beginning");
    const blobServiceClient = BlobServiceClient.fromConnectionString(process.env.DataStorageConnectionString || 'missing connection string');
    let container = blobServiceClient.getContainerClient("public");
    await container.createIfNotExists();

    let firstClient = container.getBlockBlobClient("blob-test/first");
    let first = "first";
    let firstPromise = firstClient.upload(first, first.length, { blobHTTPHeaders: { blobContentType: 'text/plain' }, metadata: { "get-some": "troubles" } }); //'-' in the key cause the issue

    let secondClient = container.getBlockBlobClient("blob-test/second");
    let second = "second";
    let secondPromise = secondClient.upload(second, second.length);

    try {
        context.log("line 18: next line will cause node process to crash");
        await secondPromise;
        context.log("line 20: this line will not execute");
    } catch (error) {
        context.log("line 22: this line will not execute as well");
        return { status: 200, body: { "message": "line 26: will not execute" } }
    }
    context.log("line 25: this line can not be reached");

    try {
    await firstPromise;
    } catch(error){
        return { status: 200, body: { "error": "line 30: first promise await crashed" } }
    }
    return { status: 200, body: { "message": "line 32: all good" } }
    //output binding is confitured with "name": "$return"
};
export default blob;

Expected behavior

Log

2023-05-04T08:39:20Z   [Verbose]   Request successfully matched the route with name 'blob' and template 'blob'
2023-05-04T08:39:20Z   [Information]   Executing 'Functions.blob' (Reason='This function was programmatically called via the host APIs.', Id=097cb2e0-0946-4638-ba67-5324765a2d37)
2023-05-04T08:39:20Z   [Verbose]   Sending invocation id:097cb2e0-0946-4638-ba67-5324765a2d37
2023-05-04T08:39:20Z   [Verbose]   Posting invocation id:097cb2e0-0946-4638-ba67-5324765a2d37 on workerId:4d7b44cb-2f71-405f-907a-a786e4653ae1
2023-05-04T08:39:20Z   [Information]   Thu, 04 May 2023 08:39:19 GMT line 4: beginning
2023-05-04T08:39:20Z   [Information]   Thu, 04 May 2023 08:39:20 GMT  line 18: next line will cause node process to crash
2023-05-04T08:39:20Z   [Verbose]   Handling WorkerErrorEvent for runtime:node, workerId:node. Failed with: Microsoft.Azure.WebJobs.Script.Workers.WorkerProcessExitException: node exited with code 1 (0x1)
 ---> System.Exception: LanguageWorkerConsoleLog[error] Worker uncaught exception (learn more: https://go.microsoft.com/fwlink/?linkid=2097909 ): 
 RestError: The metadata specified is invalid. It has characters that are not permitted. RequestId:82457722-701e-005d-7f63-7e3c4b000000 Time:2023-05-04T08:39:20.0255402Z
        at handleErrorResponse (C:\home\site\wwwroot\node_modules\@azure\core-http\dist\index.js:3146:19)
        at C:\home\site\wwwroot\node_modules\@azure\core-http\dist\index.js:3082:49
        at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
        at async StorageRetryPolicy.attemptSendRequest (C:\home\site\wwwroot\node_modules\@azure\storage-blob\dist\index.js:14396:24)
        at async StorageClientContext.sendOperationRequest (C:\home\site\wwwroot\node_modules\@azure\core-http\dist\index.js:4865:31)
        at async BlockBlobClient.upload (C:\home\site\wwwroot\node_modules\@azure\storage-blob\dist\index.js:20707:20)
   --- End of inner exception stack trace ---
2023-05-04T08:39:20Z   [Verbose]   Attempting to dispose webhost or jobhost channel for workerId: '4d7b44cb-2f71-405f-907a-a786e4653ae1', runtime: 'node'
2023-05-04T08:39:20Z   [Verbose]   Disposing language worker channel with id:4d7b44cb-2f71-405f-907a-a786e4653ae1
2023-05-04T08:39:21Z   [Error]   Executed 'Functions.blob' (Failed, Id=097cb2e0-0946-4638-ba67-5324765a2d37, Duration=445ms)
2023-05-04T08:39:21Z   [Verbose]   No initialized worker channels for runtime 'node'. Delaying future invocations
2023-05-04T08:39:21Z   [Verbose]   Restarting worker channel for runtime: 'node'
2023-05-04T08:39:21Z   [Verbose]   Adding jobhost language worker channel for runtime: node. workerId:9ce28a21-7899-4607-9f28-2269b1b22ecc
2023-05-04T08:39:21Z   [Information]   Worker process started and initialized.

Additional context used npm packages

"@azure/core-util": "^1.3.1",
"@azure/functions": "^3.5.1",
"@azure/storage-blob": "^12.14.0",
xirzec commented 1 year ago

Hi @jurijsk, it sounds like you had a frustrating debugging experience. Thank you for filing an issue so we can try to improve the experience.

I'd like to better understand the trouble you are facing. await is a JS language feature not specific to the Storage SDK. Depending on your tsconfig.json settings for TypeScript it may or may not be present in the compiled JS. The behavior for await depends on if the promise passed to it resolves or rejects. In the case of a rejected promise, await will throw and if uncaught this will in turn cause the promise returned by the async function to reject.

It seems that Azure Functions is terminating the worker when your async function rejects, which I presume happens on the line await firstPromise.

What is confusing to me is the log ordering when compared to your code:

[Information]   next line will cause node process to crash, and the this request was not even a problem
[Information]   function: upload call.

I don't understand how this ordering could occur given that function: upload call should have been logged immediately when your function was run. Is it possible that Azure Functions doesn't guarantee the ordering of output logs? Perhaps you could include a timestamp when logging to see if this is the case?

In addition, something small that I noticed is you are missing an await before the call to container.createIfNotExists(); so the container is not guaranteed to exist before your uploads are kicked off.

Since your experience may be specific to Azure Functions, I encourage you to also file feedback directly with that team: https://github.com/Azure/azure-functions#issues--feature-requests

jurijsk commented 1 year ago

Hi @xirzec thank you for you reply. Its good you have someone looking at it with fresh set of eyes.

I've changed the code sample to make it more explainable and adress some of your notes. Log aslo update.

I comply to to ESNext so the JS code is one to one, except typing.

To clarify the execution results in 500 - Internal Server Error, which should not happed taking into account that awaits are in try/catch. I would expect responses with 200 from catch block.

It seems that Azure Functions is terminating the worker when your async function rejects, which I presume happens on the line await firstPromise.

No, not quite. Crash happens on await secondPromise; which is surrounded try/catch, but since the somewhere in library code, return { status: 200, body: { "message": "line 26: will not execute" } } will not execute. And mind you that this is not even the promise that should reject due t invalid character.

So the execution will not return to function code after await secondPromise;

Regarding the order of log messages, it can change from execution to execution, but since its all async requests under the hood probably.

You can run the code here: https://textjoint-api.azurewebsites.net/blob and I encourage you to run the code in your function app in azure.

Judging by the trace the exception originates from @azure\storage-blob but I'm open to reporting it else where.

xirzec commented 1 year ago

@jurijsk I looked into this a bit more and it seems that it has to do with Node's unhandled promise rejection behavior changing to throw in Node 15: https://nodejs.org/api/cli.html#--unhandled-rejectionsmode

If you attach a rejection handler to the first promise like

firstPromise.catch(() => context.log("first Promise rejected"));

You'll see log output like this:

line 4: beginning
line 18: next line will cause node process to crash
first Promise rejected
line 20: this line will not execute
line 25: this line can not be reached
success! {
  status: 200,
  body: { error: 'line 30: first promise await crashed' }
}

This is because when you await secondPromise the event loop is unblocked, firstPromise rejects but there is no handler since you haven't awaited it yet, so Node helpfully throws immediately and kills your function.

There are some ways to avoid this situation (outside of manually attaching handlers when kicking off a promise). The first way is to await promises as you kick them off:

const result = await firstClient.upload(...)
// do something with result
const secondResult = await secondClient.upload(...)

But clearly this isn't great for parallelism, so if you want to kick off both promises and wait for them at the same time you can do this:

const firstPromise = firstClient.upload(...)
const secondPromise = secondClient.upload(...)

try {
  const [result1, result2] = await Promise.all([firstPromise, secondPromise]);
  // do something with results
} catch(error){
  // one of the promises rejected!
  // you can use Promise.allSettled if you don't want to reject when one promise rejects
}

I hope this makes sense. It's too bad that Node doesn't actually include any log output that the throw is happening because of unhandled promise rejection when using the throw behavior.

github-actions[bot] commented 1 year ago

Hi @jurijsk. Thank you for opening this issue and giving us the opportunity to assist. To help our team better understand your issue and the details of your scenario please provide a response to the question asked above or the information requested above. This will help us more accurately address your issue.

jurijsk commented 1 year ago

All right, I think I learned something today. Thank you @xirzec!

My mental model of what is going on under the hood with await/async is closer to reality now, but I should say it is still confusing. It is strange that you have to rely on callbacks, instead of kicking of something and dealing with the result later with try/catch.

Promise.all(...) might be the answer, but then again, if it throws, will I be able to catch it in my code? And then understanding which promise was rejected is kind of pain in the neck.

I guess I have to create some kind of awaitable wrapper that will do .catch internally and return object with result / reason, if storage API is not doing it for me. I still hope that better handling can be done out of the box, (see 'Expected behavior' in original post).

By the way, do you think the behavior can me changes with --unhandled-rejections set with languageworkersnodearguments?

Thank you, once again.

xirzec commented 1 year ago

All right, I think I learned something today. Thank you @xirzec!

Happy to have helped!

Promise.all(...) might be the answer, but then again, if it throws, will I be able to catch it in my code? And then understanding which promise was rejected is kind of pain in the neck.

It should be doable, though Promise.allSettled might be better for figuring out which promise failed. You may have to design some kind of task/queue abstraction around this to make it more ergonomic. In general, if you await (inside of a try/catch) an async operation as soon as you call it, you shouldn't run into unhandled promise rejections. The main time to be careful is when you potentially do anything that yields (such as awaiting something else or waiting on setTimeout) before doing the await.

I guess I have to create some kind of awaitable wrapper that will do .catch internally and return object with result / reason, if storage API is not doing it for me. I still hope that better handling can be done out of the box, (see 'Expected behavior' in original post).

I agree that there is more complexity than would be ideal and that Node's decision to throw on unhandled promise rejection is a bit questionable, but I have the concern that if we invented some new async pattern we would be introducing yet another abstraction that developers would have to understand and reason about how it interoperated with other promise-based contracts.

The main struggle I think comes from the fact that async/await was introduced after Promises became popular. The example you have given makes sense (I get a promise, then decide to handle it later), but it would look a little more odd to do without await:


const promise1 = doOperation();
const promise2 = doOtherOperation();

return promise2.then( () =>{
    console.log("p2 success!");
}, () => {
    console.log("p2 error!");
}).then(() => {
    // now "await" promise1
   return promise1.then(() => { 
     console.log("p1 success!");
   }, () => { 
     console.log("p2 error!");
   });
});

In the above code it's not unreasonable to expect the error handling for promise1 to occur sooner, and this is what the runtime does. Since the runtime can't know if you will eventually subscribe to the promise rejection and since promises capture any thrown error (including runtime ones like syntax errors!) people would have a terrible time understanding that their program failed because of some silently swallowed exception.

Unfortunately, once you have the nice syntactic sugar of async/await, it makes it much harder to have intuition about if promises are being subscribed to since you don't actually attach the handlers yourself as they are instead magicked out of the aether by the runtime.

If you look at how TypeScript codegens async/await on runtimes that do not support it, you'll see that it's actually a tiny state machine that uses generator functions.

By the way, do you think the behavior can me changes with --unhandled-rejections set with languageworkersnodearguments?

It's possible this would help achieve what you wanted, though it does seem like there is a performance tradeoff with using them.

github-actions[bot] commented 1 year ago

Hi @jurijsk. Thank you for opening this issue and giving us the opportunity to assist. To help our team better understand your issue and the details of your scenario please provide a response to the question asked above or the information requested above. This will help us more accurately address your issue.

github-actions[bot] commented 1 year ago

Hi @jurijsk, we're sending this friendly reminder because we haven't heard back from you in 7 days. We need more information about this issue to help address it. Please be sure to give us your input. If we don't hear back from you within 14 days of this comment the issue will be automatically closed. Thank you!