BrianRosamilia / azure-function-log-intercept

Add console.log support to Azure Functions
27 stars 2 forks source link

question about function instances #4

Open karpikpl opened 4 years ago

karpikpl commented 4 years ago

Hey @BrianRosamilia, thanks for creating this package.

I was just wondering what would happen if azure function is called multiple times? Could it re-use existing instance or is it always going to create new one?

Because if there's a chance of 2 (or more) triggers executing - wouldn't they override the context and global console.log?

BrianRosamilia commented 4 years ago

Hi @karpikpl thank you for thanking me :)!

You are right to be concerned about this and yes, if you are using an HTTP trigger and have a decent # of requests, the global console.log is going to be overridden multiple times from multiple different function invocations.

I considered this and I was careful not to let this cause an issue. I have a unit test making sure this exact thing scenario doesn't cause context.log to be called multiple times when you intend to call it a single time via console.log.

https://github.com/BrianRosamilia/azure-function-log-intercept/blob/master/__tests__/index.js#L19-L34

Technical explanation : It's true that it keeps creating a new console.log, but it's always creating the exact same one and in a way that wont 'nest' the reassignments, since it holds onto the original console.log before any code to reassign it executes. No matter how many times its called, it always points to exactly one context.log and one console.log.

karpikpl commented 4 years ago

I see. I guess I was more concerned about logs being mixed up. My understanding is that ctx.log logs the function invocation (in app insights). With many functions in one function app and many executions - wouldn't that cause hijacking of console.log before call is complete?

BrianRosamilia commented 4 years ago

It could, but would it matter? AFAIK context.log does nothing special to trace back to the original function or specific function execution it got invoked with. So really all that matters, as I see it, is that it logs only once each time.

If you want to help me test, make a function app with a n second cron function and a n*2 second cron function; they'll eventually step on each other after running for a bit. Feel free to upload the test project here, I'd really appreciate it!

karpikpl commented 4 years ago

I'll try to test it this week. I was basing my assumptions on this: https://docs.microsoft.com/en-us/azure/azure-functions/functions-reference-node#writing-trace-output-to-the-console

specifically:

In Functions, you use the context.log methods to write trace output to the console. In Functions v2.x, trace outputs using console.log are captured at the Function App level. This means that outputs from console.log are not tied to a specific function invocation and aren't displayed in a specific function's logs. They do, however, propagate to Application Insights.

hence, I was thinking that outputs from context.log are tied to a specific function invocation and are displayed in a specific function's logs.

BrianRosamilia commented 4 years ago

Wow that's an interesting find from their docs 👍

That text has been there since before I thought to make this package and I think it's incorrect, especially with regard to console.log sending any data at all to Application Insights.. (If it were correct, you wouldn't be here talking to me after downloading this package! :D)

What I think they could be referring to in that paragraph, is the logs from the log stream (in the cloud CLI-looking window they show you when you invoke on the server, and in your local azure function emulator console). That's probably why they're referring to it as a trace output. I think the Microsoft docs just have a mistake in them.

Here's a link to that document from January

https://github.com/MicrosoftDocs/azure-docs/blob/bbe95d6ac4ec7706eec664117ebd0c56b99090ac/articles/azure-functions/functions-reference-node.md

Also, I found this. It's actually more horrifying than just succumbing to passing around context.log /shrug

https://www.npmjs.com/package/applicationinsights#azure-functions-example

I don't understand why Microsoft is spending so much energy ensuring logs for a single request stay together. AWS and GCP dont try to do this. The tools to query logs are incredibly powerful. Log something like a user Id or entity Id along with your data (which you probably need to do anyway to get meaningful logs). This covers 99% of use cases.

karpikpl commented 4 years ago

Hey, I appreciate the discussion ;)

My guess is that they are using invocation id to group logs together in App Insights. Logging with cloud watch is so much easier ;) but you do need to care about the correlation id - which is not that different than caring context.log everywhere.

So here's what happens in app insights if console.log is overwritten by context.log. This is the transaction view - which should only show logs from one invocation. image

here's the code for the http function:

module.exports = async function (context, req) {

    console.log('Hi. Im logging to console!', context.invocationId);
    console.log = context.log;
    console.log('This is eperiment function 1', context.invocationId);

    // sleep for 1 s
    await new Promise(resolve => setTimeout(resolve, 4000));

    console.log('This is eperiment function 2', context.invocationId);
    context.res = {
        body: 'hi'
    };
}

and here's one for the timer based:

module.exports = async function (context, myTimer) {
    var timeStamp = new Date().toISOString();

    if (myTimer.IsPastDue)
    {
        context.log('JavaScript is running late!');
    }

    console.log('Hi. Im logging to console from Timmer!', context.invocationId);
    console.log = context.log;
    console.log('This is Timer function 1', context.invocationId);

    // sleep for 1 s
    await new Promise(resolve => setTimeout(resolve, 1000));

    console.log('This is timer function 2', context.invocationId);
};

(I made a bunch of typos, sorry for that)

So the logs from console go to appInsights but are not tied to operation/invocation. Logs from context go to operation/invocation and are grouped together.

BrianRosamilia commented 4 years ago

I think your test proves what we suspected, that there will be overlap to some degree. I will perhaps add a disclaimer for this to my readme. At the very least it's good that it's documented in our chat!

And just to confirm, you aren't able to get logs with just plain console.log right? I suggest you send a PR to the documentation I linked that you originally found. It's just flat out wrong. (If it were right, you'd see double logs when using my package, so yeah I don't know why they claim console.log surfaces in Application Insights).

If you don't want to PR their repo then no problem, I will do it on your behalf. Just lmk!

BrianRosamilia commented 4 years ago

I forgot to tag you @karpikpl and also I updated my comment 15 minutes after editing it so the comment you got emailed from me had a stupid mistake in it! Sorry! :P

karpikpl commented 4 years ago

Hey, sorry for the delay, last 2 days were busy at work. Here's the full screenshot - I got it by searching for logs in application insights. App insights were connected to the function during deployment (via env variable for instrumentation key)

image

I'm able to get logs via plain console log, the look like this: image

if context.log is used - than log statements are part of the transaction (this is after console.log = context.log assignment) image

Sorry - I didn't mean to "break" your package. It's great that people share via OS :) You motivated me to figure out how those logs work.

BrianRosamilia commented 4 years ago

Thanks for the reply @karpikpl and I'm glad you took a deep dive on this!

And it's no problem at all! I suspected this could happen. And if not now, then in the future because why would Microsoft have such an awkward API if not for a feature like this so it was only a matter of time before someone found the precise leak in my abstraction :D

To to clarify my question

console.log works but you still needed to do the reassignment, correct?

console.log = context.log;

I just want to make sure I'm not missing something and all of the sudden stdout is shipped to appinsights making my package deprecated--which is something that could happen in the future or it could literally be a checkbox somewhere in Azure functions. (Of course I would have to confirm this myself before doing anything about this!)

Thanks again!

karpikpl commented 4 years ago

console.log works without any changes

module.exports = async function (context, myTimer) {
    var timeStamp = new Date().toISOString();

    if (myTimer.IsPastDue)
    {
        console.log('JavaScript is running late!');
    }

    console.log('Log statement one - XYZ', context.invocationId);
    console.log('Log statement two - XYZ', context.invocationId);

    // sleep for 1 s
    await new Promise(resolve => setTimeout(resolve, 1000));

    console.log('Log statement three - XYZ', context.invocationId);
};

Result: (each log statement is displayed separately) image

joeyjiron06 commented 3 years ago

Hi, thanks for putting this package together. So I just want to make sure i understand this correctly. We have a race condition here because, logging could be wrong in a scenario like this:

  1. func is invoked & console.log is overridden using contxext1
  2. this function fires off an async request
  3. While the first function is running, the function gets invoked again and set the console.log to the new context context 2
  4. the first function call returns from it's async request and logs something

Actual Behavior The log is tied to context 2 since it was the last one to set the global state of console

Expected Behavior The log should be tied to context 1 since we want to log for this specific instance

Am I missing something?

BrianRosamilia commented 3 years ago

@joeyjiron06

That's exactly correct. I see no way around this, unfortunately. If you need deep application insights dependency tracing or log tracing, this package may not be the right choice and you may need to just pass down the Context object everywhere. A runtime agnostic solution would be to just generate your own GUID (or use another identifier you can rely on) on the start of your own code, and put that identifier on something you pass through the calls of your function (but of course this won't be as easily surfaced in Application Insights on the Azure portal).

My package is a semi-leaky abstraction just to evade passing the proprietary Context object around in all of your business logic. It pains me to consider doing that vs a workaround like this package that enables a more 'vanilla.js' approach :) but my package is not the best in EVERY case, for sure.

(If someone eventually figures out a way around this, I'm interested in adding it btw)

joeyjiron06 commented 3 years ago

Hi @BrianRosamilia, thanks for the quick response. I have run into this issue in the past running a standard nodejs server. In my research, I had seem some really amazing software out there that would allow us to do contextual logging by using an experimental feature in NodeJs called Async Hooks.

The issue with that approach is that's it's experimental and could have an impact on performance as described here.

I may try to take a stab at an implementation using a library like cls-hooked and somehow lookup console.log based on the current context that is being executed. I'll run some tests and check performance.

Just thought I'd share the research i've found as this async problem has plagued me for years lol. I have also used dependency injection in the past, but that requires you to write your code and modules a certain way which is less than ideal. I love your simple approach and would like to find a similar simple solution.

More reading here

BrianRosamilia commented 3 years ago

@joeyjiron06 very interesting. We can just put support for this behind an environment variable if you can get something working :) I appreciate your effort + communication!!

ronkot commented 5 months ago

Just to add my 2 cents. I also once tried to use a global logger in azure functions but the traces were not correctly associated to right function invocations. I also tried to use global setting for request-specific internationalization which ended up very badly. I was baffled as exactly the same approach worked in AWS Lambdas. Finally I found out that AWS Lambdas and Azure functions have totally different execution model:

AWS Lambda: Use one execution context (node instance) for the full duration of function call. Do not process parallel requests using the same context, always launch a new context if there're no free contexts available. --> It's totally ok to use global variables, other requests won't change them during execution of one request. So I can succesfully use e.g. a global intance of pino-lambda logger that logs content and adds the correct aws request id for each log entry.

Azure functions: Share the execution context i.e. serve multiple parallel requests using the same context. --> One just cannot use global variables. Only possibility to have log rows associated to correct invocation is to pass exectution context (or logger that wraps it) around as function arguments. That's what I ended up doing. It's ugly but the only way that really works.