Azure / azure-functions-host

The host/runtime that powers Azure Functions
https://functions.azure.com
MIT License
1.92k stars 442 forks source link

Pass log-levels for Functions to the Language Worker #6879

Open Hazhzeng opened 3 years ago

Hazhzeng commented 3 years ago

What problem would the feature you're requesting solve? Please describe.

Recently, we face an issue on the Python worker side when sending DEBUG level log to function host via GRPC channel. However, some third-party Python libraries abuse the DEBUG level (e.g. azure-servicebus). In azure-servicebus library, it sends 80+ DEBUG level log entries clogging the GRPC channel, stopping the worker from processing any future messages.

This issue occurs even before host can filtering the log messages.

Describe the solution you'd like

We should disable the worker log from its source.

When starting the worker process, the host can pass a --log-level <level> flag to the worker.exe process, and let the worker handle which level of log should be collected.

Describe alternatives you've considered

@pragnagopa mentions that in host.json, the user is able to control log level in each function and could be hard to control. @vrdmr suggests the host can do a minimal aggregation for the log level min(Function1LogLevel, Function2LogLevel). @Hazhzeng suggests the host can pass the fine grain log level information and let the worker decides how to handle logging in each functions.

Additional context

cc: @anirudhgarg @brettsam @fabiocav @stefanushinardi

pragnagopa commented 3 years ago

Assigning to next sprint to discuss the right approach.

In dotnet functions, LogLevels specified in host.json are appropriately converted and right logging filters are applied properly for system logs and user logs. We need figure out a design to extend similar capability for language workers.

Issue above describes - how python worker cannot apply logging filters taking into consideration loglevels specified in host.json

brettsam commented 3 years ago

This sounds a lot like https://github.com/Azure/azure-functions-host/issues/4815. The recommendation I had there is to allow the host.json config to be passed down to the worker on startup -- and then each worker can handle it however they want.

pragnagopa commented 3 years ago

Thanks @brettsam . Yes, this is same as #4815 .

I agree I do not think we can support similar logging filter per category for out-of-proc but would be great if can land on a simple approach that can be implemented in the host so that all language workers can use this without parsing host.json

For advanced scenarios and more granular control over log, language worker should implement its own log filters.

fabiocav commented 3 years ago

Leaving this unassigned for now. We'll be working on the design with @Hazhzeng and team.

anthonychu commented 3 years ago

Not sure if this affects custom handlers but in case it does, want to point out that we should ensure this flag isn't passed as an argument to custom handler processes as some cannot handle unexpected arguments.

fabiocav commented 3 years ago

Assigning to sprint 90 as a stretch goal.

fabiocav commented 3 years ago

Assigning to sprint 91, pending an owner as @pragnagopa won't be available.

fabiocav commented 3 years ago

@kashimiz assigning to sprint 93, but please let me know if we need to change.

pragnagopa commented 3 years ago

More details on implementation:

Note: Send AppInit message just before function load message

https://github.com/Azure/azure-functions-host/blob/dev/src/WebJobs.Script/Workers/Rpc/FunctionRegistration/RpcFunctionInvocationDispatcher.cs#L118

This message cannot be sent earlier as host.json from user code will not be available until after specialization.

Hazhzeng commented 3 years ago

More details on implementation:

  • Worker should opt-in to new capability on the host hostLogLevels
  • Add new message AppInit in proto file -- with a map of loglevel defined in host.json - default would be info - Generic message like AppInit will help us extend this future if needed.
  • Check capability hostLogLevel on workerInitResponse
  • If worker opts in - only then send HostLogLevel message - Need to check capability as worker will throw errors for unknown messages

Hey @pragnagopa, yeah, just to confirm, to actual send the HostLogLevel to worker, we should do it before FunctionsLoadRequest, right? As you mentioned, in placeholder mode, function host will not have function context during worker init.

In this case, do we suppose to have another GRPC event to send the HostLogLevel (e.g. HostLogConifgRequest)? Or should we just do it in FunctionsLoadRequest?

pragnagopa commented 3 years ago

In this case, do we suppose to have another GRPC event to send the HostLogLevel (e.g. HostLogConifgRequest)? Or should we just do it in FunctionsLoadRequest?

FunctionLoadRequest will be sent multiple times - one per function - this would increase the message size. Defining a HostLogLevel message or something more generic that can be sent before FunctionLoadRequest is better.

pragnagopa commented 3 years ago

Updated the comments to reflect a more generic message - AppInit

fabiocav commented 3 years ago

If we want a generic message, the name AppInit feels a bit odd.

Had some comments about the approach, but I think a chat would actually help.

brettsam commented 3 years ago

My main question here -- how will you determine the log-level to pass to the worker when there's a bunch of category-specific levels? Just one example, but a customer could have default set to none but Worker (which captures any category starting with Worker) set to trace.

fabiocav commented 3 years ago

this work requires additional design. Moving to triaged until this is ready.

vrdmr commented 2 years ago

Bringing this up as @YunchuWang is working on the implementation to send debug logs if the Default log level is set to Debug or lower.

@brettsam @fabiocav @liliankasem - Looking at FunctionRpc.proto#L107-L109, I see that there was a provision for sending the log information when doing the worker_init request. But after speaking with @soninaren, we understand that this isn't populated. Also, that this approach won't help the OOP language workers for the consumption scenario as specialization doesn't have the same information (warranting a need for new attribute as a part of the specialization process, as proposed by @pragnagopa with AppInit but as a part of FunctionEnvironmentReloadRequest)

  // inform worker of supported categories and their levels
  // i.e. Worker = Verbose, Function.MyFunc = None
  map<string, RpcLog.Level> log_categories = 3;

/cc @soninaren

brettsam commented 2 years ago

Is the proposal here to also add this property to the env reload request? Then we just need to ensure that we populate it from the host, correct?

brettsam commented 2 years ago

We just had a bit of a design discussion here and we have a slightly flipped proposal: the worker can pass the category : level mappings back to the host during worker init and any values in here overwrite all logLevels in host.json.

The flow would be:

  1. Host reads and applies host.json LogLevels
  2. Worker starts up -- in WorkerInitResponse it passes it's configured levels back to the host
  3. If the host sees these values, it re-builds the LoggerFilterOptions to use the filters passed by the worker instead (and logs something indicating it's overriding the host.json settings).

This gives workers the ability to maintain the same filtering on both the worker and the host -- and to have one source of those filters without having to merge or combine host.json filters with worker filters.

In .NET, we should be able to allow customers to register all the filters they want in the worker using the standard ILoggingBuilder, then translate that into the mappings for the WorkerInitResponse.

Other languages can handle it differently (read env vars; read a different config; allow code to initialize). But as long as they understand it (and can translate it to the mapping for the host), it would be supported.

vrdmr commented 2 years ago

Thanks @brettsam

I feel this proposal solves one of the problems with differences in logging methodologies between the host and the worker. For e.g. the Python worker has only one logger and that is applied to all the functions, which is different from the host.

One question with the flow - how are you planning to handle the Linux consumption specialization case for this scenario? There isn't any WorkerInitResponse sent during the worker specialization? Is the plan to have the same message type also be added to the FunctionEnvironmentReloadResponse?

/cc @YunchuWang