Azure / azure-functions-host

The host/runtime that powers Azure Functions
https://functions.azure.com
MIT License
1.94k stars 441 forks source link

Dependency resolution issues in extension #4914

Open fabiocav opened 5 years ago

fabiocav commented 5 years ago

Okay, after spending some time on some other stuff, I've come back around to this. I've spent all day narrowing down what my problem is. I've affectionately called it "scope creep", because the DI framework doesn't appear to be honouring the scope (affection is really not what I'm feeling in regards to these DI issues).

It all comes down to this specific registration using Microsoft.Extensions.Http (2.2.0):

builder.Services.AddHttpClient<IScopeCreepService, ScopeCreepServiceClient>();

I've repro'd it at: https://github.com/t-l-k/azure-functions-di-bug.git

I've added the prerelease CLI tools Azure.Functions.Cli.min.win-x64.2.7.1513.zip into the repo and they're expanded in a Directory.Build.props Unzip build step (requires > MSBuild 15.8), the project references the contained func.exe as the debug executable. I've done this because the Artifact drop is no longer availables. perhaps @fabiocav we could have another build (retained)? https://github.com/Azure/azure-functions-core-tools/commit/7c2ae304b41dcfb7bcee14d9cbbf12c73f78ca0a -

You have to run it in this version, because without the 2.0.12620.0 extension, it doesn't run at all (everything is a singleton chaos).

When running, the bug manifests in the class MediatrWarmupExtension (which was a behaviour I was exploring to workaround the issue), line https://github.com/t-l-k/azure-functions-di-bug/blob/23919854b3791311ae48bcc67cc4f3f5a183386d/Extensions/MediatrWarmupExtension.cs#L40

var handler1 = scopedProvider.GetRequiredService<IRequestHandler<ScopeCreepCommand, ScopeCreepCommandResult>>();
var handler2 = scopedProvider.GetRequiredService<IRequestHandler<ScopeCreepCommand, ScopeCreepCommandResult>>();
// ^^ The bug manifests at this point ^^ handler1 and handler2 dependencies should be the same. 

The trace level output of the application displays this as part of an extension initialisation:

Azure Functions Core Tools (2.7.1513 Commit hash: 7c2ae304b41dcfb7bcee14d9cbbf12c73f78ca0a)
Function Runtime Version: 2.0.12620.0
Can't determine project language from files. Please use one of [--csharp, --javascript, --typescript, --java, --python, --powershell]
[10/09/2019 15:58:47] Building host: startup suppressed:False, configuration suppressed: False
[10/09/2019 15:58:48] [0] graphObjectA 15948253 (from 66477871 (DependencyGraphAlpha)) 4
[10/09/2019 15:58:48] [0] graphObjectB 53858013 (from 66477871 (DependencyGraphAlpha)) 4
[10/09/2019 15:58:48] [0] graphObjectA 8857874 (from 51797632 (DependencyGraphBravo)) 4
[10/09/2019 15:58:48] [0] graphObjectB 14008467 (from 51797632 (DependencyGraphBravo)) 4
[10/09/2019 15:58:51] [1] graphObjectA 8857874 (from 29236951 (DependencyGraphAlpha)) 4
[10/09/2019 15:58:51] [1] graphObjectB 14008467 (from 29236951 (DependencyGraphAlpha)) 4
[10/09/2019 15:58:51] Warmed up IRequestHandler for Func.Canary.Application.ScopeCreepCommand
Hosting environment: Production
Content root path: D:\source\repos\t-l-k\Func.Canary\bin\Debug\netcoreapp2.2
Now listening on: http://0.0.0.0:7071
Application started. Press Ctrl+C to shut down.

Commenting out the AddHttpClient registration behaves thus:

Azure Functions Core Tools (2.7.1513 Commit hash: 7c2ae304b41dcfb7bcee14d9cbbf12c73f78ca0a)
Function Runtime Version: 2.0.12620.0
Can't determine project language from files. Please use one of [--csharp, --javascript, --typescript, --java, --python, --powershell]
[10/09/2019 15:59:49] Building host: startup suppressed:False, configuration suppressed: False
[10/09/2019 15:59:50] [0] graphObjectA 50788593 (from 59927501 (DependencyGraphAlpha)) 4
[10/09/2019 15:59:50] [0] graphObjectB 50517987 (from 59927501 (DependencyGraphAlpha)) 4
[10/09/2019 15:59:50] [0] graphObjectA 50788593 (from 54244775 (DependencyGraphBravo)) 4
[10/09/2019 15:59:50] [0] graphObjectB 50517987 (from 54244775 (DependencyGraphBravo)) 4
[10/09/2019 15:59:52] Warmed up IRequestHandler for Func.Canary.Application.ScopeCreepCommand
Hosting environment: Production
Content root path: D:\source\repos\t-l-k\Func.Canary\bin\Debug\netcoreapp2.2
Now listening on: http://0.0.0.0:7071
Application started. Press Ctrl+C to shut down.

Note the absence of graphObjectA and graphObjectB preceded by [1]. With the bug active, a second service DependencyGraphAlpha is constructed, in the same scope, for the 2nd transient handler resolution. The other parameter, DependencyGraphBravo, is reused correctly across both instances.

Here are all the registrations in the example:

services.AddMediatR(typeof(ScopeCreepCommandHandler));
services.AddScoped<DependencyGraphAlpha>();
services.AddScoped<DependencyGraphBravo>();
services.AddScoped<GraphObjectA>();
services.AddScoped<GraphObjectB>();
services.AddScoped<IGraphInterfaceA>(sp => sp.GetRequiredService<GraphObjectA>());

I did try experimenting with service provider sub-scopes, but doing so appeared to offend the Function's host runtime. I wish to point out, that whilst I have materialised this in an extension, the exact same behaviour was being exhibited when resolving a HttpClient reference within a durable function activity call.

My temporary workaround is to just resolve all of my HttpClient consumers in an initialisation step, for the time being, but obviously it's not ideal I don't know where else this bug may materialise.

Please fix! :cry:

Originally posted by @t-l-k in https://github.com/Azure/azure-functions-host/issues/3399#issuecomment-530010253

APIWT commented 5 years ago

This is really important to us as well, 👍

Blocked on using DI until this is done

fabiocav commented 5 years ago

@APIWT is your issue identical to what is described above?

fabiocav commented 5 years ago

@t-l-k can you also confirm you see the same behavior if you take a direct dependency on IScopeCreepService?

t-l-k commented 5 years ago

@fabiocav I removed the dependency to IScopeCreepService from my DependencyGraphAlpha & Bravo classes, and changed my example RequestHandler's constructor to take a direct dependency on IScopeCreepService:

public ScopeCreepCommandHandler(DependencyGraphAlpha alpha, DependencyGraphBravo bravo, IScopeCreepService creepService)
        {
            // This is where the injected dependencies go a bit skew-iffed
            GraphAlpha = alpha;
            GraphBravo = bravo;
            CreepService = creepService;
        }

And now both handler1 and handler2 each have newly constructed DependencyGraphAlpha and DependencyGraphBravo objects, each with new, distinct GraphObjectA and GraphObjectB references. The IScopeCreepService is presumably registered as Transient, and I'm not really bothered about my http client consumers. I'm more worried aobut my the rest of my dependency graph :smile:

I expanded the example to resolve a handler3 and handler4. These have the same scoped resolutions as handler2. Which is what I'd expect. handler1 is just all wrong.

I pushed it as branch direct in the repro repo.

t-l-k commented 5 years ago

@fabiocav plus it all works as I expect it should if I resolve IScopeCreepService first.

Branch service-first

Hence my workaround: resolve all HttpClient consumers first before handling the first request. Don't know if it's foolproof.

t-l-k commented 5 years ago

@fabiocav I reproduced the issue on 3 different laptops, does it reproduce for you/your teams? It seems as if we're still experiencing some DI issues even with the "workaround", suggesting that its not really effective.

fabiocav commented 5 years ago

@t-l-k we haven't had this assigned yet, so time spent has been limited. My hope is to have cycles for this issue next week at the beginning of the coming sprint. I'll keep providing updates here as we work on the problem.

t-l-k commented 5 years ago

Just wanted to expand upon some cost effective mitigation strategies 🤔

@fabiocav would you possibly be able to comment: A quick search appears that it is possible to use WebJobs SDK standalone with Web API?

1) If so, presumably it would allow us to choose our own Dependency Injection framework?

2) If so, would there be any obvious limitations, e.g. when interacting with storage bindings, etc, vs using the Functions hosting? We're using: HttpTriggers, OrchestationTrigger, QueueTrigger, plus some SignalR output bindings.

3) If so, could a WebJobs SDK enabled WebAPI feasibly be hosted in, say, AKS Kubernetes?

If you could give us a steer it would be great, thanks!!

t-l-k commented 4 years ago

@fabiocav

Have tried this again using func.exe on functions v2.0.12858.0 and it is still broken.

I think this is the root cause of my other issue #5183

And I'll tell you why this is my suspicion, because on a recent automated integration test build, testing my durable functions (which normally run acceptably in accordance with my current requirements), the DryIoc scope disposed style exceptions closed out the log and hung the test runner.

But immediately prior to the DryIoc exceptions, I had a transactional error thrown by TransactionScope, because it tried to enlist two resources into an ambient transaction and thus elevate to a distributed transaction, e.g. probably 2x DbContexts / 2x SQL Connections. This of course threw an exception in the Docker container, because DTC is not supported.

Therefore this leads me to believe that the DI container had injected two variants of the same scoped service, which were then individually addressed/activated within the transaction scope. Now, the merits of using TransactionScope aside, it is library code from another system. Conceptually it should work fine, and indeed works fine on other ASP.NET Core systems, and is activated with all the necessary async flow parameters.

So if we could get to the bottom of why the graphObjects are initialised the way there are in this issue, it may resolve my other one. My application is designed to only have a single transactional resource within a function execution scope, and that is a single DbContext.

t-l-k commented 4 years ago

@t-l-k we haven't had this assigned yet, so time spent has been limited. My hope is to have cycles for this issue next week at the beginning of the coming sprint. I'll keep providing updates here as we work on the problem.

@fabiocav are there any updates to share on this issue? This iissue is of the highest priority for our team, and we need some information to help calculate our current risks.

APIWT commented 4 years ago

@fabiocav We are also very much in need of a fix here. It is extremely high priority. When can we expect this to be released?

fabiocav commented 4 years ago

@APIWT there has been some progress in this area, and some of those were shipped in Azure Functions 3.0 (may not impact all scenarios here, but it has addressed a set of issues around improper scope disposal). That will be in 2.0 in a release in January, but if you have the ability to test in 3.0, it would be great to understand whether that had an impact on your scenarios.

APIWT commented 4 years ago

@fabiocav We are currently using 3.0 and we have the exact same issue described in the issue's description here. When using AddHttpClient, scoped services no longer behave as such. This causes things like the DbContext for EntityFrameworkCore to be pretty useless unfortunately. I feel like this should be extremely high priority considering how wide of an impact it would have.

darrenhull commented 4 years ago

@fabiocav I have also reproduced this in 3.0 and it is also a blocking issue for us too. Are there any workarounds for this issue?

t-l-k commented 4 years ago

@fabiocav As previously mentioned, we've this contained as a known issue and are tracking it with a custom exception. Here's my 30 day to date report rate for it:

image

image

So yeah, it's quite significant. This is a low volume application (mission critical mind), but the occurrence rate of this is basically 1 in 3-4 operations. More prominent apparently, in durable orchestrations. Back of an envelope calculation mind, and not accounting for double counts or whether or not a failed op is included as a successful op under performance.

The application architecture is basically using Mediatr in the middle, the supporting domain and infrastructure layers behind which require correctly scoped dependencies in order to function correctly. Basically, built to Microsoft's recommended microservices patterns architecture advice. I'm just glad that Azure Functions runtime is resilient enough that we're not losing any transactions (we're double logging).

Might cost some people quite a bit more to run it in this manner though, bit like a gas guzzler!

holtalanm commented 4 years ago

We are experiencing this issue, as well. AddHttpClient with generic arguments for creating a service tied to the HttpClient factory causes scoped dependencies to be instantiated more than once per request.

darrenhull commented 4 years ago

@fabiocav can we get an update on the status of this issue please? It’s been going for nearly a year now, a whole year. Is there something the community could do to help? If so give us/me a steer and we can try and take a look. If we could get an update it would be appreciated, We’ve got branches of code that I just cannot pull in because of this issue.

RillistikPete commented 4 years ago

Same here. I have been using Core Tools 2.7 and all of a sudden I'm getting Can't determine project language from files. Please use one of [--csharp, --javascript, --typescript, --java, --python, --powershell]

NoelOConnell commented 4 years ago

We are experiencing this issue, as well.

We are using Mediatr with Azure Functions. We have a service registered as Scoped but we can see that on approximately every 3rd function run the Scoped service is being initialized twice.

jonathanantoine commented 4 years ago

Any update :) ? Facing the same issue :)

jonathanantoine commented 4 years ago

Please don't let up in the void @fabiocav :(

espray commented 4 years ago

See issue https://github.com/Azure/azure-functions-host/issues/5098

ebwinters commented 4 years ago

See my reply here: I fixed it https://github.com/Azure/azure-functions-host/issues/5098#issuecomment-717646798

t-l-k commented 4 years ago

Follow up for context ... spot the moment Functions runtime 3.0.14785 becomes active with appsetting AzureWebJobsFeatureFlags = EnableEnhancedScopes 😉 image

This issue is resolved by #5098.

prashantagrawal3108 commented 1 year ago

I am also facing the same issue, but not using the HttpClient, I have explained my issue on https://github.com/Azure/azure-functions-host/issues/9316.

Can anyone suggest a fix?