Azure / azure-functions-host

The host/runtime that powers Azure Functions
https://functions.azure.com
MIT License
1.94k stars 442 forks source link

Subtle differences in DI between WebJobs.Script.WebHost and Microsoft.Azure.WebJobs/Microsoft.Extensions.DependencyInjection #3399

Closed ielcoro closed 5 years ago

ielcoro commented 6 years ago

Tryin to create a extension that worked both on Microsoft.Azure.WebJobs and Azure Functions (Script.WebHost), I found that there were differences between how the DI container behaves on Azure Functions on how it does in WebJobs (the standard behavior from Microsoft.Extensions.DependencyInjection), that makes a pain to create a extensions that works on both, or to migrate an application that worked in ASP.NET to functions:

Scenario Script.WebHost Webjobs/Microsoft.Extensions.DependencyInjection
A constructor parameter dependency is not registerd with DI Silently injects null Throws an exception explaining that the dependency could not be resolved
A scoped delegate factory is used Any dependency resolved in the factory becomes a singleton Any dependency resolved from the factory respect their registered lifetime

I could work around 1, but I couldn't do anything than use another separate DI container instance for my extension services for 2.

To illustrate the second scenario example, let's think that we want to use the well-known MediatR library, this library uses a delegate for resolving handlers, and we want those handlers to be scoped for whatever reason (compatibility, isolating the object graph between function instancesโ€ฆ) :

delegate object ServiceFactory(Type type);

builder.Services.AddScoped<IMediator>();
builder.Services.AddScoped<ServiceFactory>(provider => (type) => provider.GetRequiredService(type));

By default Mediator register it dependencies as scoped. The factory, in the default WebJobs/Microsoft DI implementation, ends using the scope from wich it was resolver and passing it as the provider parameter to the factory lambda. However from my tests, in Azure Functions, it always uses the root service provider instance, thus making any scoped dependency from "type" effectively a singleton.

Looks like this line in DryIOC could be causing this severe difference in behavior.

Iยดm sure the team has a reason for deviating from the standard DI solutions that the parent WebJobs project happens to be using, however wouldn't be better and much simpler, to have a consistent behavior between Azure Functions - WebJobs and ASP.NET default implementations? That would ease the creation of extensions and reduce the changes needed to migrate existings ASP.NET projects.

DanielSmon commented 6 years ago

+1 on this issue. I've spent a while investigating DbContext exceptions because of this. My case is similar in that each MediatR handler instance receives the same DbContext even though the Handler instance is unique and is called from a unique service scope.

Thanks @ielcoro for identifying the root cause.

jtemperv commented 5 years ago

Using the same pattern and libraries. Is this still an active issue?

ielcoro commented 5 years ago

@jtemperv Yes, that line hasn't changed, it's still and issue, and I think it will not be resolved until this is done at the webjobs-sdk level first: https://github.com/Azure/azure-webjobs-sdk/issues/2078 and then they start adapting those changes to azure-functions

brettsam commented 5 years ago

@ielcoro -- can you try the scoped scenario again? Scoped services weren't supported until this PR went in a few weeks ago: https://github.com/Azure/azure-webjobs-sdk/pull/2133

ielcoro commented 5 years ago

@brettsam Thanks for looking into this.

Tried, now scoped services work BUT the issue with scoped delegate dependencies is still happening. I created a sample repo here: https://github.com/ielcoro/ScopedDIDemo

When the scoped service is a dependency of a registered factory, it is injected as a Singleton. That is happening because in this line https://github.com/Azure/azure-functions-host/blob/5294afc32a34e5783caade40a2629427cee0418a/src/WebJobs.Script.WebHost/DependencyInjection/DryIoc/DryIocAdapter.cs#L156 DryIOC is handing the root IServiceProvider to the factory function, instead of using the current scoped one.

The good news is that, the first behavior difference ("A constructor parameter dependency is not registerd with DI" has been fixed, and now it throws like the builtin one.

However, what worries me most, is that the longer the functions runtime keeps building on top of a different IOC library, instead of building on top of using .NET Core one (Microsoft.Extensions.DependencyInjection), more of this subtle issues could appear, specially in complex .net ecosystem libraries. Those differences can make both hard to migrate existing apps to functions and the functions framework itself costly to mantain, as customers report weird issues when using libraries like EF, Mediator, Health and many others.

PureKrome commented 5 years ago

Just want to +1 this comment from @ielcoro

However, what worries me most, is that the longer the functions runtime keeps building on top of a different IOC library, instead of building on top of using .NET Core one (Microsoft.Extensions.DependencyInjection), more of this subtle issues could appear, specially in complex .net ecosystem libraries.

meixger commented 5 years ago

I'm also using MediatR and run into DbContext errors -- exactly as @ielcoro described above.

๐Ÿ™Hopefully this will be addressed in the next release as mentioned in Dependency Injection support for Functions #3736 ?

For transparency, and to better set expectations, we're currently targeting Sprint 48 (due by May 1st) to complete this.

[Edit] Seeing activity in #3736 maybe @fabiocav can drop a comment here?

espray commented 5 years ago

+1 blocked on the same issue

APIWT commented 5 years ago

I have the same issue with no usage of a Delegate Factory. I think this issue is just with scoped services in general. I'm not exactly sure what causes them to get treated as a singleton but this is blocking us.

espray commented 5 years ago

@fabiocav @jeffhollan any update?

fabiocav commented 5 years ago

@espray apologies for the delay on providing an update. We'll triage this again today (as this wasn't addressed on the sprint it was assigned to) so we can provide an update ASAP.

APIWT commented 5 years ago

@fabiocav Yes please and thank you very much! :) We have so much code in our project currently that just does the DI and we have been looking forward to removing it.

jwisener commented 5 years ago

ugh, we have just hit this issue as well. We are really trying to use the power of azure functions in our current sprint.

espray commented 5 years ago

@jwisener because of this issue, I would recommend using attribute/binding DI and NOT the AzFunc/WebJob DI. Been using attribute/binding DI for better part of a year for a multi-tenant SaaS product with out any problems.

https://github.com/espray/azure-webjobs-sdk-extensions -OR- https://github.com/BorisWilhelms/azure-function-dependency-injection

jwisener commented 5 years ago

@espray thanks for the links , I will try to do that. Currently we are attempting to use MediatR for events & handlers via domain events for transnational saving/publishing. The problem I'm encountering is when mediatR calls the eventhanlders the EF Context that is injected is not the right one, it seems like it's a new one (or maybe the singleton, I am still not sure yet). Anyway, we lose any data changes that were made in the event handler run. (in our case adds that occured to the dbcontext).

espray commented 5 years ago

@fabiocav @jeffhollan any update?

fabiocav commented 5 years ago

@espray this is something that is high priority on our list. As soon as there's an update, I'll keep providing updates here, so you won't miss if you're following the issue. I'm hoping to have more details by the end of the week.

PureKrome commented 5 years ago

I'm hoping to have more details by the end of the week.

๐Ÿ™ ๐Ÿ™ ๐Ÿ™ ๐Ÿ™ ๐Ÿ™ Please be that Microsoft.Extensions.DependencyInjection is replacing DryIOC and balanced is returned to the Order of Things ... ๐Ÿ™ ๐Ÿ™ ๐Ÿ™ ๐Ÿ™ ๐Ÿ™

Side Thoughts: So awesome seeing people using MediatR + AzFunctions

fabiocav commented 5 years ago

Just to provide an udpate, as promised (since I'll also be out for a couple of days).

We have a fix for this in validation right now. Because of the nature of the issue, full validation may not be completed before we cut the next release (which happens early next week), so it's possible that it wouldn't be on the next drop, but on the following one. If that happens, we'll investigate the possibility of making this available for validation on Azure with an opt-in runtime version.

APIWT commented 5 years ago

@fabiocav Can you link to the branch?

jtemperv commented 5 years ago

Hi @fabiocav any updates if it made the release or is available as an opt-in feature? We're really struggling with a project that needs to go-live. Having more and more difficulties postponing it. On the other hand I don't want our engineers investing too much time in a workaround if the fix is just days away instead of weeks. Thanks for your understanding.

fabiocav commented 5 years ago

@jtemperv the release for this sprint is currently going through validation and this fix was not included. The release cadence is 2 weeks (The estimated for Sprint 55, which is the current, is to start validation around around August 7th and deployment around August 12th), but I'll see how feasible it is to make this available sooner.

@APIWT I'll have a PR linked to this here soon.

APIWT commented 5 years ago

@fabiocav Any update or link to a PR?

fabiocav commented 5 years ago

Apologies for the delay. We've been dealing with deployment validation/issues for the current release which pushed some of this work back. I'll have a basic PR open and linked with the core change we'll be making tomorrow. I'll flag the build so people watching the issue can deploy a private functions runtime with those changes for validation. We'll also have a version of the CLI available for selection in VS on a preview channel.

Just to set the expectations, because of the deployment delay related to what I've mentioned above, combined with people with context on this issue (myself included) being out of the office a period of time this month, we'll likely not have this deployed to production until early September. I'll keep this issue updated as that becomes more concrete, but my goal is to enable full validation, locally and on Azure, before that deployment. Thanks for the patience!

fabiocav commented 5 years ago

Following up on the previous message.

We've made a preview version available with the base set of changes to address the last remaining issue tracked here (scoped delegate factory). We have a process to make using those bits a little easier, but this is something we're testing out, so there's no simple automation, requiring the following steps to opt in:

This will get you running with the new version that contains the fixes.

To test this on Azure, you need to deploy a matching version of the runtime following the instructions here to deploy the following runtime (dedicated only): https://github.com/Azure/azure-functions-host/releases/download/v2.0.12620/Functions.Private.2.0.12620-prerelease.no-runtime.zip

I'll be unavailable for a while, but will keep an eye on this issue as I can. Please provide feedback if/when you have a chance to test this. Thanks!

meixger commented 5 years ago

Success! I was able to get MediatR's scoped dependency factory to work like expected (re: jbogard/MediatR.Extensions.Microsoft.DependencyInjection#74.

I tested locally and on Azure with a sample project. Will test with a real app soon...

jtemperv commented 5 years ago

@fabiocav thanks for the update. I'll check this out asap. Enjoy your holidays!

t-l-k commented 5 years ago

@fabiocav

This will get you running with the new version that contains the fixes.

Having followed your advice, starting such a project yields the following startup:

Azure Functions Core Tools (2.7.1558 Commit hash: eff964a0c658b5752ea77764111e17eefcdc3f8c)
Function Runtime Version: 2.0.12625.0

But I'm guessing I want this for the pre-release version, because as you say this branch https://github.com/Azure/azure-functions-host/tree/asynscopeccontext is not yet merged:

Function Runtime Version: 2.0.12620.0

Confused. What host process should be running this? I've func.exe installed via the Functions CLI tools. This is no good, I'm guessing.

Do I need to dotnet Microsoft.Azure.WebJobs.Script.WebHost.dll ? Is the 1.0.30-beta1 SDK meant to resolve a dependency to a patched version of this dll?

@meixger , @jtemperv are you able to clarify at all?

Thanks

update: In my ignorance, I thought maybe specifying "FUNCTIONS_EXTENSION_VERSION": "2.0.12620" in the configuration might help, but it did not.

Eventually I settled on copying the base Dockerfile at https://github.com/Azure/azure-functions-docker/blob/master/host/2.0/stretch/amd64/base.Dockerfile and customised it to build commit 7fd035c2d041c35fb76e10291e3ce3ac59713076 - then published my function library into it and went from there.

Attempting to run a prerelease WebHost.dll build from Visual Studio presented me with a number of challenges, primarily because it did not respect the local.settings.json file, as that is a func.exe convenience feature. Consequently it meant layering a number of environment variables which are not easily defined in VS.

meixger commented 5 years ago

@t-l-k You are right - as of now it doesn't work anymore on my machine - evidently because the Function Runtime auto updated.

I'm was able to configure it manually and now i get this version information on startup:

Azure Functions Core Tools (2.7.1513 Commit hash: 7c2ae304b41dcfb7bcee14d9cbbf12c73f78ca0a)
Function Runtime Version: 2.0.12620.0
  1. Download the Artifacts (drop.zip 820MB) from the build server and extract the Azure.Functions.Cli.win-x64.2.7.1513.zip.

  2. Then follow the instructions (solution#2) in setting-azure-functions-runtime-version-for-local-dev-with-visual-studio-2017 (Stackoverflow) to point your Visual Studio debug profile to this runtime.

t-l-k commented 5 years ago

@meixger thanks, this works for me now also. I hadn't realised there was a prerelease build for azure-functions-core-tools. Definitely wanted to keep using func.exe for the convenience factor.

VDKB commented 5 years ago

@meixger We just did the test and it works, but is has to be version Azure.Functions.Cli.win-x64.2.7.1513. Later versions don't work (tested with Azure.Functions.Cli.win-x64.2.7.1575).

jtemperv commented 5 years ago

@fabiocav tested and confirmed, the fix works. Looking forward to see this becoming available in consumption plans. :) thanks for the effort!

t-l-k commented 5 years ago

Has anyone managed to get this working hosted? It works fine in localhost containers and the patched func.exe, but I've tried uploading a test container and Azure just isn't having.

func init MyFunctionProj --docker
func new --name MyHttpTrigger --template "HttpTrigger"

Then updating the SDK to 1.0.30-beta1 and Azure Functions v2-preview ...

  1. Built from the template Dockerfile - works fine, functions work, /admin/host/status reports fine
  2. Built from the template Dockerfile - but with rebuilt official base image (no changes) - looked fine for a while, then suddenly wasn't, functions won't enumerate, /admin/host/status still reports fine though
  3. Built from the template Dockerfile - but rebuilt official base image using source 7fd035c2d041c35fb76e10291e3ce3ac59713076 and same as (2), functions won't enumerate, /admin/host/status reports fine though (and is running what I want, 1.2.12620).

Is there some sort of integrity check on the images when hosted in Azure? I'm puzzled why it appeared to momentarily work in (2), but alas I may just have been fooled by a possible cached result.

I thought Docker might be the easier approach here, and allows me to keep consistent across every environment and dev/test workstations. I'll try an AppService tomorrow.

Anybody else have luck? I've a lot of code needing QA cycles in a deployed environment.

UPDATE: So it definitely does work in an AppService. When running as a container, the detailed diagnostic logging in Kudu's Log Stream (had to enable debug level logging as per .NET Core logging configuration) just reported that "0 functions were found". They were definitely there though, and the image worked locally. Probably some kind of integrity check? Couldn't see anything in azure-functions-host code (didn't look long though).

The advantage of that containerised Linux plan is the sweet 66% discount ... but alas, expensive AppServices it is.

handeeadiguzel commented 5 years ago

Is the deployment still planned for early September ? Looking forward to using this in production.

t-l-k commented 5 years ago

Just a heads up, have been working with this patched func.exe and have observed some more DI weirdness. Weirdness by way of the wrong dependencies being injected within the scope. Like many others who have been bitten by this bug, our implementation leverages Mediatr to keep a clean separation of concerns at a command level.

As it's the end of the week, I haven't yet debugged the issue fully, but my early observation appears to indicate that a number of async style continuations are involved. This weirdness manifests in a behaviour that is dependent on a number of IO waits.

What's odd is that normally the first time the behaviour runs the wrong dependencies are resolved. Yet, on subsequent executions the right dependencies are resolved.

If I have the scope, I'll see if I can repro in a repo next week, but because of the problems we're experiencing I may have to work on something else. It's making testing very difficult.

This might be another issue TBH and of course I have to rule out our own code as the issue ;)

t-l-k commented 5 years ago

Okay, after spending some time on some other stuff, I've come back around to this. I've spent all day narrowing down what my problem is. I've affectionately called it "scope creep", because the DI framework doesn't appear to be honouring the scope (affection is really not what I'm feeling in regards to these DI issues).

It all comes down to this specific registration using Microsoft.Extensions.Http (2.2.0):

builder.Services.AddHttpClient<IScopeCreepService, ScopeCreepServiceClient>();

I've repro'd it at: https://github.com/t-l-k/azure-functions-di-bug.git

I've added the prerelease CLI tools Azure.Functions.Cli.min.win-x64.2.7.1513.zip into the repo and they're expanded in a Directory.Build.props Unzip build step (requires > MSBuild 15.8), the project references the contained func.exe as the debug executable. I've done this because the Artifact drop is no longer availables. perhaps @fabiocav we could have another build (retained)? https://github.com/Azure/azure-functions-core-tools/commit/7c2ae304b41dcfb7bcee14d9cbbf12c73f78ca0a -

You have to run it in this version, because without the 2.0.12620.0 extension, it doesn't run at all (everything is a singleton chaos).

When running, the bug manifests in the class MediatrWarmupExtension (which was a behaviour I was exploring to workaround the issue), line https://github.com/t-l-k/azure-functions-di-bug/blob/23919854b3791311ae48bcc67cc4f3f5a183386d/Extensions/MediatrWarmupExtension.cs#L40

var handler1 = scopedProvider.GetRequiredService<IRequestHandler<ScopeCreepCommand, ScopeCreepCommandResult>>();
var handler2 = scopedProvider.GetRequiredService<IRequestHandler<ScopeCreepCommand, ScopeCreepCommandResult>>();
// ^^ The bug manifests at this point ^^ handler1 and handler2 dependencies should be the same. 

The trace level output of the application displays this as part of an extension initialisation:

Azure Functions Core Tools (2.7.1513 Commit hash: 7c2ae304b41dcfb7bcee14d9cbbf12c73f78ca0a)
Function Runtime Version: 2.0.12620.0
Can't determine project language from files. Please use one of [--csharp, --javascript, --typescript, --java, --python, --powershell]
[10/09/2019 15:58:47] Building host: startup suppressed:False, configuration suppressed: False
[10/09/2019 15:58:48] [0] graphObjectA 15948253 (from 66477871 (DependencyGraphAlpha)) 4
[10/09/2019 15:58:48] [0] graphObjectB 53858013 (from 66477871 (DependencyGraphAlpha)) 4
[10/09/2019 15:58:48] [0] graphObjectA 8857874 (from 51797632 (DependencyGraphBravo)) 4
[10/09/2019 15:58:48] [0] graphObjectB 14008467 (from 51797632 (DependencyGraphBravo)) 4
[10/09/2019 15:58:51] [1] graphObjectA 8857874 (from 29236951 (DependencyGraphAlpha)) 4
[10/09/2019 15:58:51] [1] graphObjectB 14008467 (from 29236951 (DependencyGraphAlpha)) 4
[10/09/2019 15:58:51] Warmed up IRequestHandler for Func.Canary.Application.ScopeCreepCommand
Hosting environment: Production
Content root path: D:\source\repos\t-l-k\Func.Canary\bin\Debug\netcoreapp2.2
Now listening on: http://0.0.0.0:7071
Application started. Press Ctrl+C to shut down.

Commenting out the AddHttpClient registration behaves thus:

Azure Functions Core Tools (2.7.1513 Commit hash: 7c2ae304b41dcfb7bcee14d9cbbf12c73f78ca0a)
Function Runtime Version: 2.0.12620.0
Can't determine project language from files. Please use one of [--csharp, --javascript, --typescript, --java, --python, --powershell]
[10/09/2019 15:59:49] Building host: startup suppressed:False, configuration suppressed: False
[10/09/2019 15:59:50] [0] graphObjectA 50788593 (from 59927501 (DependencyGraphAlpha)) 4
[10/09/2019 15:59:50] [0] graphObjectB 50517987 (from 59927501 (DependencyGraphAlpha)) 4
[10/09/2019 15:59:50] [0] graphObjectA 50788593 (from 54244775 (DependencyGraphBravo)) 4
[10/09/2019 15:59:50] [0] graphObjectB 50517987 (from 54244775 (DependencyGraphBravo)) 4
[10/09/2019 15:59:52] Warmed up IRequestHandler for Func.Canary.Application.ScopeCreepCommand
Hosting environment: Production
Content root path: D:\source\repos\t-l-k\Func.Canary\bin\Debug\netcoreapp2.2
Now listening on: http://0.0.0.0:7071
Application started. Press Ctrl+C to shut down.

Note the absence of graphObjectA and graphObjectB preceded by [1]. With the bug active, a second service DependencyGraphAlpha is constructed, in the same scope, for the 2nd transient handler resolution. The other parameter, DependencyGraphBravo, is reused correctly across both instances.

Here are all the registrations in the example:

services.AddMediatR(typeof(ScopeCreepCommandHandler));
services.AddScoped<DependencyGraphAlpha>();
services.AddScoped<DependencyGraphBravo>();
services.AddScoped<GraphObjectA>();
services.AddScoped<GraphObjectB>();
services.AddScoped<IGraphInterfaceA>(sp => sp.GetRequiredService<GraphObjectA>());

I did try experimenting with service provider sub-scopes, but doing so appeared to offend the Function's host runtime. I wish to point out, that whilst I have materialised this in an extension, the exact same behaviour was being exhibited when resolving a HttpClient reference within a durable function activity call.

My temporary workaround is to just resolve all of my HttpClient consumers in an initialisation step, for the time being, but obviously it's not ideal I don't know where else this bug may materialise.

Please fix! :cry:

fabiocav commented 5 years ago

@t-l-k I've moved your comment to a new issue to make it easier for us to iterate and discuss. It's also a bit different than what we're tracking as part of this issue, so I wanted to make sure we had a dedicated thread.

For the updates for this issue, we'll be releasing those changes with the current sprint deployment.

PureKrome commented 5 years ago

@t-l-k and others : issue #4914 is the new thread, for those who wish to watch that issue.

t-l-k commented 5 years ago

@fabiocav sorry to bother but can you clarify "current sprint deployment"?

Can we expect to target it App Services once it appears in "Releases" using the FUNCTIONS_EXTENSION_VERSION parameter?

Thanks!

fabiocav commented 5 years ago

@t-l-k ; the release I'm referring to would be the release for Sprint 58. That sprint comes to an end on 09/18, and at that point, we begin the validation/release activities, which usually takes about a week or so (for global deployment).

Can we expect to target it App Services once it appears in "Releases" using the FUNCTIONS_EXTENSION_VERSION parameter?

Once that release is fully deployed, existing function apps will be automatically updated, there's no need to take any of the steps for a private deployment or to make any modifications to the FUNCTIONS_EXTENSION_VERSION setting.

VDKB commented 5 years ago

Is it possible to give the current status of this? Just to know when and how we can go to production with our project.

Thanks

fabiocav commented 5 years ago

The changes are going out with this release: https://github.com/Azure/app-service-announcements/issues/201

This started rolling to production environments yesterday and we expect the deployment to be completed by the end of the week.

fabiocav commented 5 years ago

Closing this as the changes to address the issues originally tracked here have been deployed. For other problems/requests in this area, please open new issues.