Azure / durabletask

Durable Task Framework allows users to write long running persistent workflows in C# using the async/await capabilities.
Apache License 2.0
1.51k stars 289 forks source link

Upgrade to a later version of the Storage SDK for DurableTask.AzureStorage #516

Open ConnorMcMahon opened 3 years ago

ConnorMcMahon commented 3 years ago

We currently use v9.3.1 of the Azure Storage SDK, which is very out of date and is no longer maintained.

Upgrading to later versions of the SDK should allow us to get bug-fixes for issues like the deadlock we get that has caused many issues for us historically, as well as get us new features and support for key vault integration, as well as hopefully performance improvements.

Ideally, due to the structure of the Azure Storage team updating the namespaces of their types across major versions, we may be able to accomplish this as a non-breaking change, at least for Durable Functions.

ConnorMcMahon commented 3 years ago

SDK versions to target:

Tables - Azure.Data.Tables v12 (still in beta) Blobs - Azure.Storage.Blobs v12 Queues - Azure.Storage.Queues v12

kzu commented 3 years ago

Kinda mind-boggling that this is still using a 3 year old, deprecated API that's in production right now image

So much for the benefits of (mostly) evergreen SaaS via serverless Azure functions... 🤦 .

dim-37 commented 2 years ago

Because of this crazy issue with WindowsAzure.Storage package where it cannot parse UseDevelopmentStorage=True where true is written in capital I lost all the working day!!!

Why the hell do you still use it???

xperiandri commented 2 years ago

Really. Is it so hard to rewrite a few classes?!

cgillum commented 2 years ago

@xperiandri @dim-37 Swapping NuGet packages is obviously not hard. Navigating conflicting dependencies with other projects (e.g., Azure Functions through Durable Functions) is the thing that's hard - otherwise we'd have done this years ago. The good news is that most of the blocking dependency issues have been resolved and we plan to do this work soon.

EricStG commented 2 years ago

Our Azure storage accounts are recommending we upgrade packages, but since it's by the durable extensions it's not something we can do easily.

image

a2741890 commented 2 years ago

Any update? 🤦

tom-configo commented 2 years ago

Ditto, any update? We're being severely clobbered by #1403

xperiandri commented 2 years ago

The good news is that most of the blocking dependency issues have been resolved and we plan to do this work soon.

@cgillum how soon is that "soon" can happen 🙂?

cgillum commented 2 years ago

Not sure how soon, unfortunately. There's no shortage of work to do and this item unfortunately has a hard time making it to the top of the priority list. If the community would be willing to help with a PR, that could really help get it done sooner! 😉

I suspect one of the biggest challenges will be ensuring backwards compatibility, particularly with our recently added support for managed identity by @wsugarman.

Another thing that will need to keep in mind is that we need to use a version of the new storage SDKs that are consistent with those used by Azure Functions to avoid conflicts in Durable Functions. Using the same storage version as Microsoft.Azure.WebJobs.Extensions.Storage.Blobs, for example, is likely the best path forward.

xperiandri commented 2 years ago

OK. Any other restrictions? I will have a look if I can prepare a PR

xperiandri commented 2 years ago

Has someone already started to do something about that to pick up intermediate result?

xperiandri commented 2 years ago

Can I:

  1. Add .editorconfig and enforce some code-style policies?
  2. Switch to C# 10 file scoped namespaces?
  3. Migrate to Packages.config central NuGet package version management?
  4. Update Microsoft.Extensions to 6.0.*?
  5. Throw await Newtonsoft.Json?
cgillum commented 2 years ago

Thanks @xperiandri for generously offering to investigate this!

Has someone already started to do something about that to pick up intermediate result?

Actually, @wsugarman messaged me privately earlier today and mentioned he was thinking of looking into this as soon as this weekend. He's made big changes to the code before and knows some of the details of how we use the storage SDKs already, making him an excellent candidate.

Add .editorconfig and enforce some code-style policies?

I would say yes only if the actual changes are limited to support the existing code styles. What I wouldn't want to do is create a lot of unrelated code churn as part of a storage SDK upgrade PR. Some projects are also managed by different groups, and those groups may also have different code style preferences.

Switch to C# 10 file scoped namespaces?

Yes, but if it's a separate PR. I wouldn't want to mix that in with unrelated work.

Migrate to Packages.config central NuGet package version management?

I don't have experience with this so I'd have to do some research to understand what kind of impact this might have to our build and release pipelines.

Update Microsoft.Extensions to 6.0.*?

No - we're very restricted in terms of which dependencies we can support due to our very heavy usage by Azure Functions. This is the whole reason why we're on such an old version of the storage SDK to begin with.

Throw await Newtonsoft.Json?

This would be a massively breaking change and I don't think we have sufficient justification to take that on. I think @wsugarman may have looked into this as well and gave up at a certain point because of the impossibility of maintaining backwards compatibility. We'd need to come up with something really creative so that we could continue supporting Newtonsoft.Json while allowing other users to swap in something else.

tom-configo commented 2 years ago

@cgillum @wsugarman @xperiandri I'm very grateful to hear that there's hope for movement on this issue, many thanks.

My experience as a new user of durable functions (and Azure functions in general) who happened to start out with longer-running CPU-bound activities has been truly miserable. I spent a over a week trying to diagnose why the worker process was being silently killed (with no signal via the bindable CancellationToken that's normally used for semi-graceful shutdown). I was about to give up on durable (and regular) functions when my n'th round of googling and github searching finally found this and related issues. The workaround has been to regress to the dark ages of cooperative multitasking, which for us involved modifications deep into a large, complex, multi-library engine. Luckily that's our own code, otherwise we'd have no workaround. I would be very glad to strip out those changes, which introduce a 2 second sleep every 9 seconds or so during a complex multi-stage CPU-bound zero-IO computation. That computation now takes about 6 minutes (so we're wasting around 1 minute sleeping).

Unless a fix is truly imminent, might I suggest flagging this problem/incompatibility in thirty-foot-high letters of fire in the introductory documentation that new users of durable functions are likely to read first?

xperiandri commented 2 years ago

@cgillum what is the minimum .NET we must target? 3.1?

xperiandri commented 2 years ago

@tom-configo is it specific to Durable? I suppose it is a matter of Functions runtime. Maybe some old Functions runtime.

cgillum commented 2 years ago

what is the minimum .NET we must target? 3.1?

.NET Standard 2.0

xperiandri commented 2 years ago

What about the other projects that are .NET Core now?

xperiandri commented 2 years ago

What is the minimum version of Microsoft.Extensions packages? I suppose we target Functions v4 minimums?

cgillum commented 2 years ago

What about the other projects that are .NET Core now?

I'm just talking about the DurableTask.AzureStorage project. This and DurableTask.Core are the two projects that have tight restrictions in terms of dependencies because of how they are used in Azure Functions. Other projects have much more flexibility in terms of dependencies.

What is the minimum version of Microsoft.Extensions packages? I suppose we target Functions v4 minimums?

In general, it's going to be whatever minimum versions we have today. To give more context, we have to continue to maintain compatibility with the Azure Functions v2 runtime since we unfortunately have hundreds of customers still using it. In many situations, upgrades are forced on them and any such forced updates must not break their app.

xperiandri commented 2 years ago

Why can't we increment the major version and drop all functions earlier than v4?

xperiandri commented 2 years ago

So you will have 2 branches:

  1. main → current version
  2. functions v2 compatible → for bug fix only
cgillum commented 2 years ago

Why can't we increment the major version and drop all functions earlier than v4?

Because it's more work for the team to maintain and support multiple versions of the same package. We'd much rather maintain just one version unless we have a strong reason to maintain more. Also, our goal is to move everyone onto the newer storage SDKs, not just the subset of users on Functions V4.

tom-configo commented 2 years ago

is it specific to Durable? I suppose it is a matter of Functions runtime. Maybe some old Functions runtime.

@xperiandri My understanding from the various issue discussions is that it's specific to Durable due to the old version of an Azure storage package that Durable references, and uses for its own state management. As far as I know, pure Functions has no hard-wired Azure storage dependency - for example, you must add a reference to Microsoft.Azure.WebJobs.Extensions.Storage.Blobs to be able to use [Blob(...)] bindings. From my understanding (non-expert), the problematic reference is WindowsAzure.Storage, as seen here: image And WindowsAzure.Storage is old and deprecated (and has the deadlock bug).

xperiandri commented 2 years ago

@cgillum I'll create pull request step by step

xperiandri commented 2 years ago

This is the first one https://github.com/Azure/durabletask/pull/767