Azure / azure-functions-durable-extension

Durable Task Framework extension for Azure Functions
MIT License

Class based entity methods cannot return void #937

Open antogh opened 5 years ago

antogh commented 5 years ago

Description

After reading the docs about entity functions, I decided to start writing some code right away. I am very new to Azure Functions, but the concepts are explained very well in the docs, so I felt confident enough to start. I created a durable function project: an HTTP trigger calls a client function, which in turn calls an orchestrator function, which finally calls an entity function. The first entity function I used is the Counter copied from the official docs. It works perfectly. I can increment the counter and the state is kept correctly. When I debug the application, breakpoints are hit quickly and precisely in the orchestrator and in the entity function.

Happy that everything worked, I created a second entity function and tried to call it from the orchestrator the same way I did with the counter. This time, unfortunately, things went wrong. When I debug, I can see in the func.exe window that the entity is scheduled but never started; then, after 90 seconds, the client function exits with code 202. Once, for reasons I don't understand, all the previously scheduled calls to my new entity function ran while I was debugging: the breakpoints I had set were hit 15 times (all the earlier buffered calls that had never started). So my function can't be completely wrong; something simple must be blocking it from starting. I'd like to know which tools I can use to investigate, because I have no idea what is happening behind the scenes.

NOTE: JavaScript issues should be reported here: https://github.com/Azure/azure-functions-durable-js

Thank God it's C#! :)

Expected behavior

I'd like my second entity function to be called exactly like the first. They are very similar, and I don't understand why one works and the other doesn't.

Actual behavior

The first entity function is scheduled, started, and completed. The second entity function is scheduled but never started or completed; it ran only once, and I don't know why.

Relevant source code snippets

Sorry, copying code from VS2019 into here breaks the formatting a little, but you should still get the idea.

this is the code that works:

Orchestrator call

         var entid = new EntityId(nameof(Counter2), userid);
         var myproxy = context.CreateEntityProxy<ICounter2>(entid);
         myproxy.Add(1);

Entity function

      public interface ICounter2 {
         void Add(int amount);
         void Reset();
         Task<int> Get();
      }

      public class Counter2 : ICounter2 {
         [JsonProperty("value2")]
         public int CurrentValue { get; set; }

         public void Add(int amount) => this.CurrentValue += amount;

         public void Reset() => this.CurrentValue = 0;

         public Task<int> Get() => Task.FromResult(this.CurrentValue);

         [FunctionName(nameof(Counter2))]
         public static Task Run([EntityTrigger] IDurableEntityContext ctx)
             => ctx.DispatchAsync<Counter>();
      }

this is the code that doesn't work:

Orchestrator call

         var entid = new EntityId(nameof(Gogv), userid);
         var myproxy = context.CreateEntityProxy<IGogv>(entid);
         string result = await myproxy.ExecIntent(reqPayload);

Entity function

      public interface IGogv {
         Task<string> ExecIntent(string request);
      }

      public class Gogv : IGogv {

         [JsonProperty("value3")]
         public int CurrentValue { get; set; }

         [FunctionName(nameof(Gogv))]
         public static Task Run([EntityTrigger] IDurableEntityContext ctx)
             => ctx.DispatchAsync<Gogv>();

         public Task<string> ExecIntent(string reqStr) {
            return Task.FromResult("not implemented");
         }
      }

Known workarounds

None so far.

App Details

A very simple app, just to start experimenting with durable functions.

If deployed to Azure

No, I'm testing locally.

antogh commented 5 years ago

Hi @cgillum I have some updates.

I have found a workaround that makes my code work and gets my entity function called.

I now use a different calling convention, and it seems to work.

This is the caller code:

      var entid = new EntityId(nameof(Myentity), userid);
      string result = await context.CallEntityAsync<string>(entid, "execintent", reqPayload);

and this is the new entity function code:

      [FunctionName("Myentity")]
      public async static void Myentity([EntityTrigger] IDurableEntityContext ctx) {
         int currentValue = ctx.GetState<int>();

         switch (ctx.OperationName.ToLowerInvariant()) {
            case "execintent":
               string req_payload = ctx.GetInput<string>();
               var gogv = new Gogv();
               string result = await gogv.ExecIntent(req_payload);
               ctx.Return(result);
               break;
         }

         ctx.SetState(currentValue);
      }

Using this calling convention, the entity function gets called, receives the parameter, and returns the result. Now I am starting to implement a more complicated state and logic. This is an experiment in porting an existing application that uses Reliable Actors (Service Fabric). I love this programming model, and when I learned about durable entities I was immediately curious to try them. Having a very similar programming model on a serverless system is really great, and I think it is unique to Azure (ahead of other cloud vendors).

The reason I tried the other calling convention, with a proxy and an interface, is that it closely reminded me of Service Fabric Reliable Actors, which I like very much.

I know this is probably not the right place to ask questions, but I'd like to know whether entity functions implement some sort of single-threaded execution, taking calls from a queue (a separate queue for each actor ID) one at a time like SF actors do (not taking another call from the queue until the previous one has finished). I think I read that somewhere, but I am confused because I also saw that the docs recommend locking in the caller to avoid race conditions. I am very new to durable functions, so I might find the answer myself by studying them more. Still, if someone wants to clarify this for me, it would speed up my learning.

Thank you

cgillum commented 5 years ago

Thanks @antogh for the feedback! We actually take lots of questions here, so no worries about that.

Yes, entity functions have single-threaded execution such that operations are processed in-order and one-at-a-time. That behavior by itself is what prevents race conditions. If two messages arrive at an entity at the same time, they'll still be processed one-at-a-time. We mention this in the entity trigger binding documentation.
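To make the "in-order and one-at-a-time" behavior concrete, here is a minimal, self-contained C# sketch of the general idea: one mailbox per entity, drained by a single consumer. It is only a toy model and makes no claim about how the Durable Functions runtime is implemented; `ToyEntity`, `Signal`, `Complete`, and `RunAsync` are hypothetical names invented for this illustration.

```csharp
using System;
using System.Threading.Channels;
using System.Threading.Tasks;

// Toy model of a single entity: a private mailbox drained by one consumer loop.
// NOT the Durable Functions implementation, only an illustration of the idea.
public sealed class ToyEntity
{
    private readonly Channel<Func<int, int>> mailbox =
        Channel.CreateUnbounded<Func<int, int>>();
    private int state;

    // Senders never touch the state directly; they only enqueue an operation.
    public void Signal(Func<int, int> operation) => mailbox.Writer.TryWrite(operation);

    // Marks the mailbox as finished so the consumer loop can end.
    public void Complete() => mailbox.Writer.Complete();

    // The single consumer applies operations in arrival order, one at a time.
    public async Task<int> RunAsync()
    {
        await foreach (var operation in mailbox.Reader.ReadAllAsync())
        {
            state = operation(state);
        }
        return state;
    }
}

public static class Program
{
    public static async Task Main()
    {
        var entity = new ToyEntity();
        Task<int> consumer = entity.RunAsync();

        // Many concurrent senders, with no locking on the caller side.
        var senders = new Task[100];
        for (int i = 0; i < senders.Length; i++)
        {
            senders[i] = Task.Run(() => entity.Signal(value => value + 1));
        }

        await Task.WhenAll(senders);
        entity.Complete();

        Console.WriteLine(await consumer); // prints 100
    }
}
```

Because only the consumer loop ever reads or writes `state`, the 100 concurrent increments cannot interleave, which is the property described above.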

The locking documentation you saw is actually a slightly different feature we call critical sections. I wouldn't worry about that for now.
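For reference, a critical section in an orchestrator might look like the sketch below. It assumes the Durable Functions 2.x C# API (`IDurableOrchestrationContext.LockAsync` and `CallEntityAsync`) and a `Counter` entity exposing `Add` and `Get` operations like the one from the docs; the orchestrator name `TransferOne` and the entity keys `source` and `target` are hypothetical. It runs only inside an Azure Functions host, not as a standalone program.

```csharp
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;

public static class TransferOrchestration
{
    // Hypothetical orchestrator: moves one unit between two Counter entities.
    [FunctionName("TransferOne")]
    public static async Task<bool> Run(
        [OrchestrationTrigger] IDurableOrchestrationContext context)
    {
        // Assumed entity name and keys, chosen only for this illustration.
        var source = new EntityId("Counter", "source");
        var target = new EntityId("Counter", "target");

        // Critical section: both entities stay locked until the returned
        // IDisposable is disposed at the end of the using block.
        using (await context.LockAsync(source, target))
        {
            int available = await context.CallEntityAsync<int>(source, "Get");
            if (available < 1)
            {
                return false;
            }

            await context.CallEntityAsync(source, "Add", -1);
            await context.CallEntityAsync(target, "Add", 1);
            return true;
        }
    }
}
```

The lock matters here because the orchestrator reads one entity and then updates two; a single operation on a single entity needs no lock, since the entity already processes its operations one at a time.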

I'm glad you're unblocked, but I still want us to investigate why your previous implementation didn't work. Adding @sebastianburckhardt for FYI.

antogh commented 5 years ago

Hi @cgillum thanks for the explanation.

Some more feedback from my newbie journey with durable entities.

I am continuing to experiment with durable entities, and so far my experience is good. I am converting a small Service Fabric Reliable Actors application to durable functions. It's a Google Assistant voice application, published one month ago, that already has many active users. Since it's strongly user-centric (preferences, saved personal notes, etc.), you can imagine it's a good fit for durable entities.

To my surprise, durable entities handle a complex state with many collections and other types without problems. On my local emulator the entity function is sometimes a bit slow to start, but most of the time it is fast.

I had a problem when changing the state from the integer counter taken from the sample to my complex state. ctx.GetState threw an exception because a different type was already stored for that entity ID from previous executions. I tried to find the state in the local Azure Storage emulator to delete it, but I couldn't find it. Where do you persist the state in the local dev emulator?

In the end I changed my entity name to get a different entity ID, and that solved the problem. Now it is working fine and I am converting the logic from SF. It's easier than I expected. I can't wait to finish the application and publish it on Azure to see the performance and cost.

stevekerrcarlisle commented 4 years ago

I was recently stuck on a similar issue (latest version of everything) and just found the solution. I'm posting it here to help others. My problem was that I could successfully get an orchestrator to call my durable entity functions, but the orchestrator would never continue processing after the call. It appeared to just keep waiting for a response. And yet the orchestrator also exited successfully?

The problem? My orchestrator was returning void. When I rechecked the example code, I noticed the example was returning Task. I changed my code to do the same and it started working as expected. I don't know if this counts as a bug.

        [FunctionName("ExampleOrchestrator")]
        public static async Task RunOrchestrator(
            [OrchestrationTrigger] IDurableOrchestrationContext context, ILogger log)

SteveLockley commented 4 years ago

Steve's comment is the answer. I just spent 3 hours trying to figure out why my CallEntityAsync and CreateEntityProxy calls just disappeared into a black hole when executed. You absolutely cannot specify the orchestrator return type as void. If it is changed to Task, everything works as documented.
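As a sketch of the shape the two comments above describe, the orchestrator below returns a Task-based type and awaits the entity call. It assumes the Durable Functions 2.x C# API and reuses the `Gogv` entity and `ExecIntent` operation from the original report; passing the entity key as the orchestration input and the `"payload"` argument are assumptions made only for this example. It needs the Azure Functions host to run.

```csharp
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;

public static class ExampleOrchestration
{
    // Returns Task<string>, not void, so there is a task that represents
    // the completion of the orchestrator code after the awaited call.
    [FunctionName("ExampleOrchestrator")]
    public static async Task<string> RunOrchestrator(
        [OrchestrationTrigger] IDurableOrchestrationContext context)
    {
        // Assumption: the entity key (for example a user ID) is the orchestration input.
        string userId = context.GetInput<string>();
        var entityId = new EntityId("Gogv", userId);

        // Two-way call: the orchestrator resumes here once the entity has replied.
        return await context.CallEntityAsync<string>(entityId, "ExecIntent", "payload");
    }
}
```

In C# generally, an `async void` method hands its caller nothing to await, so the caller cannot observe when the code after the first `await` finishes. Whether that is the exact mechanism behind the behavior reported here has not been confirmed in this thread.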

ConnorMcMahon commented 3 years ago

I will attempt to validate this.

If this is the case, I will see if there is a simple fix. If not, we can add an analyzer case to catch this.