dapr / python-sdk

Dapr SDK for Python
Apache License 2.0
222 stars 125 forks source link

Initial code for decorators and optional naming of workflows and activities #651

Closed mukundansundar closed 8 months ago

mukundansundar commented 9 months ago

Description

Following through with the discussion in #575, I have added partial validation to the chaining and fan out fan in examples alone as the other two require user input and take a longer time to complete also.

@berndverst and @cgillum For now I have left the register_workflow and the register_activity methods as is without any changes, so users can use either a decorator patter or use register_workflow and register_activity for now.

So if we continue to support both, there has to be the case that any parameter or functionality added as part of the decorator will in essence be added as part of the register_* methods, that is the code flow today also.

Issue reference

We strive to have all PR being opened based on an issue, where the problem or feature have been discussed prior to implementation.

part of #635 closes https://github.com/dapr/python-sdk/issues/623

Checklist

Please make sure you've completed the relevant tasks for this PR, out of the following list:

codecov[bot] commented 9 months ago

Codecov Report

Attention: 11 lines in your changes are missing coverage. Please review.

Comparison is base (c08e714) 86.16% compared to head (e967535) 86.43%.

Files Patch % Lines
...ext-workflow/dapr/ext/workflow/workflow_runtime.py 87.87% 8 Missing :warning:
...orkflow/dapr/ext/workflow/dapr_workflow_context.py 60.00% 2 Missing :warning:
...workflow/dapr/ext/workflow/dapr_workflow_client.py 50.00% 1 Missing :warning:
Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #651 +/- ## ========================================== + Coverage 86.16% 86.43% +0.27% ========================================== Files 75 75 Lines 3737 3893 +156 ========================================== + Hits 3220 3365 +145 - Misses 517 528 +11 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

mukundansundar commented 9 months ago

I have specifically not changed the examples/demo_workflow as that is still a validate pattern of registering and using workflows and that can co-exist with the decorator pattern.

bkcsfi commented 9 months ago

Hi,

I like where this is going but I have some suggestions that would make it easier for my use case.

I think it would be helpful to offer a decorator function that sets the function _registered_name attribute independent of having to instantiate a WorkflowRuntime.

All of our code explicitly passes an already instantiated WorkflowRuntime instance from the top-most level to registration functions in lower-level modules. This way we only instantiate a WorkflowRuntime once, at the top level, and not on every module import. Being able to set _registered_name separately from the registration process would match our current architecture.

Then register_workflow() and register_activity() could be adjusted to detect when the passed function already has a _registered_name attribute and use that if the name parameter had not been set. If neither name is passed nor _registered_name is found, then default to function __name__

mukundansundar commented 8 months ago

Hi,

I like where this is going but I have some suggestions that would make it easier for my use case.

I think it would be helpful to offer a decorator function that sets the function _registered_name attribute independent of having to instantiate a WorkflowRuntime.

All of our code explicitly passes an already instantiated WorkflowRuntime instance from the top-most level to registration functions in lower-level modules. This way we only instantiate a WorkflowRuntime once, at the top level, and not on every module import. Being able to set _registered_name separately from the registration process would match our current architecture.

Then register_workflow() and register_activity() could be adjusted to detect when the passed function already has a _registered_name attribute and use that if the name parameter had not been set. If neither name is passed nor _registered_name is found, then default to function __name__

I believe that the name a function registers with for a workflow should be provided only during the register call or be the name of the function itself. I do not think we should add a separate decorator for adding a _registrered_name attribute.

Instead is there any specific scenario where the optional name cannot be set with the register_workflow() or register_activity() method itself?

bkcsfi commented 8 months ago

Hi @mukundansundar

I believe that the name a function registers with for a workflow should be provided only during the register call or be the name of the function itself. I do not think we should add a separate decorator for adding a _registrered_name attribute.

Please consider my current project, which integrates multiple systems together. Each integration is in its own module, and each integration has its own set of workflows and activities. Most integrations have the same basic activities:

etc.

A separate higher-level orchestration module combines workflows and activities together as needed.

In that higher-level module, it's much cleaner to write code like this:

    from wms_integration.dapr.activities import generic_order as generic_order_act
...
    result = yield ctx.call_activity(generic_order_act.parse_file, input)

But because there are multiple parse_file methods in different modules, I cannot name activities and workflows strictly on what they do within the context of their enclosing module, Instead I have to artificially pad out the name to avoid naming collisions due to the way durable_task/ext_workflow use the function __name__.

So I end up with code like this with ugly and long names:

   result = yield ctx.call_activity(
                generic_order_act.generic_order_parse_order_file_act,
                input,
   )

That is why I do not want to use the name of the function itself.

Instantiating a WorkflowRuntime in every module, just so that the registered name can be set by a decorator, is problematic.

Imports should not have side effects, but creating a WorkflowRuntime creates a GrpcEndpoint and a TaskHubGrpcWorker. Every module would have to know the correct host and port to use (which is not always going to be localhost nor the default port).

Instead is there any specific scenario where the optional name cannot be set with the register_workflow() or register_activity() method itself?

Certainly you can and should support optionally setting the name during the register call. But it would be better to ALSO support setting the name with a decorator (a decorator that does not require an existing WorkflowRuntime due to the side effect issues pointed out above).

The reason to use a decorator to set the registered name is so that all concerns related to the function are kept together in its declaration, in one spot: (the python function name, call signature, docstring and dapr registered name).

In my use case, I have modules with only activities. Suppose those modules are 1000 lines long. At the bottom of the module is a function that is called by higher-level code, which passes in WorkflowRuntime so that this module's activities can be registered:

def register_activities(wf_runtime: WorkflowRuntime):
    """register activities"""
    wf_runtime.register_activity(generic_order_parse_order_file_act)
    wf_runtime.register_activity(generic_order_import_order_act)
    wf_runtime.register_activity(generate_email_request_from_summary_act)

But my function declaration is 950 lines higher up in the code, far away from where the dapr registered_name is set. That's not great. Those concerns would be far removed from each other. Maybe this isn't too important for activities, but for workflows it is important because other developers will want to look at the workflow function definition and quickly determine a) the python name they can use to call the workflow directly, or b) the dapr name to use to instantiate a workflow by registered name.

Having those two values far apart just increases developer workload.

In summary, I did not ask you remove any functionality that you have already implemented. I merely proposed that you add support to register_x() so that it doesn't overwrite an already existing _registered_name (unless a distinct name was explicitly passed to the register function, though maybe that should raise an exception instead) and also that you make a decorator available distinct from WorkflowRuntime. You can still have a decorator on WorkflowRuntime if you want to use it that way.

Thanks for reading this far..

mukundansundar commented 8 months ago

@bkcsfi I have modified to add an alternate_name decorator which adds an _alternate_name attribute to the function which can be used instead of __name__ ... Please let us know if this works for your use case.

bkcsfi commented 8 months ago

@mukundansundar Hi thank you for revising the decorator and register functions. The changes meet all my needs.

Do you have an existing issue or discussion for The serialize/deserializer addition will be done as a separate PR. A separate issue will be created to track it. ?

I am using some ugly approaches to try to keep activity and workflow 'input deseralization' and 'output serialization' tied to activity and workflow function declarations, rather than forcing the caller to do that work.

I would appreciate seeing what you're planning there, as well as maybe getting durable task to send() deserialized activity results to the workflow..

Thanks, -Brad

mukundansundar commented 8 months ago

@mukundansundar Hi thank you for revising the decorator and register functions. The changes meet all my needs.

Do you have an existing issue or discussion for The serialize/deserializer addition will be done as a separate PR. A separate issue will be created to track it. ?

I am using some ugly approaches to try to keep activity and workflow 'input deseralization' and 'output serialization' tied to activity and workflow function declarations, rather than forcing the caller to do that work.

I would appreciate seeing what you're planning there, as well as maybe getting durable task to send() deserialized activity results to the workflow..

Thanks, -Brad

I will be creating an issue following a comment/discussion from you in discord.