dhiaayachi / temporal

Temporal service
https://docs.temporal.io
MIT License

Do not record activity input in the workflow history #356

Open dhiaayachi opened 2 months ago

dhiaayachi commented 2 months ago

Is your feature request related to a problem? Please describe.

Activity inputs consume history space and are not used when recovering workflow state through replay.

Describe the solution you'd like

Do not persist activity inputs; after a failure, reconstruct them by querying or replaying the workflow. Assuming the workflow code is deterministic, the inputs can be reconstructed at any time. The same mechanism can also be used to show inputs in the UI.
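To illustrate why deterministic workflow code makes the inputs recoverable, here is a minimal toy sketch. It does not use the Temporal SDK: the generator-based workflow model and the names `order_workflow` and `replay_inputs` are hypothetical and exist only for illustration. Only activity results are "recorded"; the inputs are re-derived by running the workflow code again.

```python
from typing import Any, Callable, Generator, List, Tuple

# Hypothetical toy model, not the Temporal SDK: a "workflow" is a generator
# that yields (activity_name, activity_input) and receives the activity result.
ActivityCall = Tuple[str, Any]
Workflow = Callable[[Any], Generator[ActivityCall, Any, Any]]


def order_workflow(order: dict) -> Generator[ActivityCall, Any, Any]:
    # Deterministic: each activity input depends only on the workflow input
    # and on the results of earlier activities.
    total = yield ("price", {"items": order["items"]})
    receipt = yield ("charge", {"customer": order["customer"], "amount": total})
    return receipt


def replay_inputs(workflow: Workflow, workflow_input: Any,
                  recorded_results: List[Any]) -> List[ActivityCall]:
    """Re-derive every activity input from the recorded results alone."""
    gen = workflow(workflow_input)
    calls: List[ActivityCall] = []
    try:
        call = next(gen)  # run the workflow up to its first activity call
        for result in recorded_results:
            calls.append(call)
            call = gen.send(result)  # feed the recorded result back in
        calls.append(call)  # an activity that was scheduled but has no result yet
    except StopIteration:
        pass  # the workflow ran to completion
    return calls
```

Calling `replay_inputs(order_workflow, order, results)` with the recorded results returns the same activity inputs the original run produced, which is the property the proposal depends on. It breaks down as soon as the workflow code is non-deterministic or has changed since the original run.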

Describe alternatives you've considered

Store inputs separately from the history and don't return them by default when replaying workflows.
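As a rough sketch of what storing inputs separately could look like from the application side, the example below keeps the payload in an external store and passes only a short key to the activity. Everything here is an assumption for illustration: `PayloadStore` is an in-memory stand-in for a real database or object store, and `process_activity` is a hypothetical activity, not a Temporal API.

```python
import hashlib
import json
from typing import Any, Dict


class PayloadStore:
    """Hypothetical in-memory stand-in for an external blob store."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, payload: Any) -> str:
        data = json.dumps(payload, sort_keys=True).encode("utf-8")
        # Content-addressed key: the same payload always maps to the same key.
        key = hashlib.sha256(data).hexdigest()
        self._blobs[key] = data
        return key

    def get(self, key: str) -> Any:
        return json.loads(self._blobs[key].decode("utf-8"))


def process_activity(store: PayloadStore, payload_key: str) -> int:
    # Only payload_key would be recorded as the activity input;
    # the payload itself is fetched from the store when the activity runs.
    payload = store.get(payload_key)
    return len(payload["records"])
```

With this arrangement a 64-character key replaces the full JSON object in each of the activity inputs. The trade-off is that the external store becomes a dependency whose retention has to outlive the workflow.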

Additional context

A customer is using a query called by each activity as a workaround:

We have a workflow where we pass a JSON object to every activity (assume 20) to perform operations. This increases the DB size as well as the history cache. If the activities could fetch the object with a stub query instead, it would no longer be part of every activity registered with the workflow; each activity would query it from the workflow. Is that the right approach?

dhiaayachi commented 1 month ago

Thank you for your feature request.

It's a great suggestion to optimize history space usage and workflow state recovery by not persisting activity inputs. However, Temporal currently doesn't have this feature out of the box.

Here's how you can work around this limitation:

We appreciate your feedback and will consider this feature for future development.

dhiaayachi commented 1 month ago

Thank you for reporting this issue.

You are correct that activity inputs are persisted in the history, and while they are not used during workflow replay, they do increase the history size.

The workaround you described, using a query to reconstruct activity inputs, is a valid approach. However, it is important to consider the following:

Temporal currently does not have a built-in feature to avoid persisting activity inputs in the history.

The alternative you suggested, storing inputs separately from the history, is a possible solution. You could implement a custom mechanism to store and retrieve activity inputs.

However, if history size is the problem, a more efficient approach might be to analyze your workflow and reduce the size of its activity inputs.

For example, you could:

dhiaayachi commented 1 month ago

Thank you for submitting this feature request. This is a great suggestion.

While Temporal has no option to skip persisting activity inputs, you can work around this by having the activity fetch its data from the workflow with a query.

For example, the activity can use a stub query to fetch the JSON object instead of receiving it as input. This way, the JSON object is not recorded as activity input in the workflow history, which reduces history space consumption. This relies on the workflow code being deterministic, so that the workflow state the activity queries can always be reconstructed.
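The shape of this workaround can be sketched with a small toy model. It does not use the Temporal SDK: `ToyWorkflow`, `REGISTRY`, and `count_fields` are hypothetical names, and in a real application the activity would reach the workflow through an SDK client rather than a dictionary lookup.

```python
from typing import Any, Callable, Dict, List, Tuple


class ToyWorkflow:
    """Hypothetical stand-in for a running workflow; not the Temporal SDK."""

    def __init__(self, workflow_id: str, document: dict) -> None:
        self.workflow_id = workflow_id
        self._document = document
        # Recorded (activity_name, activity_input) pairs.
        self.history: List[Tuple[str, Any]] = []

    def query_document(self) -> dict:
        # Read-only view of workflow state; the query adds nothing to history.
        return dict(self._document)

    def schedule(self, name: str, activity_input: Any,
                 activity: Callable[..., Any]) -> Any:
        # Whatever is passed as input is what ends up in the history.
        self.history.append((name, activity_input))
        return activity(activity_input)


# Plays the role of a client lookup by workflow ID.
REGISTRY: Dict[str, ToyWorkflow] = {}


def count_fields(workflow_id: str) -> int:
    # The activity receives only the ID and fetches the document via a query.
    document = REGISTRY[workflow_id].query_document()
    return len(document)
```

In this model the history records only the workflow ID for each activity, while the document travels through the query. The cost is an extra round trip per activity, and the workflow must be able to answer the query at the time the activity runs.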

You can learn more about querying workflows here: https://docs.temporal.io/workflows/query