Closed jdevillard closed 1 week ago
Based on the context provided, it seems you are looking to improve the performance of Elsa Workflows by addressing the issue of redundant data storage, particularly in the WorkflowExecutionLogRecords table. The SerializedPayload column in this table stores a large amount of data, which can lead to rapid database growth and potentially impact performance due to the size of the data being stored and transferred.
To address this issue, you are proposing to remove redundant data from the SerializedPayload of the WorkflowExecutionLogRecords table. This would involve not storing the entire journal data for each activity, which can include outcomes, exceptions, and other journalization data.
As an alternative, you've considered allowing users to configure what gets stored in the journal, particularly for activities that create outputs. This could be managed through the Persistence Log Record Module, where users can select which outputs to persist.
The impact of this enhancement would be significant storage savings and improved performance due to reduced data transfer to the persistence store. You've provided a benchmark from an HTTP sample showing that the size of a record can range from 683 bytes to 505,923 bytes, so the potential reduction is significant.
To proceed with this enhancement, you would need to evaluate the current implementation and identify where the redundancies occur. Then, you would need to modify the logging mechanism to either remove the redundant data or provide configuration options for users to select what data should be stored.
This change would affect any use case where activities generate outputs that are currently being stored in the SerializedPayload. Activities like Elsa.FlowSendHttpRequest, which add parsed content of HTTP calls to the journal, would be particularly affected.
In summary, the proposed enhancement aims to optimize data storage and improve performance by reducing the amount of redundant data stored in the WorkflowExecutionLogRecords table. This would involve changes to the logging mechanism and potentially adding user-configurable options for data persistence.
This comment was generated by Glime.
closed as done with #5266
Performance Improvement Request
Performance Issue Overview
Is your performance issue related to a specific functionality? Please describe. This performance issue is related to the storage used by the LogRecord for each activity. The database can grow quickly with data that is either unused or redundant with other data.
Proposed Enhancement
Remove redundant data from the SerializedPayload of the WorkflowExecutionLogRecords table.
The SerializedPayload column stores the payload data, which is composed of all the journal data of the activity. This journal data can be composed of:
Alternative Solutions
Describe alternatives you've considered. Have you identified any potential fixes or tweaks yourself? Please share your findings, including why they might not have been a perfect or complete solution. This helps us understand the issue better and consider all possible solutions.
Affected Use Cases
Identify affected use cases: The Elsa.FlowSendHttpRequest activity adds all the parsed content of the HTTP call to the journal. This can be very large. More generally, any activity that creates output will add data to the journal.
Impact of Enhancement
Explain the potential impact: Avoids excessive storage use and improves performance, since less data is sent over the network to the persistence store.
Benchmarks and Metrics
Provide any relevant benchmarks or metrics
For example, in my HTTP sample, the size of a record could go from 683 bytes to 505,923 bytes. The same effect is visible for all activities that create output.
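To make the size difference concrete, here is a minimal Python sketch of how such a comparison could be measured. It does not use Elsa's API: the payload shape and the key names (`Outcomes`, `StatusCode`, `ParsedContent`) are hypothetical stand-ins for a journal payload that carries a large parsed HTTP body.

```python
import json


def payload_size(payload: dict) -> int:
    """Return the size in bytes of the payload serialized as compact UTF-8 JSON."""
    return len(json.dumps(payload, separators=(",", ":")).encode("utf-8"))


def strip_keys(payload: dict, redundant: set) -> dict:
    """Return a copy of the payload without the keys considered redundant."""
    return {key: value for key, value in payload.items() if key not in redundant}


# Hypothetical journal payload: small metadata plus a large parsed HTTP body.
sample_payload = {
    "Outcomes": ["Done"],
    "StatusCode": 200,
    "ParsedContent": "x" * 500_000,  # stand-in for a large response body
}

full_size = payload_size(sample_payload)
trimmed_size = payload_size(strip_keys(sample_payload, {"ParsedContent"}))

print(f"full: {full_size} bytes, trimmed: {trimmed_size} bytes")
```

The exact figures depend on the serializer and the real payload, so the numbers printed here are only illustrative of the order of magnitude.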
Additional Context
Further thoughts
It could also be useful to allow selecting what is stored through the journal (outputs are already selected with the Persistence Log Record Module and stored in the SerializedActivityState).
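As a rough illustration of what such a selection could look like, the following Python sketch applies a per-activity allow-list to journal entries before they are persisted. Everything here is an assumption made for illustration: `JournalPolicy`, its `keep` mapping, and the journal keys are hypothetical and are not part of Elsa.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class JournalPolicy:
    """Hypothetical policy: maps an activity type name to the journal keys to keep."""

    keep: Dict[str, Set[str]] = field(default_factory=dict)

    def apply(self, activity_type: str, journal: dict) -> dict:
        allowed = self.keep.get(activity_type)
        if allowed is None:
            # No rule configured for this activity type: keep everything.
            return dict(journal)
        return {key: value for key, value in journal.items() if key in allowed}


# Keep only the status code for the HTTP activity; other activities are untouched.
policy = JournalPolicy(keep={"Elsa.FlowSendHttpRequest": {"StatusCode"}})

filtered = policy.apply(
    "Elsa.FlowSendHttpRequest",
    {"StatusCode": 200, "ParsedContent": "<large body>"},
)
untouched = policy.apply("SomeOtherActivity", {"Result": 42})
```

An allow-list that defaults to keeping everything preserves the current behaviour for activities without a rule, so a user would only opt in where the journal is known to be large.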