Open abhineet13 opened 2 years ago
Execution metrics are only available in a physical plan, while Spline's main focus is the logical one. There is no direct correlation between the two, so we cannot project physical plan metrics onto the logical operators. The agent does collect some high-level read and write metrics, but they are associated with the job execution event as a whole, not with individual operators. The most we can do is to preserve the original layout for those metrics as represented in the Spark physical plan (currently they are all combined). For example, an execution event with the combined metrics looks like this:
{
  "_created": 1640809837062,
  "durationNs": 1000845670,
  "execPlanDetails": {...},
  "extra": {
    "appId": "local-1640809828533",
    "readMetrics": {  // <-------------- Combined read metrics
      "numOutputRows": 3
    },
    "writeMetrics": { // <-------------- Combined write metrics
      "numFiles": 3,
      "numOutputBytes": 2301,
      "numOutputRows": 3,
      "numParts": 0
    }
  },
  "labels": {...},
  "timestamp": 1640809836954
}
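For reference, the per-operator record counts asked about do exist in Spark itself, but only as SQL metrics attached to physical plan nodes, which is why Spline cannot attribute them to logical operators. Below is a minimal sketch (plain Spark APIs, no Spline involved; the object name is made up) of a QueryExecutionListener that prints those physical-plan metrics after each action:

import org.apache.spark.sql.execution.QueryExecution
import org.apache.spark.sql.util.QueryExecutionListener

// Dumps per-operator metrics (e.g. "number of output rows") from the
// Spark *physical* plan after every successfully executed action.
object PhysicalMetricsListener extends QueryExecutionListener {

  override def onSuccess(funcName: String, qe: QueryExecution, durationNs: Long): Unit = {
    // Walk the executed (physical) plan tree; each SparkPlan node carries its own SQLMetrics.
    qe.executedPlan.foreach { node =>
      node.metrics.foreach { case (key, metric) =>
        println(s"${node.nodeName} -> $key = ${metric.value}")
      }
    }
  }

  override def onFailure(funcName: String, qe: QueryExecution, exception: Exception): Unit = ()
}

// Usage (hypothetical session setup):
//   spark.listenerManager.register(PhysicalMetricsListener)
//   spark.read.json("in.json").filter("value > 0").write.json("out")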
Thanks, combined read/write metrics will be helpful.
The combined ones are already there, see the execution event example above.
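To illustrate how those combined metrics could be consumed, here is a minimal sketch that extracts them from an execution-event JSON payload shaped like the example above (json4s is just a convenient choice here; the object and method names are hypothetical):

import org.json4s._
import org.json4s.jackson.JsonMethods.parse

object CombinedMetricsReader {
  implicit val formats: Formats = DefaultFormats

  // Pulls the combined read/write metrics out of an execution-event JSON document.
  def combinedMetrics(eventJson: String): (Map[String, Long], Map[String, Long]) = {
    val json  = parse(eventJson)
    val read  = (json \ "extra" \ "readMetrics").extractOrElse[Map[String, Long]](Map.empty)
    val write = (json \ "extra" \ "writeMetrics").extractOrElse[Map[String, Long]](Map.empty)
    (read, write)
  }
}

// Usage:
//   val (readMetrics, writeMetrics) = CombinedMetricsReader.combinedMetrics(eventJsonString)
//   println(writeMetrics("numOutputRows"))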
Background [Optional]
Similar to the Spark UI, which shows the record count for each stage, I would like to understand whether such counts can be made available in Spline lineage.
Question
Hi, is it possible to get the count of records for data sources, and before/after transformations like filter and join?