splitgraph / seafowl

Analytical database for data-driven Web applications 🪶
https://seafowl.io
Apache License 2.0

Add Prometheus metrics to MemoryManager #518

Closed mildbyte closed 2 months ago

mildbyte commented 2 months ago

Make the metrics feature non-optional. There's a lot of code coming that exposes metrics, and it would be painful and pointless to keep the ability to opt out of compiling metrics support at the cargo feature level.

Add a special MemoryPool that wraps the default DataFusion MemoryPool (that setting defaults to GreedyMemoryPool, so we construct it explicitly) and tracks the total bytes allocated and released by each DataFusion query tree node.

Also replace all uses of RwLock<HashMap> with DashMap, as @gruuya proposed.
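
For readers unfamiliar with the wrapping approach, here is a minimal, std-only sketch of the idea: a pool that delegates to an inner greedy pool and counts bytes per consumer. The `Pool` trait, `GreedyPool`, `MeteredPool` and `ConsumerStats` are hypothetical names for illustration; DataFusion's real `MemoryPool` trait has a different signature, and the code in this PR may be structured differently. The `failed` counter is also an assumption, inferred from the `result="success"` label in the sample output.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;

/// Hypothetical, simplified pool interface (NOT DataFusion's `MemoryPool`).
pub trait Pool {
    fn try_grow(&self, consumer: &str, bytes: usize) -> Result<(), String>;
    fn shrink(&self, consumer: &str, bytes: usize);
    fn reserved(&self) -> usize;
}

/// Greedy pool: grants any request that still fits under a fixed limit.
pub struct GreedyPool {
    limit: usize,
    used: AtomicUsize,
}

impl GreedyPool {
    pub fn new(limit: usize) -> Self {
        Self { limit, used: AtomicUsize::new(0) }
    }
}

impl Pool for GreedyPool {
    fn try_grow(&self, _consumer: &str, bytes: usize) -> Result<(), String> {
        self.used
            .fetch_update(Ordering::Relaxed, Ordering::Relaxed, |used| {
                let new = used.checked_add(bytes)?;
                (new <= self.limit).then_some(new)
            })
            .map(|_| ())
            .map_err(|used| format!("cannot reserve {bytes} bytes ({used} of {} in use)", self.limit))
    }

    fn shrink(&self, _consumer: &str, bytes: usize) {
        self.used.fetch_sub(bytes, Ordering::Relaxed);
    }

    fn reserved(&self) -> usize {
        self.used.load(Ordering::Relaxed)
    }
}

/// Per-consumer byte counters (assumed shape, for illustration only).
#[derive(Default, Debug, Clone, PartialEq)]
pub struct ConsumerStats {
    pub allocated: u64,
    pub failed: u64,
    pub freed: u64,
}

/// Wrapper that forwards every call to `inner` and records the outcome.
pub struct MeteredPool<P: Pool> {
    inner: P,
    stats: Mutex<HashMap<String, ConsumerStats>>,
}

impl<P: Pool> MeteredPool<P> {
    pub fn new(inner: P) -> Self {
        Self { inner, stats: Mutex::new(HashMap::new()) }
    }

    /// Snapshot of the counters for one consumer (zeros if never seen).
    pub fn stats_for(&self, consumer: &str) -> ConsumerStats {
        self.stats.lock().unwrap().get(consumer).cloned().unwrap_or_default()
    }
}

impl<P: Pool> Pool for MeteredPool<P> {
    fn try_grow(&self, consumer: &str, bytes: usize) -> Result<(), String> {
        let result = self.inner.try_grow(consumer, bytes);
        let mut stats = self.stats.lock().unwrap();
        let entry = stats.entry(consumer.to_string()).or_default();
        match &result {
            Ok(()) => entry.allocated += bytes as u64,
            Err(_) => entry.failed += bytes as u64,
        }
        result
    }

    fn shrink(&self, consumer: &str, bytes: usize) {
        self.inner.shrink(consumer, bytes);
        let mut stats = self.stats.lock().unwrap();
        stats.entry(consumer.to_string()).or_default().freed += bytes as u64;
    }

    fn reserved(&self) -> usize {
        self.inner.reserved()
    }
}
```

The sketch uses a plain `Mutex<HashMap>` only to stay dependency-free; a concurrent map such as DashMap would avoid taking one global lock on every allocation.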

Sample output:


# HELP seafowl_datafusion_memory_pool_reserved_bytes_current Current memory reserved in DataFusion's managed memory pool
# TYPE seafowl_datafusion_memory_pool_reserved_bytes_current gauge
seafowl_datafusion_memory_pool_reserved_bytes_current 0
# HELP seafowl_datafusion_memory_pool_freed_bytes_total Memory freed in DataFusion's managed memory pool
# TYPE seafowl_datafusion_memory_pool_freed_bytes_total counter
seafowl_datafusion_memory_pool_freed_bytes_total{consumer="RepartitionExec"} 2033677121
seafowl_datafusion_memory_pool_freed_bytes_total{consumer="NestedLoopJoinLoad"} 15072
seafowl_datafusion_memory_pool_freed_bytes_total{consumer="TopK"} 5951438
seafowl_datafusion_memory_pool_freed_bytes_total{consumer="GroupedHashAggregateStream"} 89032440
seafowl_datafusion_memory_pool_freed_bytes_total{consumer="CrossJoinExec"} 21184
seafowl_datafusion_memory_pool_freed_bytes_total{consumer="HashJoinStream"} 18346
seafowl_datafusion_memory_pool_freed_bytes_total{consumer="AggregateStream"} 3230
seafowl_datafusion_memory_pool_freed_bytes_total{consumer="SortPreservingMergeExec"} 2932891
seafowl_datafusion_memory_pool_freed_bytes_total{consumer="ExternalSorter"} 3268145
seafowl_datafusion_memory_pool_freed_bytes_total{consumer="HashJoinInput"} 112437164
seafowl_datafusion_memory_pool_freed_bytes_total{consumer="ExternalSorterMerge"} 120259084288

# HELP seafowl_datafusion_memory_pool_allocated_bytes_total Memory allocated in DataFusion's managed memory pool
# TYPE seafowl_datafusion_memory_pool_allocated_bytes_total counter
seafowl_datafusion_memory_pool_allocated_bytes_total{consumer="TopK",result="success"} 5951438
seafowl_datafusion_memory_pool_allocated_bytes_total{consumer="SortPreservingMergeExec",result="success"} 2932891
seafowl_datafusion_memory_pool_allocated_bytes_total{consumer="ExternalSorter",result="success"} 3268145
seafowl_datafusion_memory_pool_allocated_bytes_total{consumer="HashJoinStream",result="success"} 18346
seafowl_datafusion_memory_pool_allocated_bytes_total{consumer="ExternalSorterMerge",result="success"} 120259084288
seafowl_datafusion_memory_pool_allocated_bytes_total{consumer="HashJoinInput",result="success"} 112437164
seafowl_datafusion_memory_pool_allocated_bytes_total{consumer="NestedLoopJoinLoad",result="success"} 15072
seafowl_datafusion_memory_pool_allocated_bytes_total{consumer="AggregateStream",result="success"} 3230
seafowl_datafusion_memory_pool_allocated_bytes_total{consumer="GroupedHashAggregateStream",result="success"} 89032440
seafowl_datafusion_memory_pool_allocated_bytes_total{consumer="RepartitionExec",result="success"} 2033677121
seafowl_datafusion_memory_pool_allocated_bytes_total{consumer="CrossJoinExec",result="success"} 21184
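
In the sample above, every consumer's allocated total equals its freed total, which is consistent with the `reserved_bytes_current` gauge reading 0. As a rough sketch of how one might check this from scraped output, the helper below subtracts freed from allocated per consumer. The function names are hypothetical, and the parser is deliberately naive: it assumes label values contain no commas or braces and that sample values are plain integers.

```rust
use std::collections::BTreeMap;

/// Sum the samples of `metric` per `consumer` label in Prometheus text output.
pub fn per_consumer(text: &str, metric: &str) -> BTreeMap<String, u64> {
    let mut totals = BTreeMap::new();
    for line in text.lines() {
        // `# HELP` / `# TYPE` comments and other metrics fail the prefix check.
        let Some(rest) = line.trim().strip_prefix(metric) else { continue };
        let Some(rest) = rest.strip_prefix('{') else { continue };
        let Some((labels, value)) = rest.split_once('}') else { continue };
        let Some(consumer) = labels
            .split(',')
            .find_map(|label| label.trim().strip_prefix("consumer="))
        else {
            continue;
        };
        let Ok(value) = value.trim().parse::<u64>() else { continue };
        *totals.entry(consumer.trim_matches('"').to_string()).or_insert(0) += value;
    }
    totals
}

/// Bytes still held per consumer: allocated minus freed (negative if over-freed).
pub fn outstanding(text: &str) -> BTreeMap<String, i128> {
    let allocated = per_consumer(text, "seafowl_datafusion_memory_pool_allocated_bytes_total");
    let freed = per_consumer(text, "seafowl_datafusion_memory_pool_freed_bytes_total");
    let mut out: BTreeMap<String, i128> = BTreeMap::new();
    for (consumer, bytes) in allocated {
        *out.entry(consumer).or_insert(0) += bytes as i128;
    }
    for (consumer, bytes) in freed {
        *out.entry(consumer).or_insert(0) -= bytes as i128;
    }
    out
}
```

A nonzero entry in the result would point at a consumer that is still holding memory (or at a reservation that was released without being counted) at scrape time.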