BlazingDB / blazingsql

BlazingSQL is a lightweight, GPU accelerated, SQL engine for Python. Built on RAPIDS cuDF.
https://blazingsql.com
Apache License 2.0

Revamping Loggers #1297

Closed wmalpica closed 3 years ago

wmalpica commented 3 years ago

The following is a summary of all the logs we have, what they log, and what their deficiencies are:

input_comms/output_comms

queries_logger

kernels_logger

kernels_edges_logger

events_logger

cache_events_logger

batch_logger

Information we would like to be able to know

PROPOSED ACTIONS

task_logger

The task logger would log tasks executed by the executor. It should only need to be called in task::run(...):

Under this alternate proposal, we would be able to analyze time spent in all processing, caching (via cache_events_logger), and decaching (via task_logger). We would not, however, be able to explicitly track each batch as it goes in and out, including where it comes from and where it goes. That information could be useful in a very detailed debugging analysis. Right now we have a little more of that information, but not by much.
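To make the task_logger idea concrete, here is a minimal sketch of timing decaching and processing separately and emitting a single record per task. Everything in it is an assumption for illustration: `TaskLogRecord`, `TaskLogger`, `elapsed_ms`, and `run_task` are hypothetical names, the record columns are guesses, and the in-memory sink stands in for whatever logging backend the engine actually uses. It is not the real task::run(...).

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical record layout; the issue does not specify the columns.
struct TaskLogRecord {
    std::int64_t query_id;
    std::int64_t kernel_id;
    std::int64_t task_id;
    std::int64_t decache_ms;  // time spent pulling inputs out of the caches
    std::int64_t execute_ms;  // time spent in the actual processing
    bool success;
};

// Stand-in sink: collects records in memory instead of writing a log file.
class TaskLogger {
public:
    void log(const TaskLogRecord& record) { records_.push_back(record); }
    const std::vector<TaskLogRecord>& records() const { return records_; }

private:
    std::vector<TaskLogRecord> records_;
};

// Runs a callable and returns how long it took, in milliseconds.
inline std::int64_t elapsed_ms(const std::function<void()>& fn) {
    const auto start = std::chrono::steady_clock::now();
    fn();
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count();
}

// Hypothetical stand-in for task::run(...): the only place that logs.
// Exactly one record is emitted per task, whether it succeeds or throws.
inline bool run_task(TaskLogger& logger,
                     std::int64_t query_id,
                     std::int64_t kernel_id,
                     std::int64_t task_id,
                     const std::function<void()>& decache,
                     const std::function<void()>& execute) {
    assert(decache && execute);  // both steps must be callable
    TaskLogRecord record{query_id, kernel_id, task_id, 0, 0, false};
    try {
        record.decache_ms = elapsed_ms(decache);
        record.execute_ms = elapsed_ms(execute);
        record.success = true;
    } catch (...) {
        // Swallowed only to keep the sketch short; the failed step keeps
        // its 0 ms default and success stays false.
    }
    logger.log(record);
    return record.success;
}
```

Because the record is written in one place, the per-task decache and execute times can be summed per kernel or per query afterwards, which is the analysis described above. What this layout cannot answer is which cache a given batch came from or went to.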

aucahuasi commented 3 years ago

https://github.com/BlazingDB/blazingsql/issues/1305

aucahuasi commented 3 years ago

https://github.com/BlazingDB/blazingsql/issues/1306

wmalpica commented 3 years ago

Regarding:

Improve cache_events_logger by making sure all caching events are logged (adding to cacheMachines), as well as when we downgrade data (MemoryMonitor). I don't think we need to log when we pull out of cacheMachines.

Downgrading data is currently captured by batch_logger in CacheMachine.cpp, in the function downgradeCacheData. We don't want batch_logger to capture that anymore; we want cache_events_logger to capture it somehow.
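One possible shape for that "somehow", offered only as a sketch: give cache events a kind field, so that adding to a cache and downgrading data go through the same logger. All names here (`CacheTier`, `CacheEventKind`, `CacheEvent`, `CacheEventsLogger`, `CachedBatch`, `add_to_cache`, `downgrade`) are hypothetical, the three-tier GPU/CPU/disk ordering is an assumption, and the in-memory sink is a placeholder for the real logger. The `downgrade` function only mimics the role of downgradeCacheData; it is not the code in CacheMachine.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Assumed memory tiers, ordered from fastest to slowest.
enum class CacheTier { Gpu, Cpu, Disk };

// Only the two events the issue wants logged; pulling data out is not logged.
enum class CacheEventKind { Add, Downgrade };

struct CacheEvent {
    CacheEventKind kind;
    std::string cache_id;
    CacheTier from;  // for Add events this equals `to`
    CacheTier to;
    std::size_t num_bytes;
};

// Stand-in sink: collects events in memory instead of writing a log file.
class CacheEventsLogger {
public:
    void log(const CacheEvent& event) { events_.push_back(event); }
    const std::vector<CacheEvent>& events() const { return events_; }

private:
    std::vector<CacheEvent> events_;
};

struct CachedBatch {
    CacheTier tier;
    std::size_t num_bytes;
};

// Returns the next slower tier. Precondition: `tier` is not already Disk.
inline CacheTier lower_tier(CacheTier tier) {
    assert(tier != CacheTier::Disk);
    return tier == CacheTier::Gpu ? CacheTier::Cpu : CacheTier::Disk;
}

// Logs every addition to a cache.
inline void add_to_cache(CacheEventsLogger& logger,
                         const std::string& cache_id,
                         const CachedBatch& batch) {
    logger.log({CacheEventKind::Add, cache_id, batch.tier, batch.tier, batch.num_bytes});
}

// Moves a batch down one tier and logs it as a cache event.
// Returns false, and logs nothing, if the batch is already on the lowest tier.
inline bool downgrade(CacheEventsLogger& logger,
                      const std::string& cache_id,
                      CachedBatch& batch) {
    if (batch.tier == CacheTier::Disk) {
        return false;
    }
    const CacheTier from = batch.tier;
    batch.tier = lower_tier(from);
    logger.log({CacheEventKind::Downgrade, cache_id, from, batch.tier, batch.num_bytes});
    return true;
}
```

With a layout like this, batch_logger would no longer need to know about downgrades at all, and a single query over the cache events log would show both how much data entered each cache and how much was pushed to a slower tier.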