apache / datafusion

Apache DataFusion SQL Query Engine
https://datafusion.apache.org/
Apache License 2.0

Add Caller Context to the ObjectStore Trait for tracing and diagnostic #3020

Open mingmwang opened 1 year ago

mingmwang commented 1 year ago

Is your feature request related to a problem or challenge?

Some distributed storage systems, such as HDFS, support API tracing. The tracing information is written to the server-side audit log and helps users diagnose and understand the interactions between systems.

https://community.cloudera.com/t5/Community-Articles/Providing-tracing-with-Spark-Caller-Context/ta-p/245806

When we run Ballista/DataFusion on HDFS, it would be good to provide the same capability. Basically, the caller context records who (app/job/stage/task) is calling the ObjectStore, and that information is passed through from the compute engine to the storage engine.
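
To make the request concrete, here is a minimal sketch of what such a caller context could look like. The `CallerContext` struct, its field names, and the audit-string format are all assumptions made for illustration; none of them is an existing DataFusion, Ballista, or HDFS API.

```rust
/// Hypothetical caller context (illustrative only, not an existing API).
/// It records which app/job/stage/task issued a storage request.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct CallerContext {
    pub app_id: String,
    pub job_id: String,
    pub stage_id: u32,
    pub task_id: u32,
}

impl CallerContext {
    /// Render the context as one line that a storage system could
    /// write to its audit log. The format is an assumption.
    pub fn to_audit_string(&self) -> String {
        format!(
            "app={}/job={}/stage={}/task={}",
            self.app_id, self.job_id, self.stage_id, self.task_id
        )
    }
}

fn main() {
    let ctx = CallerContext {
        app_id: "ballista".to_string(),
        job_id: "j1".to_string(),
        stage_id: 2,
        task_id: 7,
    };
    println!("{}", ctx.to_audit_string());
}
```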

alamb commented 1 year ago

@crepererum did something similar for tracing in https://github.com/apache/arrow-datafusion/pull/2940 which might be valuable

Also, wrapping a `dyn ObjectStore` in another struct that itself implements `ObjectStore` could be interesting here: https://github.com/apache/arrow-datafusion/issues/3019#issuecomment-1203932916
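
A rough sketch of the wrapping idea follows. To stay self-contained it uses `SimpleStore`, a simplified synchronous stand-in trait invented for this example; the real `ObjectStore` trait is async and has many more methods. `InMemoryStore`, `ContextStore`, and the audit-log format are likewise hypothetical.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Simplified, synchronous stand-in for an object store trait (assumption).
pub trait SimpleStore: Send + Sync {
    fn get(&self, path: &str) -> Option<Vec<u8>>;
}

/// Trivial backing store used only for the demonstration.
pub struct InMemoryStore {
    pub objects: HashMap<String, Vec<u8>>,
}

impl SimpleStore for InMemoryStore {
    fn get(&self, path: &str) -> Option<Vec<u8>> {
        self.objects.get(path).cloned()
    }
}

/// Hypothetical wrapper: holds a caller description and records every
/// request before delegating to the wrapped store.
pub struct ContextStore {
    inner: Arc<dyn SimpleStore>,
    caller: String,
    audit_log: Mutex<Vec<String>>,
}

impl ContextStore {
    pub fn new(inner: Arc<dyn SimpleStore>, caller: impl Into<String>) -> Self {
        Self {
            inner,
            caller: caller.into(),
            audit_log: Mutex::new(Vec::new()),
        }
    }

    /// Snapshot of the recorded audit lines.
    pub fn audit_log(&self) -> Vec<String> {
        self.audit_log.lock().unwrap().clone()
    }
}

impl SimpleStore for ContextStore {
    fn get(&self, path: &str) -> Option<Vec<u8>> {
        // Record who asked for what, then forward the call unchanged.
        self.audit_log
            .lock()
            .unwrap()
            .push(format!("{} GET {}", self.caller, path));
        self.inner.get(path)
    }
}

/// Build a wrapped store holding a single object.
pub fn demo_store() -> ContextStore {
    let mut objects = HashMap::new();
    objects.insert("data/a.parquet".to_string(), vec![1u8, 2, 3]);
    ContextStore::new(Arc::new(InMemoryStore { objects }), "job=j1/task=7")
}

fn main() {
    let store = demo_store();
    let bytes = store.get("data/a.parquet");
    println!("{:?} {:?}", bytes, store.audit_log());
}
```

One limitation of this shape is that the context is fixed when the wrapper is built, so a separate wrapper instance would be needed per job or task.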

crepererum commented 1 year ago

I'm actually facing a similar issue with IOx. I think the `ObjectStore` methods should receive an additional parameter `ext: &Extensions`, which can carry opaque extensions around (similar to #2940). DataFusion can then use this to pass the current context to the object store, which can then extract the tracing information.
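
The `ext: &Extensions` idea could be sketched roughly as below. This `Extensions` is a small type-keyed map written for the example and is not claimed to match the type from #2940; `SimpleStore`, `LoggingStore`, and `TraceInfo` are also hypothetical names.

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;
use std::sync::Arc;

/// Minimal type-keyed map of opaque values (an illustrative assumption).
#[derive(Default)]
pub struct Extensions {
    map: HashMap<TypeId, Arc<dyn Any + Send + Sync>>,
}

impl Extensions {
    /// Store one value per concrete type, replacing any previous one.
    pub fn insert<T: Any + Send + Sync>(&mut self, value: T) {
        self.map.insert(TypeId::of::<T>(), Arc::new(value));
    }

    /// Look a value up by its concrete type.
    pub fn get<T: Any + Send + Sync>(&self) -> Option<&T> {
        self.map
            .get(&TypeId::of::<T>())
            .and_then(|v| (**v).downcast_ref::<T>())
    }
}

/// Hypothetical tracing payload supplied by the compute engine.
pub struct TraceInfo {
    pub job_id: String,
    pub task_id: u32,
}

/// Simplified stand-in trait whose method takes the extra `ext` parameter.
pub trait SimpleStore {
    fn get(&self, path: &str, ext: &Extensions) -> String;
}

pub struct LoggingStore;

impl SimpleStore for LoggingStore {
    /// Returns the audit line it would emit; a real store would also
    /// fetch the object.
    fn get(&self, path: &str, ext: &Extensions) -> String {
        match ext.get::<TraceInfo>() {
            Some(t) => format!("GET {} job={} task={}", path, t.job_id, t.task_id),
            None => format!("GET {} (no caller context)", path),
        }
    }
}

fn main() {
    let mut ext = Extensions::default();
    ext.insert(TraceInfo { job_id: "j1".to_string(), task_id: 7 });
    println!("{}", LoggingStore.get("a.parquet", &ext));
}
```

Because the store looks values up by type, it can ignore extensions it does not understand, and callers can attach several unrelated payloads to one request.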

mingmwang commented 1 year ago

@crepererum I think maybe we just need to add a string parameter, `ext: Option<String>`. Users of the ObjectStore are free to encode any tracing info as a string and pass it through to the storage system.
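
For comparison, the string-based variant might look like the sketch below. The `key=value;key=value` encoding and the helpers `encode_context` and `decode_context` are assumptions chosen for illustration; the proposal itself leaves the encoding entirely to the caller.

```rust
/// Encode tracing info as a string (the format is an assumption).
pub fn encode_context(job_id: &str, task_id: u32) -> String {
    format!("job={};task={}", job_id, task_id)
}

/// Parse the string back on the storage side.
/// Returns `None` if the value is absent or malformed.
pub fn decode_context(ext: Option<&str>) -> Option<(String, u32)> {
    let s = ext?;
    let mut job = None;
    let mut task = None;
    for pair in s.split(';') {
        let (key, value) = pair.split_once('=')?;
        match key {
            "job" => job = Some(value.to_string()),
            "task" => task = value.parse::<u32>().ok(),
            _ => {} // Unknown keys are ignored.
        }
    }
    Some((job?, task?))
}

fn main() {
    let ext: Option<String> = Some(encode_context("j1", 7));
    println!("{:?}", decode_context(ext.as_deref()));
}
```

Both sides have to agree on the encoding out of band, and every value has to survive a round trip through text.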

crepererum commented 1 year ago

Some objects cannot be encoded as strings, so users would then build their own registry objects (probably some semi-global hashmap), which is worse. That's the whole reason I added the extension system in #2940: I wanted to avoid that.
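
The workaround described here can be sketched as follows, using a shared atomic counter as one example of a live object that has no useful string form. Everything in the block (`SharedCounter`, `register`, `unregister`, `store_get`) is hypothetical and exists only to illustrate the registry pattern being warned against.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex, OnceLock};

/// A live, shared object that cannot be meaningfully encoded as a string.
pub type SharedCounter = Arc<AtomicUsize>;

/// The semi-global hashmap: string keys map to the real objects.
fn registry() -> &'static Mutex<HashMap<String, SharedCounter>> {
    static REGISTRY: OnceLock<Mutex<HashMap<String, SharedCounter>>> = OnceLock::new();
    REGISTRY.get_or_init(|| Mutex::new(HashMap::new()))
}

/// The caller must register the object before issuing the request...
pub fn register(key: &str, counter: SharedCounter) {
    registry().lock().unwrap().insert(key.to_string(), counter);
}

/// ...and must remember to remove it afterwards, or the entry leaks.
pub fn unregister(key: &str) {
    registry().lock().unwrap().remove(key);
}

/// Store side: it only receives a string, so it has to consult the
/// global registry to find the object. Returns whether a counter was
/// found and incremented.
pub fn store_get(_path: &str, ext: Option<&str>) -> bool {
    let counter = ext.and_then(|key| registry().lock().unwrap().get(key).cloned());
    match counter {
        Some(c) => {
            c.fetch_add(1, Ordering::SeqCst);
            true
        }
        None => false,
    }
}

fn main() {
    let counter: SharedCounter = Arc::new(AtomicUsize::new(0));
    register("task-7", counter.clone());
    let found = store_get("a.parquet", Some("task-7"));
    unregister("task-7");
    println!("{} {}", found, counter.load(Ordering::SeqCst));
}
```

The hidden global state, the manual cleanup, and the silent failure on a stale key are the kinds of costs that passing the object directly through a typed extension would avoid.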