slashmo / gsoc-swift-baggage-context

BaggageContext library, extracted from slashmo/gsoc-swift-tracing
https://github.com/slashmo/gsoc-swift-tracing
Apache License 2.0
4 stars 2 forks source link

Design composable Carrier protocols #22

Closed ktoso closed 3 years ago

ktoso commented 3 years ago

Library authors expressed the wish to use composable Carrier protocol types, so let's explore this.

It is a way to express that a context "has to" provide some values, it is most notably used today for:

We need to be careful with those however...


Libraries SHOULD only expect shared standard carrier types. Libraries MUST NOT invent their own carrier protocols,

typealias MyLibContext = EventLoopPreferenceCarrier | LoggerCarrier | BaggageCarrier func get(..., context: MyLibContext)

is ok, because any framework or other library is able to pass its own context:

let context: Lambda.Context
get(..., context: context)

while the following is NOT:

struct MyContext: EventLoopPreferenceCarrier | LoggerCarrier | BaggageCarrier func get(..., context: MyLibContext)

because it is a real type and thus it is not possible to:

let context: Lambda.Context
get(...: context: oh no)

More notes to be done here...

Please note to NOT USE carriers as general throw everything in there. To be honest I'm also really in love with about the event loop being in the context, as it can absolutely be passed explicitly. Let's discuss this though. There should NOT be more protocols for carriers then a few "really core" things. EventLoops today are everywhere, thus we say it may be okey to specialize those with a carrier as well but we highly suspect those may be going away once a swift concurrency story would unfold?


Frameworks MAY define their own context types when they hand them out to users, but should do their best to accept the widely generic carriers for purposes of tracing and contex propagation without incuring too much noise on end users.

Frameworks SHOULD accept the most general carrier type possible for APIs, specifically for carrying contexts across boundaries -- e.g. "making requests" and IPC calls etc. Otherwise say when "framework specific reasons" are needed then of course passing the framework context is okey.


TODO: this needs more notes and is a rough outcome of a sync up we had with @tanner0101, @tomerd

tanner0101 commented 3 years ago

Libraries SHOULD only expect shared standard carrier types. Libraries MUST NOT invent their own carrier protocols,

Agreed. For completeness, here's an example of this pattern that compiles:

protocol BaggageCarrier { /* ... */ }
protocol LoggingCarrier { /* ... */ }

typealias MyContext = BaggageCarrier & LoggingCarrier

Please note to NOT USE carriers as general throw everything in there. To be honest I'm also really in love with about the event loop being in the context, as it can absolutely be passed explicitly. Let's discuss this though.

I agree EventLoop preference is not a perfect fit here as it's not one of the pillars of observability. But I think pragmatically it makes a lot of sense. EL pref, like logger and baggage, is not directly related to "normal" method arguments and needs to be passed to every function. So they have a lot in common.

Frameworks MAY define their own context types when they hand them out to users, but should do their best to accept the widely generic carriers for purposes of tracing and contex propagation without incuring too much noise on end users.

This is an interesting point to consider. For reference, Vapor will likely have two (base) context implementations:

(Other packages may declare new contexts, like vapor/queues with its QueueContext.)

There will definitely be methods that want, for example, a request in particular. Take FileIO.streamFile(at:). This accesses the request's headers to determine whether the requested file is cached. It also needs context for logging and EventLoop preference (and maybe tracing in the future?).

Would something like this accept two separate parameters:

app.get("file") { req in
    streamFile(at: "/path/to/file", for: req, context: req)
}

Or just assume request and context are one:

app.get("file") { req in
    streamFile(at: "/path/to/file", for: req)
}

Maybe the best thing here would be two overloads? One where Request is both the context and the provider of HTTP headers and one where these are separate.

For now, Vapor is using the protocol-based context pattern here anyway (req.fileio.streamFile(at:) is the current API), but I think other frameworks will run into a similar issue.

Mordil commented 3 years ago

I'm still a little confused as to what baggage context even is as a concept. As a library author, to me, "baggage context" is the event loop, the logger, the trace ID, etc. It's all baggage information about the runtime execution context of this method being invoked that I want to pass around (or in some cases use, like the logger)

As I understand it right now, I'd need a few parameters when supporting both logging and tracing (which may or may not be conformed to by a single type/instance, but the method can't express that: foo(/* params */, loggingContext: LoggingBaggageCarrier, tracingContext: TracingBaggageCarrier

Why can't that be expressed as a single baggageContextCarrier?


Primarily what I'm getting at is that I don't like the idea of "opt-out by method call" to certain context providing, such as logger or tracer. I'm in favor of "opt-in" by providing the values in a single context.

I, as an author, say: I accept a "baggageContext" parameter. If you provide a logger, or a tracer, etc. I will use it and provide you information. If you don't, then I will either not provide any data or will use my own.

tomerd commented 3 years ago

@Mordil I suspect the confusion may be related to the naming - see https://github.com/slashmo/gsoc-swift-baggage-context/issues/23

Here is a summary of some of the discussion @ktoso, @tanner0101 and myself had the other day and what we want to explore in this thread

From a pure tracing perspective, the thing we care about the most is the baggage which is a "glorified dictionary" with the tracing information (traceId, parentSpanId, spanId, etc). This is now named BaggageContext which imho is confusing.

Then come the carriers. Those are closer to what most of us think of as context. In practice they are marker protocols that are designed to be used by libraries APIs to define what they need as context in a composable way. For example, instead of designing an API like

func foo(logger: Logger, baggage: Baggage, bar: String)

a library author can design the API to be

func foo(context: MyContext, bar: String)

having MyContext a typealias to BaggageCarrier & LoggingCarrier as defined above

This allows for composition - at the callsite users can pass context from one library as the context to another library - as long as they are all using one of known Carrier protocols

orthogonal to tracing, but fairly closely related from an API design point of view is the question of EventLoop carrier. Since many libraries need to also know the EventLoop preference (to avoid context switching from the server processing thread), it begs that EventLoop is also part of the Carrier pattern. For example,

func foo(eventLoop: EventLoop, logger: Logger, baggage: Baggage, bar: String)

a library author can design the API to be

func foo(context: MyContext, bar: String)

having MyContext a typealias to BaggageCarrier & LoggingCarrier & EventLoopCarrier as defined above

Mordil commented 3 years ago

@tomerd Ah yes. This is true and I hadn't had a chance to read that thread yet. (I read this one first as it was mentioned first)

You've summarized well what I'm beginning to understand, but what I'm trying to express is more of a basic dictionary of types where you provide a generic context object that exposes it, then in my library I ask if it has a Logger.self and Tracer.self object that I can use.

Having MyContext = BaggageCarrier & LoggingCarrier & EventLoopCarrier doesn't allow me to express the opt-in feature, it requires each of those to be conformed to the type the user provides... unless I split them into separate parameters or with overloaded methods

Example of what I'm sort of hoping for:

func doSomething(with param: String, context: Context) {
  let log = context[Logger.self] ?? myDefaultLogger
  let span = context[TracerSpan.self]
  log.trace("starting \(#function)")
  span?.start()
  defer { span?.stop() }
}
tomerd commented 3 years ago

@Mordil can you elaborate what you mean by

Having MyContext = BaggageCarrier & LoggingCarrier & EventLoopCarrier doesn't allow me to express the opt-in feature, it requires each of those to be conformed to the type the user provides... unless I split them into separate parameters or with overloaded methods

Mordil commented 3 years ago

@tomerd @ktoso I apologize for the really long delay in responding.

After looking at the final API proposal, this is approaching very closely to what I was desiring but I never fully articulated it.

Essentially what I was hoping for (and we basically have, at least in the Baggage case) is an Inversion of Control value container, as I feel this gives the most composeability.

This would allow call sites in framework agnostic libraries, such as RediStack:

func send(command: String, with args: [RESPValue], context: Context) -> EventLoopFuture<RESPValue> {
  if let logger = context[LoggerContextKey] { logger.trace("starting \(#function)") }

  // other stuff to get ready to write
  let promise = /* ... */
  let future = self.channel.write(command).map { promise.futureResult }

  if let eventLoop = context[EventLoopContextKey.self] {
    return future.hop(to: eventLoop)
  } else {
    return future
  }
}

This would leave individual packages up to define their own protocols such as LoggerContextCarrier or BaggageContextCarrier that have the basic requirement of:

protocol <Value>ContextCarrier: Context {
    var <value>: <Value> { get }
}

as well as their own key:

public enum <Value>ContextKey: ContextKey {
    typealias Value = <ValueType>
}

then, packages that choose to be context aware can compose and use these adhoc, as I mentioned above or constrain their requirements in a composeable fashion based on their own dependencies with the <Value>Carrier, rather than ones forced by this package.

Such as

func doSomething(context: EventLoopContextCarrier & LoggerContextCarrier) {
}

Frameworks can then provide a more concrete implementation:

struct RequestContext: Context, LoggerContextCarrier, EventLoopContextCarrier, BaggageContextCarrier {
    var baggage: Baggage { /* access storage to get baggage */ }
    var logger: Logger { /**/ }
    var eventLoop: EventLoop
}

and anyone that is framework aware could instead rely on RequestContext as a concrete type with all of those properties, rather than Context.

Mordil commented 3 years ago

@ktoso @tomerd Here is a proof of concept of what I was going for: https://github.com/ktoso/gsoc-swift-baggage-context/pull/1

Mordil commented 3 years ago

After a review from that proof of concept, I'm withdrawing my desire and tailoring my expectations accordingly.