onsi / ginkgo

A Modern Testing Framework for Go
http://onsi.github.io/ginkgo/
MIT License
8.09k stars 643 forks source link

Proposal: RunSpecsWithContext? #1366

Closed maxcleme closed 4 months ago

maxcleme commented 4 months ago

Hi!

We are currently starting end-to-end (e2e) tests programmatically using RunSpecs. While this works great, we were wondering if introducing something like RunSpecsWithContext would also make sense? It's a very common Go idiom, right? :D

Right now, we have no way to dynamically cancel the operation (we can define a timeout at the suite level, but that's it). This is why having RunSpecsWithContext would have been handy. Use cases like starting the suite from an HTTP call don't really play nice without being able to set the parent context. There is also the case concerning OpenTelemetry traces; it would be nice to be able to create a root span and nest all the tests under it.

I'd like to gauge interest in the idea before starting to hack something. I would assume the work wouldn't be really hard, just requiring many signature changes.

onsi commented 4 months ago

@maxcleme this is interesting - it sounds like you are invoking a Ginkgo suite outside of the context of a go test suite or a cli invocation (e.g. go test or ginkgo). Can you share a bit more about how you're doing that? Some Ginkgo features assume the CLI is playing a coordinating role (e.g. parallel specs). How does that work for you?

I wanted to put a Ginkgo suite on the other end of an HTTP interface I would invoke the ginkgo cli in my HTTP handler with appropriate configuration and interrupt the process when the context gets cancelled. How are you approaching it?

Also - I'm not sure why there would be many signature changes. Various Ginkgo nodes can accept a context today and Ginkgo will automatically generate one for you and control the timeout at the level of the node and spec. Since go tests (and, therefore Ginkgo tests) run from the command line these contexts can be cancelled by sending SIGINT. Lots of detail here: https://onsi.github.io/ginkgo/#spec-timeouts-and-interruptible-nodes please take a look and let me know how you'd see RunSpecsWithContext interacting with the behavior outlined in the docs at that link.

maxcleme commented 4 months ago

@maxcleme this is interesting - it sounds like you are invoking a Ginkgo suite outside of the context of a go test suite or a cli invocation (e.g. go test or ginkgo). Can you share a bit more about how you're doing that? Some Ginkgo features assume the CLI is playing a coordinating role (e.g. parallel specs). How does that work for you?

Indeed, we are operating within go test (compiling everything and then executing the binary independently, though the distinction might not be significant). Before invoking RunSpecs, we execute additional tasks and are exploring options to programmatically halt execution from here. Consequently, we're considering supplying a single parent context to Ginkgo, which would then generate all subsequent contexts from this initial one instead of using context.Background.

I wanted to put a Ginkgo suite on the other end of an HTTP interface I would invoke the ginkgo cli in my HTTP handler with appropriate configuration and interrupt the process when the context gets cancelled. How are you approaching it?

Given that we're adjusting the configuration programmatically and employing a custom GomegaFailHandler, directly utilising the ginkgo CLI might not be feasible. Additionally, this approach wouldn't allow the propagation of context values, such as capturing OpenTelemetry traces.

Also - I'm not sure why there would be many signature changes. Various Ginkgo nodes can accept a context today and Ginkgo will automatically generate one for you and control the timeout at the level of the node and spec. Since go tests (and, therefore Ginkgo tests) run from the command line these contexts can be cancelled by sending SIGINT. Lots of detail here: https://onsi.github.io/ginkgo/#spec-timeouts-and-interruptible-nodes please take a look and let me know how you'd see RunSpecsWithContext interacting with the behavior outlined in the docs at that link.

When a context passed to RunSpecWithContext is terminated or cancelled, I expect that the resulting behavior should mark the nodes as 'Interrupted', in a manner similar to what happens when the suite's configured timeout is reached. Importantly, for any node defined within the suite, such as It("something", func(ctx context.Context){}), I anticipate that the ctx provided to each node would be a child context derived from the initial context supplied to RunSpecWithContext. This approach is crucial for maintaining the continuity of context properties, ensuring that we do not default to context.Background and lose the propagation of context-specific data, such as tracing information.

onsi commented 4 months ago

Thanks for the additional detail. I'd like to keep exploring the setup with you before we land on what the best solution might be:

Before invoking RunSpecs, we execute additional tasks and are exploring options to programmatically halt execution from here.

Ginkgo supports this sort of setup via BeforeSuite and - crucially - SynchronizedBeforeSuite for projects that run in parallel. Are there reasons you can't perform the setup within a *BeforeSuite?

...programmatically halt execution from here.

Simpy failing in a BeforeSuite would do this for you.

directly utilising the ginkgo CLI might not be feasible

If you can run it with go test you can run it with ginkgo - so I'm wondering if I'm misunderstanding your setup.

The context-propagation behavior you are describing makes sense to me. My only concern is with putting it in RunSpecsWtihContext as that pattern would promote putting set up around RunSpecs instead of using Ginkgo's (parallel-specs-friendly and integreated/managed) mechanisms for setup and teardown.

maxcleme commented 4 months ago

Thanks for taking the time to follow-up on this!

Ginkgo supports this sort of setup via BeforeSuite and - crucially - SynchronizedBeforeSuite for projects that run in parallel. Are there reasons you can't perform the setup within a *BeforeSuite?

We could indeed perform all the setup in the BeforeSuite; it would require us to store some states in global variables instead of relying on the natural closure scope (because we reuse some stuff created at setup in AfterSuite/ReportAfterSuite nodes), but yes, I admit we could do it, no real blockers here.

Simpy failing in a BeforeSuite would do this for you.

Yes of course, but this is only working if we want to abort between the start of the program and the BeforeSuite, once the BeforeSuite is done we can't abort the suite from the entrypoint (the function calling RunSpecs). Or we could have a dummy BeforeEach that looks somewhere if we got aborted and Skip accordingly but that's seems far-stretched to me. Also, it won't properly handle the ginkgo grace period as the node won't be considered interrupted.

If you can run it with go test you can run it with ginkgo - so I'm wondering if I'm misunderstanding your setup.

Maybe we could then! But it would require us to install ginkgo cli on our CI nodes instead of only being able to execute one standalone binary. But that's side-tracking, wherever we use go test, ginkgo or something compiled already doesn't really solve the need, right?

I assume one way of solving it would be to have a bootstrap piece that called our compiled test binary, while this would handle cancellation (by sending signal to the sub-process), it won't handle context values propagation for tracing.

The context-propagation behavior you are describing makes sense to me. My only concern is with putting it in RunSpecsWtihContext as that pattern would promote putting set up around RunSpecs instead of using Ginkgo's (parallel-specs-friendly and integreated/managed) mechanisms for setup and teardown.

Yeah, I get that point, but then what would be the right place to define one "parent" context somehow? From my point of view, it has to be somehow defined "outside" otherwise there is no way we could derive everything from it. Even if we manage to define it at the soonest with a new primitive (eg, in BeforeSuite); we would still need to create the context of that BeforeSuite with something else (like right now, context.Background()). Looks like chicken-egg problem if we try to define it post-RunSpecs.


In summary, we could refactor our codebase to follow the guideline you're mentioning here, but it won't really solve the context values propagation issue (eg: tracing). Maybe we should focus the discussion around that aspect instead? If we can solve it then it also makes the other issue way easier to solve as well. Wdyt?

onsi commented 4 months ago

hey @maxcleme - thanks for engaging with my questions! I hope I'm not coming across as resistant - just want to figure out what the actual underlying problem you are trying to solve is and to explore how Ginkgo-as-is can help you before we decide we need to add new things.

I agree we should focus on the tracing piece. I'll do that down below. First a couple more comments:

it would require us to store some states in global variables instead of relying on the natural closure scope (because we reuse some stuff created at setup in AfterSuite/ReportAfterSuite nodes)

yes, that's a thing for sure - though the stuff in AfterSuite would not need to be pulled out as you can simply use DeferCleanup in the BeforeSuite block.

but this is only working if we want to abort between the start of the program and the BeforeSuite, once the BeforeSuite is done we can't abort the suite from the entrypoint

I'm finding this confusing and it may be pointing to something about your usecase that I don't understand yet. It sounds like you are saying there's some resource or process external to the suite that needs to be able to cancel the suite at any point in time. But that you can't cancel the suite using SIGINT. What is that external something? Why is there something external to the test that determines whether or not the test should abort? I often find that I express those things internal to the test. For example, if I'm relying on a database then my tests will naturally fail if the db goes down or enters a bad state. If I want the suite to abort if a known external condition occurs (e.g. the cluster I'm testing with has crashed and disappeared) i'll often add a check in BeforeEach that will call AbortSuite() if needed)


OK let's dig into tracing. Can you share how your tracing library works? When do you make contexts and spans. What is their hierarchy and how do you think about it wrt Ginkgo's spec hierarchy? What does interruption/cancellation look like and what triggers it? What are you trying to do with the trace data (e.g. debug a test failure when it occurs? aggregate performance information tagged by spec to look for patterns?)

maxcleme commented 4 months ago

OK let's dig into tracing. Can you share how your tracing library works? When do you make contexts and spans. What is their hierarchy and how do you think about it wrt Ginkgo's spec hierarchy? What does interruption/cancellation look like and what triggers it? What are you trying to do with the trace data (e.g. debug a test failure when it occurs? aggregate performance information tagged by spec to look for patterns?)

We utilize go.opentelemetry.io/otel for tracing. Spans are declared as follows:

func operation(ctx context.Context) {
    var span trace.Span
    ctx, span = tracer.Start(ctx, "operation") // Depending on ctx, this will either create a root-level span or nest it under the current one.
    defer span.End()
    // ...
}

We can generate a span for each test by wrapping ginkgo.It, here is simplified version:

func It(text string, f func(ctx context.Context)) bool {
    return g.It(text, function(ctx context.Context) {
        ctx, span := tracer.Start(ctx, CurrentSpecReport().FullText())
        defer span.End()
        f(ctx)
    })
}

This approach works to some extent; however, all the tests appear as top-level spans because they all use context.Background() as their parent. Ideally, we would like them to be nested under the same root span, as illustrated below:

- Test Suite X (root span)
  - BeforeSuite
  - Test A
    - BeforeEach
    - Test A
      - HTTP GET https://foo
      - HTTP GET https://bar
    - AfterEach
  - Test B
    - BeforeEach
    - Test B
      - HTTP GET https://fizz
      - HTTP GET https://buzz
    - AfterEach
  - AfterSuite

To achieve this hierarchical structure, we need to create a span before executing anything related to Ginkgo and then ensure Ginkgo uses this specific context (the one returned by tracer.Start()) as the parent for everything.

We aim to use these traces to provide a visual understanding of what a test is performing. Additionally, since our product supports traces, we will also capture details under each HTTP request automatically, as long as we use the context provided in the It node of course.

onsi commented 4 months ago

hey - cool to see you wrapping the DSL. I think that works well for use-cases like this if you don’t want to repeat yourself. Are you only instrumenting the Its or do you also instrument the BeforeEach nodes, etc.?

A way to accomplish what you’re doing here (and you can imagine variations of this):

var traceSpec func(context.Context, string, …tracer.SpanStartOptions) (context.Context, tracer.Span)

BeforeSuite(func(bsCtx context.Context) { //or SynchronizedBeforeSuite if you have some shared setup and want to support parallel specs
    tracer := otl.NewTracer() //or whatever
    rootCtx, rootSpan := tracer.Start(context.Background(), “Test Suite”)
    DeferCleanup(rootSpan.End)

    traceSpec = func(ctx context.Context, spanDescription string, opts …tracer.SpanStartOptions) (context.Context, tracer.Span) {
       ctx = tracer.ContextWithSpan(ctx, rootSpan) //attach the rootspan to context
       return tracer.Start(ctx, spanDescription, opts…) //and now spawn a new span off of context
   } 

   ctx, span := traceSpec(bsCtx, “Before Suite”)
   defer span.End()
   //… do rest of BeforeSuite setup with ctx to get it traced
})

func It(text string, f func(ctx context.Context)) bool {
    return g.It(text, function(ctx context.Context) {
        ctx, span := traceSpec(ctx, CurrentSpecReport().FullText())
        defer span.End()
        f(ctx)
    })
}

this will give you a span for the It:

- Test Suite X (root span)
  - BeforeSuite
  - Test A It
      - HTTP GET https://foo
      - HTTP GET https://bar
  - Test B It
      - HTTP GET https://fizz
      - HTTP GET https://buzz

if you added similar wrappers to BeforeEach, DeferCleanup, and AfterEach you would get:

- Test Suite X (root span)
  - BeforeSuite
  - BeforeEach
  - Test A It
      - HTTP GET https://foo
      - HTTP GET https://bar
  - AfterEach
  - BeforeEach
  - Test B It
      - HTTP GET https://fizz
      - HTTP GET https://buzz
   - AfterEach 

but you wouldn’t get a span that covers the entire spec and has the Before/After/It etc. nested under it. In fact, even adding RunSpecsWithContext would not solve that particular problem. But I think I see a way to do it with the existing Ginkgo semantics and what otel gives you out of the box. I can share that in more detail if you’d like.

maxcleme commented 4 months ago

Thank you for the detailed answer! To be 100% transparent, that's really close to what we are already doing to achieve such a span hierarchy. For now, we are only wrapping the It function, but a similar approach can be applied to all other nodes of course.

but you wouldn’t get a span that covers the entire spec and has the Before/After/It etc. nested under it. In fact, even adding RunSpecsWithContext would not solve that particular problem. But I think I see a way to do it with the existing Ginkgo semantics and what otel gives you out of the box. I can share that in more detail if you’d like.

I must admit, this is the part where we were assuming RunSpecsWithContext would solve it. Could you explain a bit more why it wouldn't address the root-span issue? Anyway, if you have an alternative workaround to capture that root-span, I'm all ears!

onsi commented 4 months ago

I must admit, this is the part where we were assuming RunSpecsWithContext would solve it. Could you explain a bit more why it wouldn't address the root-span issue?

Sure: in Ginkgo each node gets its own context that lasts for the lifetime of the node. There isn't a separate context that lasts for the lifetime of the spec. If we inject a root context and then append each node to it we end up with:

- Test Suite X (root span)
  - BeforeSuite
  - BeforeEach
  - Test A It
      - HTTP GET https://foo
      - HTTP GET https://bar
  - AfterEach
  - BeforeEach
  - Test B It
      - HTTP GET https://fizz
      - HTTP GET https://buzz
   - AfterEach 

Note that the BeforeEach/AfterEach are not nested under a Test A span. they are simply hanging out under the root span. This is different from what you'd originally described:

- Test Suite X (root span)
  - BeforeSuite
  - Test A
    - BeforeEach
    - Test A
      - HTTP GET https://foo
      - HTTP GET https://bar
    - AfterEach

What governs the start and end of Test A? That would be missing and you wouldn't get it for free with RunSpecsWithContext(). I'll put together an example that I think will work and share it next.

onsi commented 4 months ago

ok, it's a bit of code - but that may be the nature of taking two hierarchical things (spans and Ginkgo) and mushing them together. First I'd create a new helper type (this could live in its own package so you could reuse it between different specs):

type Spanner struct {
    lock *sync.Mutex
    tracer otl.Tracer
    span *otl.Span
}

func NewSpanner(tracer orl.Tracer) *Spanner {
    return &Spanner{
        lock: &sync.Mutex{},
        tracer: tracer,
    }
}
//starts a new span that lasts the lifecycle of the passed-in context
func (s *Spanner) Push(ctx context.Context, spanDescription string, opts …tracer.SpanStartOptions) context.Context {
    s.lock.Lock()
    defer s.lock.Unlock()

    parentSpan := s.span
    if parentSpan != nil {
        ctx = tracer.ContextWithSpan(ctx, parentSpan) //attach the current span to context
    }
    ctx, childSpan := tracer.Start(ctx, spanDescription, opts…) //push the new span on
    s.span = childSpan //update the current span
    context.AfterFunc(ctx, func() {
        s.lock.Lock() //ugly, but necessary since context.AfterFunc runs in a goroutine
        defer s.lock.Unlock()
        childSpan.End()
        s.span = parentSpan //restore the parent span
    })
    return ctx
}

this thing knows how to push and pop stacks onto contexts. rather than pass around a mess of context and spans you just have a single spanner that your suite uses like so:

var spanner *Spanner
g.BeforeSuite(func(bsCtx context.Context) { //or SynchronizedBeforeSuite if you have some shared setup and want to support parallel specs
    spanner = NewSpanner(tracer)

    //set up the overarching span for the suite
    suiteCtx, cancel := context.WithCancel(context.Background())
    g.DeferCleanup(cancel)
    spanner.Push(suiteCtx, "Test Suite")

    bsCtx = spanner.Push(bsCtx, "BeforeSuite")
   //… do rest of BeforeSuite setup with bsCtx to get it traced
})

// a single BeforeEach that sets up a span for the spec and sets up its span
g.BeforeEach(func() {
    //set up the overarching span for the spec
    specCtx, cancel := context.WithCancel(context.Background())
    g.DeferCleanup(cancel) //runs at the end of this spec, automatically cleans up the span
    ctx = spanner.Push(ctx, CurrentSpecReport().FullText()) //push the span for the spec
}

func BeforeEach(f func(ctx context.Context)) {
    g.BeforeEach(func(ctx context.Context) {
        f(spanner.Push(ctx, "[BeforeEach]")) //push the span for the BeforeEach
    })
}

func AfterEach(f func(ctx context.Context)) {
    g.AfterEach(func(ctx context.Context) {
        f(spanner.Push(ctx, "[AfterEach]")) //push the span for the AfterEach
    })
}

// can add a similar wrapper for DeferCleanup but we'd need to chat about that

func It(text string, f func(ctx context.Context)) bool {
    return g.It(text, function(ctx context.Context) {
       f(spanner.Push(ctx, "[It] " + text))
    })
}

This shows how we make a suite level context, a spec level context, and then take the ginkgo generated node-level contexts and string everything together with the trace correctly using Spanner. Note that this leverages the new context.AfterFunc in Go 1.21. If you can't use that you can adapt the code to use one of many context-merging libraries out there to accomplish the same thing. what's nice with this approach is all the span logic is in spanner and the lifecycle of the span is completely tied to the lifecycle of the context. Ginkgo guarantees the node contexts are cancelled when the node ends.

maxcleme commented 4 months ago

Thank you very much for taking the time to hack this; it's truly a clever implementation. To be honest, we are currently merging contexts, which does the job, but it doesn't feel as clean as your solution! We can use Go 1.21, so it shouldn't be an issue to implement the same strategy somehow.

I would say that I now have all the building blocks required to fulfill my needs, so I'm fine if you want to close this issue. However, you didn't answer why RunSpecsWithContext() wouldn't have worked? 😁

onsi commented 4 months ago

hopefully not too clever! I hope it works for y'all and solves the underlying problem.

all I was saying with RunSpecsWithContest() is that it would give you that root suite context. but it would not give you a fresh spec context for each spec. The traces would look like:

- Test Suite X (root span)
   ...
  - BeforeEach
  - Test A It
      - HTTP GET https://foo
      - HTTP GET https://bar
  - AfterEach
   ...

now you'll get

- Test Suite X (root span)
   ...
  - Test A  // <== this level of nesting for the whole spec
    - [BeforeEach]
    - [It] Test A
      - HTTP GET https://foo
      - HTTP GET https://bar
    - [AfterEach]
   ...
maxcleme commented 4 months ago

Got it, now it's clear. Anyway, I suppose the issue could be closed now.

Your input has been very important, it has helped us clarify our upcoming actions. As stated earlier, we are currently using context merging, but your input has provided us with a solid alternative solution if we encounter any issues. We really appreciate you sharing your ideas, experience and time.

Thank you so much!

onsi commented 4 months ago

Sure thing - would be happy to help! And FWIW it makes sense that you were context merging - AfterFunc is very new.

I’ll close this out for now but feel free to re-open it or open a new issue if you need. I’m also often in the ginkgo channel on the golang slack if you ever want to chat with more back and forth.