pact-foundation / pact-specification

Describes the pact format and verification specifications

Support for extending Pact with plugins #83

Closed mefellows closed 2 years ago

mefellows commented 3 years ago

Project tracking board

https://github.com/pact-foundation/pact-plugins/projects/1

Background

Pact was created initially to support the rise of RESTful microservices and has grown to be the de-facto API contract testing tool.

One of the strengths of Pact is its specification, allowing anybody to create a new language binding in an interoperable way. Whilst this has been great at unifying compatibility, the sprawl of languages makes it hard to add significant new features/behaviour into the framework quickly (e.g. GraphQL or Protobuf support).

The "shared core"

We have attempted to combat this time-to-market problem by focusing on a shared implementation (the "shared core") across the languages. We initially bundled Ruby because it was convenient, but have been slowly moving to our Rust core, which solves many of the challenges that bundling Ruby presented.

It is worth noting that the "shared core" approach has largely been a successful exercise in this regard. There are many data points, but the implementation of WIP/Pending pacts was rolled out to the libraries that wrapped Ruby in just a few weeks (elapsed time, not effort). In most cases, an update of the Ruby "binaries", mapping flags from the language-specific API to dispatch to the underlying Ruby process, a README update and a release was all that was required. In many cases, new functionality is still published with an update to the Ruby binary, which has been automated through a script.

Moving beyond HTTP

But the industry has continued to innovate since Pact was created in 2013, and RESTful microservices are only one of the key use cases these days - protocols such as Protobuf and GraphQL, transports such as TCP, UDP and HTTP/2, and interaction modes (e.g. streaming or server-initiated) are becoming the norm. Standards such as AsyncAPI and CloudEvents are also starting to emerge.

For example, Pact is still a rather HTTP-centric library, and the mixed success in retrofitting "message support" into all languages shows that extensions outside of this boundary aren't trivial and are, in some respects, second-class citizens.

The reason is simple: HTTP doesn't change very often, so once a language has implemented a sensible DSL for it and integrated with the core, it's mostly a matter of fine-tuning. Adding message pact is a paradigm shift relative to HTTP, and requires the language author to consider a whole new developer experience: authoring tests, integrating with the core and so on.

Being able to mix and match protocol, transport and interaction mode would be helpful in expanding the use cases.

Further, being able to add custom contract testing behaviour for bespoke use cases would be helpful in situations where we can't justify the effort to build into the framework itself (custom protocols in banking such as AS2805 come to mind).

To give some sense of the magnitude of the challenge, I put the table below together well over a year ago; it shows some of Pact's deficiencies across popular microservice deployments. In my consulting career (which not-so-coincidentally also aligns quite closely with my Pact maintainership) I've encountered all of those technologies in one form or another.

[image: table of Pact deficiencies across popular microservice deployments]

The "shared core" approach can only take us so far, and we need another mechanism for extending behaviour outside of the responsibilities of this core. This is where I see a plugin approach working with our "shared core" model.

Objectives

  1. Increase the capability and richness of the Pact ecosystem
  2. Reduce time-to-market for a new feature
  3. Reduce the barrier to entry for creating new features (previously, to have a broad impact you had to know one of two fairly obscure languages: Ruby or Rust)
  4. Increase the number of contributors making new features for Pact (should mostly flow from [3])
  5. Make it easy to use a new feature

Proposal

The current proposal would involve:

  1. Creating an HTTP (or RPC-style, such as gRPC) plugin infrastructure in the Pact Reference library that is plugin aware and can communicate with a user-configured plugin (I have already spiked this with Golang)
  2. Updating each implementation to support a generic plugin type (potentially namespaced by the plugin name)
  3. Supporting serialising of arbitrary interaction types in the pact file
  4. (eventually) creating a rich support library (probably an extension of one of the existing crates, such as libmatching) that can help reduce boilerplate for each plugin (e.g. for flexible matching)

Example serialised pact file:

{
  "consumer": {
    "name": "TCPConsumer"
  },
  "provider": {
    "name": "TCPProvider"
  },
  "interactions": [
    {
      "type": "tcp",
      "description": "a hello request",
      "request": {
        "message": "hello"
      },
      "response": {
        "message": "world!"
      }
    }
  ],
  "metadata": {
    "pactSpecification": {
      "version": "4.0.0"
    },
    "plugin": {
      "name": "pact-foundation/tcp",
      "version": "1.0.0",
      "delimiter": "\r\n"
    }
  }
}

A type attribute could be added to interactions (see https://github.com/pact-foundation/pact-specification/issues/79) to denote that this is a non-standard interaction (there may need to be other discriminating information).

A separate section of the metadata could be used to store plugin specific configuration.
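
As a rough illustration of what a language implementation might do with these two extensions, here is a minimal Go sketch that parses the plugin metadata and forwards non-standard interactions to the named plugin. All struct and field names below are assumptions, not part of the specification:

package main

import (
    "encoding/json"
    "fmt"
    "os"
)

// PluginMetadata captures the proposed "plugin" metadata block.
// Plugin specific keys (e.g. "delimiter") would be preserved for the plugin.
type PluginMetadata struct {
    Name    string `json:"name"`
    Version string `json:"version"`
}

// Interaction: the "type" attribute discriminates non-standard interactions;
// request/response bodies stay opaque to the core framework.
type Interaction struct {
    Type        string          `json:"type"`
    Description string          `json:"description"`
    Request     json.RawMessage `json:"request"`
    Response    json.RawMessage `json:"response"`
}

type PactFile struct {
    Interactions []Interaction `json:"interactions"`
    Metadata     struct {
        Plugin PluginMetadata `json:"plugin"`
    } `json:"metadata"`
}

func main() {
    data, err := os.ReadFile("pact.json")
    if err != nil {
        panic(err)
    }
    var pact PactFile
    if err := json.Unmarshal(data, &pact); err != nil {
        panic(err)
    }
    for _, i := range pact.Interactions {
        // Interactions the core doesn't recognise are forwarded verbatim
        // to the plugin named in the metadata.
        fmt.Printf("dispatching %q interaction to plugin %s\n", i.Type, pact.Metadata.Plugin.Name)
    }
}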

Pros/Cons

The benefit of this approach would be that, from a framework perspective, a single plugin infrastructure could be created once and any number of plugins could then leverage it.

It could also open up a much richer contributor community, as a plugin could be written once, in any language of the contributor's choosing, and contribute a new feature to the entire framework in a single go.

The main downside is that because it's not part of the framework, it may suffer from not being a "first class citizen".

I see the plugin approach as a way of assessing product viability - if a plugin gains popularity/momentum, it could be a candidate for incorporating into the framework proper.

Caveats

Design

Plugin Design - Consumer

High Level Summary

  1. User is responsible for starting the plugin following plugin specific documentation. The plugin must start an administration HTTP server, which will be used by the framework to communicate instructions for each Test Session (a sketch of such an API follows this list)
  2. Pact is given plugin specific configuration - including the administration API details - which is then sent to the administration server to initialise a new test session. This step should result in a new service being started for use by the test code (e.g. a TCP socket or a protobuf server) and a unique session ID being returned. Each session must be thread safe and isolated from any other sessions
  3. The Pact framework will maintain the details of the Test Session - including interactions, failures, logs etc.
  4. The calling code is now able to add Interactions to the plugin, which are stored by the framework and registered with the plugin. The plugin is responsible for defining what an Interaction looks like and how it should be passed in for its specific combination of protocol, payload, transport and interaction type.
  5. During Test Execution, the calling code communicates directly with the Mock Service provided by the plugin. The Mock Service is responsible for handling the request, comparing it against the registered interactions, and returning a suitable response. It must keep track of the interactions that were matched during the test session.
  6. After each individual Test Execution, verify() is called to check whether the expected Interactions matched the actual Interactions. Any mismatches are retrieved from the plugin and returned to the caller.
  7. If the Test Session was successful, write_pact() is called, which writes out the actual pact file.
  8. The plugin is shut down by the User code.
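
To make steps 1, 2 and 6 concrete, here is a minimal sketch of the kind of administration server a plugin might run, assuming Go. The endpoint paths, payload shapes and port numbers are all illustrative assumptions - nothing here is specified yet:

package main

import (
    "encoding/json"
    "fmt"
    "net/http"
)

// session holds the framework-registered interactions for one isolated
// Test Session (locking elided; sessions must be thread safe).
type session struct {
    interactions []json.RawMessage // opaque, plugin-defined interaction payloads
    matched      []int             // indexes of interactions exercised during the test
}

var sessions = map[string]*session{}

func main() {
    // The administration API might expose routes such as:
    //   POST   /sessions                    start an isolated mock service, return its ID and port
    //   POST   /sessions/{id}/interactions  register an expected interaction
    //   GET    /sessions/{id}/verify        report any mismatches for the session
    //   DELETE /sessions/{id}               tear the session down
    http.HandleFunc("/sessions", func(w http.ResponseWriter, r *http.Request) {
        id := "session-1" // a real plugin would generate a unique ID per session
        sessions[id] = &session{}
        // ... start a session-scoped mock service (e.g. a TCP listener) here ...
        json.NewEncoder(w).Encode(map[string]interface{}{"sessionId": id, "mockPort": 9000})
    })
    fmt.Println(http.ListenAndServe(":8080", nil)) // the administration server
}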

Consumer Sequence Diagram

[image: pact_consumer_plugin_sequence]

Example consumer test

Here is an example for a raw "hello world" TCP provider. It should respond with "world!" if "hello" is sent:

func TestPluginPact(t *testing.T) {
    // Start plugin
    go startTCPPlugin()

    provider, err := v3.NewPluginProvider(v3.PluginProviderConfig{
        Consumer: "V3MessageConsumer",
        Provider: "V3MessageProvider", // must be different to the HTTP one, can't mix both interaction styles
        Port:     4444,                // Communication port to the provider
    })

    if err != nil {
        t.Fatal(err)
    }

    type tcpInteraction struct {
        Message   string `json:"message"`   // consumer request
        Response  string `json:"response"`  // expected response
        Delimiter string `json:"delimiter"` // how to determine the message boundary
    }

    // Plugin providers could create language specific interfaces that accept well defined types
    // The raw plugin interface accepts an interface{}
    provider.AddInteraction(tcpInteraction{
        Message:   "hello",
        Response:  "world!",
        Delimiter: "\r\n",
    })

    // Execute pact test
    if err := provider.ExecuteTest(tcpHelloWorldTest); err != nil {
        log.Fatalf("Error on Verify: %v", err)
    }
}
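
To connect this test to step 5 of the summary above, here is a rough sketch of what the mock service inside such a TCP plugin might do: read a delimited message, compare it against the registered interactions, and reply. All names here are illustrative assumptions, not part of any defined plugin API:

package main

import (
    "bufio"
    "net"
    "strings"
)

type tcpInteraction struct {
    Message, Response, Delimiter string
}

// serveMock handles one test session's TCP mock service: it reads a
// delimited message, looks for a registered interaction that matches,
// and replies. recordMatch lets verify() later report what was exercised.
func serveMock(l net.Listener, interactions []tcpInteraction, recordMatch func(tcpInteraction)) {
    for {
        conn, err := l.Accept()
        if err != nil {
            return // listener closed, session torn down
        }
        go func(c net.Conn) {
            defer c.Close()
            // Assumes a "\r\n" delimited, line-oriented protocol.
            msg, err := bufio.NewReader(c).ReadString('\n')
            if err != nil {
                return
            }
            msg = strings.TrimRight(msg, "\r\n")
            for _, i := range interactions {
                if i.Message == msg {
                    recordMatch(i)
                    c.Write([]byte(i.Response + i.Delimiter))
                    return
                }
            }
            // No match: verify() would report this as an unexpected request.
        }(conn)
    }
}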

Plugin Design - Provider

High Level Summary

  1. User is responsible for starting the plugin following plugin specific documentation. The plugin must start an administration HTTP server, which will be used by the framework to communicate instructions for each Test Session
  2. Pact is given plugin specific configuration - including the administration API details - which is then sent to the administration server to initialise a new provider Test Session.
  3. The user starts the Provider Service, and runs the verify() command
  4. Pact fetches the pact files (e.g. from the broker), including the pacts for verification details if configured, and stores this information.
  5. For each pact, the framework will be responsible for configuring provider states, and sending each interaction from the pact file to the plugin. The plugin will then perform the plugin-specific interaction, communicating with the Provider Service and returning any mismatches to the framework (see the sketch after this list). This process repeats for all interactions in all pacts.
  6. The Pact framework will maintain the details of the Test Session - including pacts, interaction failures, pending status, logs etc.
  7. Pact calculates the verification status for the test session, and optionally publishes verification results back to a Broker
  8. The Pact client library then conveys the verification status, and the User terminates all processes.
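
As referenced in step 5, here is a rough sketch (assuming Go, and reusing the tcpInteraction shape from the consumer example) of how such a plugin might replay a single interaction against the running Provider Service and report a mismatch back to the framework. Everything here is an illustrative assumption:

package main

import (
    "bufio"
    "fmt"
    "net"
    "strings"
)

type tcpInteraction struct {
    Message, Response, Delimiter string
}

type mismatch struct {
    Expected, Actual string
}

// verifyInteraction replays one interaction from the pact file against the
// user's running Provider Service and reports any mismatch to the framework.
func verifyInteraction(providerAddr string, i tcpInteraction) (*mismatch, error) {
    conn, err := net.Dial("tcp", providerAddr)
    if err != nil {
        return nil, fmt.Errorf("provider service unreachable: %w", err)
    }
    defer conn.Close()

    // Send the recorded request...
    if _, err := conn.Write([]byte(i.Message + i.Delimiter)); err != nil {
        return nil, err
    }
    // ...and compare the provider's actual response with the expectation.
    actual, err := bufio.NewReader(conn).ReadString('\n')
    if err != nil {
        return nil, err
    }
    if actual = strings.TrimRight(actual, "\r\n"); actual != i.Response {
        return &mismatch{Expected: i.Response, Actual: actual}, nil
    }
    return nil, nil // matched
}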

Provider Sequence Diagram

[image: pact_provider_plugin_sequence]

Example provider test

Here is an example for a raw "hello world" TCP provider test.

func TestV3PluginProvider(t *testing.T) {
    go startTCPPlugin()
    go startTCPProvider()

    provider, err := v3.NewPluginProvider(v3.PluginProviderConfig{
        Provider: "V3MessageProvider",
        Port:     4444, // Communication port to the provider
    })

    if err != nil {
        t.Fatal(err)
    }

    verifier := v3.HTTPVerifier{
        PluginConfig: provider,
    }

    // Verify the provider using pacts fetched from the Pact Broker
    err = verifier.VerifyPluginProvider(t, v3.VerifyPluginRequest{
        BrokerURL:                  os.Getenv("PACT_BROKER_URL"),
        BrokerToken:                os.Getenv("PACT_BROKER_TOKEN"),
        BrokerUsername:             os.Getenv("PACT_BROKER_USERNAME"),
        BrokerPassword:             os.Getenv("PACT_BROKER_PASSWORD"),
        PublishVerificationResults: true,
        ProviderVersion:            "1.0.0",
        StateHandlers: v3.StateHandlers{
            "world exists": func(s v3.ProviderStateV3) error {
                // ... do something
                return nil
            },
        },
    })

    assert.NoError(t, err)
}

Considered alternatives

The bulk of this thinking was done over the last year, whilst considering how to achieve a gRPC/Protobuf integration. It's a good candidate because it involves new interaction styles (e.g. streaming, server push), a new transport (HTTP/2) and different protocols (Protobuf, JSON).

Option 1. Build a shared library and link to Rust engine

Rust (the core Pact engine) is famously not dynamic, and very much likes to know about all the code that can run in advance. Whilst libraries can be linked, loading them at runtime - as a general user-defined plugin system would require - is not easily supported (and certainly not recommended).

Option 2: Don't build a plugin ecosystem, just do it in the core

Supporting a generic protobuf server suffers from similar issues to (1) (the need for reflection), and the gRPC/protobuf ecosystem in Rust is fairly poor compared to other languages. So any attempt to do it directly in the Rust core would likely come up short.

I spiked creating a shared library in Golang - which has great support for both gRPC and protobufs - that could be linked at compile time. Whilst I demonstrated that linking this library to Rust would work, I realised that every single language that wanted protobuf support would then need significant changes to integrate it in this way. Ditto for every other change.

Given how long it has taken to move several languages onto the Rust core thus far, this option seemed the least likely to succeed.


uglyog commented 3 years ago

Wow, you beat me to it. I was going to spike a plugin framework and see if I could implement a CSV matcher for #81 that is supported in both Rust and JVM versions.

mefellows commented 3 years ago

Added the consumer sequence diagram and some additional summary information. Note that the test code shown above currently works.

mefellows commented 3 years ago

Added the provider sequence diagram with a summary. I have not yet spiked this.

bethesque commented 3 years ago

👍🏽

uglyog commented 3 years ago

Here are some of the things I've been thinking about:

  1. How are plugins found? Can they be automatically installed from somewhere, or is it manual?

  2. Can we avoid having to manage separate processes? What about platforms that don't support DLLs? (Alpine/musl)

  3. Can we avoid the HTTP overhead if everything is running local?

  4. How do we manage plugin dependencies? Plugins written in Ruby, Python, JVM, .Net all have system runtime dependencies and may require a particular version. Ruby gems and Python modules may require specific system libraries to be installed.

  5. If the plugins run as separate processes, what happens when the process dies?

  6. With separate processes, how do we provide a consolidated view of logs?

  7. How do all the language DSLs know about the functionality provided by the plugin?

  8. Can plugins provide more functionality than just the protocol and transport? For instance, a plugin that provides new types of matchers or generators.

  9. How do plugins know about the functionality provided by other plugins? For example, a new protocol receives a CSV payload, but matching that is provided by another plugin. How can they invoke them?

I have some thoughts on how to address 1-4 and 8 and 9.

mefellows commented 3 years ago

Thanks Ron, great questions!

To answer your questions (at least as they pertain to the proposal above), it's worth mentioning a few things first:

  1. Different plugin types may have different needs, and therefore the HTTP based approach may not be suitable for all plugin types (e.g. for matching, it might be better served via a shared lib).
  2. The current philosophy is to embrace the polyglot nature of the toolchain (benefits described above). This implies plugins should be able to be written in any language/runtime the author chooses. This will almost certainly lead to portability issues and differences in how processes are managed (e.g. an author creating a plugin specific to their immediate use case), but that's an acceptable tradeoff I think as we can always look to improve the situation.
  3. Specifically on the "matching" plugin type, one wonders if that is still best contributed directly to the core, rather than via an extension (for the reasons described above). We could create an FFI approach for contributors to create new matching types that can be used both by core Pact libraries, and also any plugin itself. I'm keen to hear your thoughts, because mine are not very fleshed out in this regard.
  4. On the ability to have more modularity in the plugin ecosystem (e.g. separate plugins for transport, protocol, matching etc. that can then be mixed and matched), I don't yet have a good answer. The "considered alternatives" section touches on some of the issues/constraints that I ran into, and why I think the HTTP provider is a good starting point. I do have some ideas, but none of them have survived even a small amount of reasoning.

Here are some of the things I've been thinking about:

How are plugins found? Can they be automatically installed from somewhere, or is it manual?

Personally, I'd love to have implemented something like the way Terraform does it. The plugin is installed in a known location and can be discovered by the system automatically. Because the plugin implements a typed interface, the framework knows how to start it, communicate with it, stop it, and so on.

This approach works well in a homogeneous environment (the Hashicorp tooling has now standardised on Golang, so it can take advantage of high performance Go RPC calls etc.). A similar approach may be appropriate for matching libraries for the HTTP based plugins to use.
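
As a sketch of what that discovery could look like in Go (the directory layout and manifest shape below are purely assumptions):

package main

import (
    "encoding/json"
    "os"
    "path/filepath"
)

// manifest describes how to find and start a plugin. The fields and the
// ~/.pact/plugins layout are assumptions for illustration only.
type manifest struct {
    Name    string   `json:"name"`    // e.g. "pact-foundation/tcp"
    Version string   `json:"version"` // e.g. "1.0.0"
    Command []string `json:"command"` // how to start the plugin process
}

// discoverPlugins scans a well-known directory for plugin manifests,
// Terraform-style, so the framework can start plugins on demand.
func discoverPlugins(home string) ([]manifest, error) {
    matches, err := filepath.Glob(filepath.Join(home, ".pact", "plugins", "*", "manifest.json"))
    if err != nil {
        return nil, err
    }
    var found []manifest
    for _, path := range matches {
        data, err := os.ReadFile(path)
        if err != nil {
            continue // skip unreadable manifests
        }
        var m manifest
        if err := json.Unmarshal(data, &m); err == nil {
            found = append(found, m)
        }
    }
    return found, nil
}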

The next approach is the standard shared library approach, but this gets more complicated across platforms (and has the Alpine/musl issue you note). It's hard to discover these at runtime, and Rust is not a fan of runtime things like this.

The HTTP path, whilst not the most performant, is a really simple way to inter-operate across languages. Two examples that use this approach come to mind.

Can we avoid having to manage separate processes? What about platforms that don't support DLLs? (Alpine/musl)

As above, it depends on the plugin approach. We could still have the framework start the plugin (the references above know how to do this, so it's doable).

Can we avoid the HTTP overhead if everything is running local?

For *nix systems, sockets could work, but HTTP is very portable in that sense. Any ideas on how to do this cross-platform in a nice way? The upside of HTTP is that it's very comprehensible, and therefore easier to debug.

What were you thinking?
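
One cheap middle ground, sketched below assuming Go: on *nix, the same HTTP admin API can be bound to a Unix domain socket, which avoids the TCP stack while keeping HTTP's debuggability. The socket path is an assumption:

package main

import (
    "net"
    "net/http"
)

func main() {
    // Bind the plugin's admin API to a Unix domain socket instead of TCP.
    l, err := net.Listen("unix", "/tmp/pact-plugin.sock")
    if err != nil {
        panic(err)
    }
    defer l.Close()

    http.HandleFunc("/sessions", func(w http.ResponseWriter, r *http.Request) {
        w.Write([]byte(`{"sessionId":"session-1"}`))
    })
    // http.Serve accepts any net.Listener, so TCP vs unix socket is a
    // one-line difference - the HTTP debuggability is retained either way.
    http.Serve(l, nil)
}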

How do we manage plugin dependencies? Plugins written in Ruby, Python, JVM, .Net all have system runtime dependencies and may require a particular version. Ruby gems and Python modules may require specific system libraries to be installed.

As per above, this is going to depend on the plugin type.

If the plugins run as separate processes, what happens when the process dies?

Indeed this is a challenge. If it were spawned as a sub-process, it shouldn't zombie out. But we know this isn't always possible.

With separate processes, how do we provide a consolidated view of logs?

I considered this. A GET /sessions/:id/logs endpoint was suggested, but each plugin could also be free to log as it wishes. Another option is to have a generic log object in each API response that contains messages to display.
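
For illustration, the generic log object option might look something like this in a plugin's API response (the shape is an assumption, not a proposal detail):

{
  "sessionId": "session-1",
  "result": "ok",
  "logs": [
    { "level": "debug", "message": "received \"hello\", matched interaction 0" },
    { "level": "info", "message": "responding with \"world!\"" }
  ]
}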

We have this problem more generally when integrating the Ruby/Rust processes today, so it's good to be thinking about it now.

How do all the language DSLs know about the functionality provided by the plugin?

They don't. Or at least, the standard raw plugin provider doesn't. On the consumer side, it simply accepts an object which contains a JSON structure that represents the interaction the plugin provider supports. We can't assume much about the interaction, otherwise we'll likely prevent certain use cases.

This is one of the tradeoffs of the above approach - the plugin itself will not have a first class DSL in each language unless the plugin author contributes one. But because the default interface simply accepts a string (probably JSON encoded, but that isn't necessarily a requirement) that corresponds to the general "interaction" model of Pact, it provides enormous flexibility and ultimately enables a plugin to be added to any language without submitting a PR to it.

So the general message is that the plugin interface should be easily wrapped/extended with plugin specific interfaces. As a fairly basic example, here is what the TCP provider could look like with the raw interface vs a specific interface:

Generic interface

    // Plugin providers could create language specific interfaces that accept well defined types
    // The raw plugin interface accepts an interface{}
    provider.AddInteraction(tcpInteraction{
        Message:   "hello",
        Response:  "world!",
        Delimiter: "\r\n",
    })

Typed interface:

    // A plugin-specific wrapper could expose a fluent, typed DSL
    // on top of the same raw interface
    tcpProvider.
        AddInteraction().
        Given("world exists").
        UponReceiving("A request to do a hello").
        WithRequest("hello").
        WillRespondWith("world!").
        DelimitedBy("\r\n")

You could imagine a similar thing with gRPC with protobufs, or GraphQL:

    graphQLProvider.
        AddInteraction().
        Given("world exists").
        UponReceiving("A request to mutate the world").
        WithRequest(model.HelloQuery{
            Message: "$message",
        }).
        WithVariables(map[string]interface{}{
            "message": "hello",
        }).
        WillRespondWith(
            model.HelloWorldResponse{
                Message: "world!",
            },
        )

But ultimately, this will be marshalled to a string and passed to the provider, which will know what to do with it. This is where I imagined a matching library could be useful.

Can plugins provide more functionality than just the protocol and transport? For instance, a plugin that provides new types of matchers or generators.

Yes (but probably not via the HTTP approach)

How do plugins know about the functionality provided by other plugins? For example, a new protocol receives a CSV payload, but matching that is provided by another plugin. How can they invoke them?

Very good question. I'm not sure, but it certainly would be ideal.

ringods commented 3 years ago

Wow! For real: just today, a few colleagues and I were discussing Mountebank versus Pact with regard to this kind of extensibility. Mountebank already has this, but it doesn't have the Pact Broker counterpart.

It might help to have a look at their design:

http://www.mbtest.org/docs/protocols/custom

A step further in alignment would be to make such setups compatible at the protocol level between the core library (Pact/Mountebank) and the custom protocol implementation.

mefellows commented 3 years ago

Thanks @ringods - funny timing! As you'll see above in my comments, mb was one of the tools that inspired this (somebody actually linked it to me on the forums recently, so the credit is theirs).

I haven't thought much beyond creating new "Pact-like" functionality with plugins at this stage, but that's more due to a lack of imagination than anything else. Any ideas on how to achieve this within the stated goals above (or how to expand those goals) would be most welcome.

ringods commented 3 years ago

@mefellows if you are referring to this comment:

https://pact.canny.io/feature-requests/p/support-grpc

That was also me. 😆

mefellows commented 3 years ago

Yes! That was it. I couldn't find it in Slack/email, so I couldn't credit you - thanks for the tip!

codyaray commented 3 years ago

I'm not sure how deeply you've looked into the Terraform plugin system, but it uses this library: https://github.com/hashicorp/go-plugin. We used it in our golang-core CLI project to provide a plugin architecture for a while. It is a solid framework for supporting plugins in multiple languages over gRPC, with all the bells and whistles (logging, stdin/stdout syncing, etc.), and may be good to investigate or model on if you decide to go this route. Note that we ultimately ended up just moving all the plugins into "core" so we had a single (small) executable. Each of the plugins was almost the size of the fully bundled executable due to dependencies, etc., and that wasn't a good product tradeoff for us. But YMMV. :)

mefellows commented 3 years ago

Thanks! gRPC is definitely a potential option here too, due to its performance. I spiked with REST because it's easy for most to understand, implement and debug. I haven't used go-plugin before, but I've had nothing but good experiences with the Hashicorp infra tooling - so that doesn't surprise me!

uglyog commented 3 years ago

Here is a sequence diagram showing how two plugins could work together:

[image: pact-plugin]

mefellows commented 3 years ago

Love it Ron. I think the idea of a manifest, and of creating a very well defined process and scope for how plugins are launched, is a smart one, so that we can simplify the user experience (objective 5). You could imagine a "starter kit" or template for such things, to simplify the starting point for authors too (objective 3).

I'm keen to see the results of the spike for the N:M plugin architecture allowing communication across plugins (or did you do that already?). The value is obvious (i.e. objective 1); my concerns are twofold:

  1. Does this overhead in communication lead to any meaningful performance degradation? (I suspect not at most scales, but I do wonder whether it would at the enterprise scale of some of the Pactflow customers)
  2. How complicated does this make the plugin authoring experience (i.e. objectives 2, 3 and 4)? If authors can choose not to use another plugin and that simplifies the experience, the point is probably moot.

I think it would be helpful to see some pseudo code or descriptions for how an author might actually go about implementing a plugin, and how the concept of plugin types (e.g. matcher, protocol, transport etc.) could work together this way.

mefellows commented 2 years ago

You can follow this feature along at https://github.com/pact-foundation/pact-plugins.

I think we should close this issue: from a specification point of view it has been accepted in v4, and implementation across languages is now underway.