magnusbaeck / logstash-filter-verifier

Apache License 2.0

Multi pipeline and pipeline-to-pipeline communication support #93

Open breml opened 3 years ago

breml commented 3 years ago

Logstash has supported running multiple pipelines since version 6.0. Running multiple pipelines allows users to isolate data flows and helps simplify complex configurations. At the moment there is no good way to test Logstash configurations with multiple pipelines in Logstash Filter Verifier (LFV), especially if pipeline-to-pipeline communication (available since Logstash version 6.3) is used.

The idea of this issue is to discuss how LFV could be extended so that Logstash configurations with multiple pipelines, as well as pipeline-to-pipeline communication, can be tested.

Adding support for pipeline-to-pipeline communication would also allow LFV to wrap more cleanly around the Logstash configuration under test. The flow could go like this:

The advantage of this would be that, without more modifications to the Logstash configuration under test than today (changing inputs and outputs), LFV could adapt to changes in the test suite by changing only the filtering part of the Logstash configuration (some of the input plugins prevent reloading of the Logstash configuration). Additionally, wrapping with pipeline-to-pipeline communication would allow LFV to send LFV-specific control messages to Logstash, which are then received again by LFV. These could signal the start or the end of a test suite run and thereby make the whole process more reliable.
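As an illustration of this wrapping idea (a sketch, not actual LFV behavior; all pipeline IDs and virtual addresses below are made-up names), the setup could be expressed in pipelines.yml roughly like this:

```yaml
# Illustrative only: "lfv_input", "lfv_output", "sut" and the virtual
# addresses are hypothetical names, not actual LFV identifiers.
- pipeline.id: lfv_input          # owned by LFV, stays loaded across runs
  config.string: |
    input { ... }                 # wherever LFV feeds test events in
    output { pipeline { send_to => ["sut_in"] } }
- pipeline.id: sut                # the configuration under test
  config.string: |
    input { pipeline { address => "sut_in" } }
    filter { ... }                # the filters actually being tested
    output { pipeline { send_to => ["sut_out"] } }
- pipeline.id: lfv_output         # owned by LFV
  config.string: |
    input { pipeline { address => "sut_out" } }
    output { ... }                # wherever LFV collects the results
```

Because LFV would own the two outer pipelines, it could inject control messages in lfv_input and watch for them in lfv_output without touching the configuration under test.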

magnusbaeck commented 3 years ago

Interesting idea! So I guess there are three reasons for doing the pipeline wrapping:

  • When implementing daemon mode (#94) we can prevent reloading of the non-filter plugins (assuming Logstash only tears down the pipeline whose configuration files were actually changed).
  • We can add LFV-specific filters that are guaranteed to hit the messages first or last.
  • We can pass control messages without patching the configuration under test to introduce skip-if-control-message conditionals.

I hope we can do this in a backwards-compatible manner.

breml commented 3 years ago

Interesting idea! So I guess there are three reasons for doing the pipeline wrapping:

  • When implementing daemon mode (#94) we can prevent reloading of the non-filter plugins (assuming Logstash only tears down the pipeline whose configuration files were actually changed).
  • We can add LFV-specific filters that are guaranteed to hit the messages first or last.
  • We can pass control messages without patching the configuration under test to introduce skip-if-control-message conditionals.

I hope we can do this in a backwards-compatible manner.

@magnusbaeck you got this exactly right.

About the backwards compatibility: I guess as part of the cleanup and with the introduction of this and the other features (daemon mode #94, and coverage #95), we will raise the bar with regard to the supported version of Logstash anyway. If we head for a version 2.0, the question would be what the minimal version of Logstash is that we need to support. I would argue that users of older versions of Logstash need to update at some point anyway, and until then they are free to continue using LFV v1.0.

We already have some sections in the LFV code base that take care of different versions of Logstash (e.g. at some point the possibility to pass filter configs to Logstash via flags was dropped and a workaround had to be found and implemented). Overall, I guess both code readability and maintainability will benefit from dropping support for (very) old versions of Logstash.

magnusbaeck commented 3 years ago

You're probably right about the backwards compatibility. It's totally reasonable to only support, say, Logstash 6+ for LFV 2.x. If there are important bugfixes or small features that motivate backporting to 1.x, that could always be done on a separate branch. Alternatively, the 2.x work could live on a separate branch. I'll have to read up on how Go modules work in this regard. Another LFV 2.x change could be to restructure the packages so that not all of them (maybe none?) are public.

breml commented 3 years ago

If we decide to move most (or even all) of the packages into internal packages, so that the only non-internal package left is the main package, we are in fact not really affected by the versioning implications of Go modules. Go modules only come into play when others use parts of LFV in their code. Are you aware of such cases? From what I see on https://pkg.go.dev/ and https://godoc.org/, this does not seem to be the case. So yes, I think moving the packages into internal is something we should do.

magnusbaeck commented 3 years ago

This is somewhat confusing. Go Modules: v2 and Beyond says

Starting with v2, the major version must appear at the end of the module path (declared in the module statement in the go.mod file).

and the go command documentation says

To preserve import compatibility, the go command requires that modules with major version v2 or later use a module path with that major version as the final element. For example, version v2.0.0 of example.com/m must instead use module path example.com/m/v2, ...

so it seems we have no choice but to introduce a v2 subdirectory and move almost everything there, i.e. we'd end up with something like this:

README.md
LICENSE
v2/
    go.mod
    go.sum
    cmd/
        lfvclient/
        lfvdaemon/
    internal/
        logstash/
        testcase/
        ...

But after reading https://therootcompany.com/blog/bump-go-package-to-v2/ I realize that it's only the module path, i.e. the module line in go.mod that needs to end with /v2. That go.mod could still be placed in the repository root. Making such a change (i.e. not introducing a v2 subdirectory) would definitely break anyone who's been importing the package, but as you say there are no obvious importers.
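Under that approach, only the module line of the go.mod at the repository root would need to change, something like the following sketch (the /v2 suffix is the essential part; the go directive shown is illustrative):

```
module github.com/magnusbaeck/logstash-filter-verifier/v2

go 1.16
```

Importers would then have to use the /v2 import path, which is why this breaks existing importers even though no files move.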

breml commented 3 years ago

@magnusbaeck What is your conclusion on this topic? Do you still plan to move all the existing packages into internal package directories? If so, this will break all existing importers anyway.

magnusbaeck commented 3 years ago

Yes, I plan to make the changes outlined above, except that I'll skip the v2 subdirectory and just update the module path. I haven't tested it yet.

I got stuck reworking the tests for the testcase package, and that has taken longer than expected. I'll pause that work and migrate the package structure instead. I should have time for that tomorrow night.

jgough commented 2 years ago

This looks excellent! I'm trying to get multiple pipelines to work with the beta, but I'm not seeing any logging that can be used for debugging here. Maybe this is a case for adding some more logging?

The first issue I think I had is that my pipelines.yml has

- pipeline.id: prod
  path.config: "pipeline/prod/"

which exited with: logstash-filter-verifier: error: expect the Logstash config to have at least 1 input and 1 output, got 0 inputs and 0 outputs

Playing around a bit I think I need this format:

- pipeline.id: prod
  path.config: "pipeline/prod/*.conf"

Which seems like a minor bug in LFV if it doesn't accept pipeline paths without wildcards.

I now get: Summary: ☑ All tests: 0/0

So now it is not picking up any of my tests and I'm still trying to debug at the moment. More logging here would be extremely useful in this case! I'm going to try simplifying my test cases and seeing where I get with this.

jgough commented 2 years ago

Looks like if there is a dash in the pipeline.id then it does not work:

If I set the pipeline.id as follows then it works fine:

- pipeline.id: testprod
  path.config: "pipeline/**/*.conf"

But if I change this to

- pipeline.id: test-prod
  path.config: "pipeline/**/*.conf"

Then the daemon run command returns:

☐ Compare actual event with expected event from main.json:
Expected 2 event(s), got 0 instead.
logstash-filter-verifier: error: rpc error: code = Unknown desc = destroy of session failed: session teardown failed: state unknown, failed to wait for state: ready_for_test, <nil>

and the daemon start command logs the following:

failed to wait for Logstash results: timed out while waiting for state: ready_for_test

breml commented 2 years ago

Hi @jgough

Thanks for your feedback. I have fixed the issues "dash in pipeline.id" and "path.config pointing to dir in pipelines.yml" in #125.

To get more debug information, consider the following two options when you start the daemon:

I hope this helps with your future debugging.

jgough commented 2 years ago

Thanks for the advice and for the fixes, that is extremely helpful. I'm getting much closer to getting this working now.

Looks like dashes in plugin IDs are also not working? input { stdin { id => "prod-input" } } does not work, but changing the id to prod_input seems to work.

breml commented 2 years ago

@jgough I just tested an input plugin with a dash (-) in the name and this worked for me with #125 applied. Can you confirm that this works for you as well?

Sorry, I executed the wrong test. It does not work; I will investigate the issue.