[meta] Improvements in current Logstash testing infrastructure

purbon commented 8 years ago

The motivation of this issue is to track and discuss the logstash efforts to improvements our current testing efforts in regards of unit, integration and acceptance testing.

Test should have proper organization, so it's easy for developers to know they should fulfil all necessary gaps before thinking the feature is testing complete.
- Coverage can help knowing there are missing parts.
Test should be able to run locally and remotely. Developers should not have to rely on the CI infra, but they should be able to either run vs a local VM (for example to debug issues in specific distros), or vs their local environment (properly sandboxed).
Invoking each kind of test should be a one command call, including acceptance and integration test.
A set of integrations test, using common configuration should be created. This should be part of the logstash-core project and don't necessary have to use the logstash command interface, just create internal pipelines. This must be our white box testing efforts, and must be mostly focused on testing pipeline interaction with plugins. Some of this is already available inside our testing code base, but should be more clearly DRY'ed.
Another set of integration test should be created where the integration is tested from the user point of view, this will include external components like elasticsearch, kafka, etc. In correlation with the acceptance test layer this will benefit us checking how LS is behaving. This will necessarily need a one command that deals with this external components, providing users of this test with simplicity. There has been a PoC experiment by @talevy at https://github.com/talevy/elastic-stack-tests, we shuold also leverage existing cross platform efforts.
We should have complete acceptance test that run regularly (each commit/nightly/...). Acceptance test are basically automatic validations for our deliverables from the user point of view, so they should check for common user actions and behaviour. Related #4868 #2852 .

The initial plan to get this up and running would be.

[ ] Organize internal testing code base as clean as possible so developers know which testing parts they should always fill.
[ ] Provide internal helpers to facilitate the creation of internal pipelines and later on the retrieval of test expectations out of them.
[ ] Create a first iteration of an acceptance testing layer, this will include testing:
- [ ] Plugin manager behaviour, including CLI flags. #5020 (user stories)
- [ ] Standard apache config.
- [ ] bin/logstash CLI flags.
- [ ] Package installation and upgrade behaviours.
All this should be tested in all supported OS, and by using the generated packages from our codebase (aka snapshots).
[ ] Create/investigate on ways of spinning up external services to interact with things like kafka, elasticsearch, redis, etc..
[ ] Create a first iteration of an integration testing layer (black box), this could be focused on testing vs elasticsearch, or kafka output.
[ ] Develop a way of running the integration test either locally or remotely.
[ ] Develop a way of running the acceptance test either locally or remotely. (related to previous one)

Some random final notes:

Some this points don't need a fresh start as they are, in part, already there, but we need a more organized way of being sure we use them regularly.

If you are working on one of this task, please add your name on it.

Please attach issues/pulls/etc to each task so we're able to track the evolution of this here.

I tried to keep this only focused on behaviour and what we need/want for our testing infra, specific technology decisions should be keep in specific issues for each task.

And last but not least, this is not a complete issue, all your feedback is more than welcome!

This would not has been possible without the meaningful initial feedback from @talevy and @suyograo

ph commented 8 years ago

One thing I see missing from this plan, is the strategy behind dependent test or jobs.

I have came across some issues in the plugins that would have been catch if their test were triggered after the main core test are run. (this could be done daily) Maybe we could illustrate that in a flow chart.

Another point we have to take into consideration, these tests need to be run relatively fast. (IE, the all plugins is really slow)

Coverage can help knowing there are missing parts.

Coverage is a tool, not a solution, I think we should trigger a warning before we merge something if the coverage drop too much. I am sure solution already exist for that.

Small change here.

Develop a way of running the acceptance test locally AND remotely. It need to be identical to have reproducable run.

Ideally we should be able from within Logstash to do the following rake test:all and all the tests should be able to run. (I am OK if they take a while), Maybe the CI can use the same command I don't know.

All test tooling need to accept a seed and when possible be able to run a subset of the example for faster run/debugging session.

@purbon So far a great summary of what we want to achieve

ph commented 8 years ago

In last few releases I come in contact with a few missing translation issue I think we should do something about it. #5119

The default behavior of the I18n gem is to gracefully fallback to a "missing translation message". This behavior is fine when releasing an artifact but I think when we are testing the CLI, the plugin manager or anything that involves using the yml files for user string we should change this behavior and make it *FAIL widly.

So when running the tests we might want to add special code to change this behavior, WDYT?

andrewvc commented 8 years ago

I agree with everything in here. That's a pretty creative idea WRT I18n ph. It makes sense and is a common source of minor bugs.

As far as priority goes I think acceptance testing is #1 from my perspective.

We also need to evaluate what we want to move to Java and what will stay in ruby. If we're going to rewrite something soon it'd be better to make sure it has good Java tests and not waste time on ruby ones. OTOH writing good ruby tests makes for fewer subtle bugs in the porting process.

ph commented 8 years ago

I think the next step for this would be clarify how the developper/ci workflow would be and see if there any existing solution that exist or what we need to developped?

ph commented 8 years ago

Another point to add here that we missed and Its pretty important (thinking about @talevy ) the test need to be side effect free. When I run the test locally it shouldn't change my current environment.

suyograo commented 8 years ago

Good plan! A couple of additional points:

It would be nice to see another avenue for testing -- We can easily collect a lot of user/customer configuration + sample data. We then remove the input and output parts and keep only filters section. This would provide a real world acceptance test for user configs. It'd be nice if we could drop in the config, sample data and expected output in one location. Then a test runs and validates using these fixtures. Point is, it should be easy to drop in the config and sample data to the fixture lib and run tests against it.
Change current rspec tests tags from redis=true, integration=true, elasticsearch=true and just use the tag integration. Also move those integration tests to the new directory organization.

Test should have proper organization, so it's easy for developers to know they should fulfil all

+1. how about tests/integration, tests/acceptance, tests/unit

Few additional tests:

Tests with different inputs: dataset with lots of heavy documents (10MB each for example), large # of documents (>= 10MM lines.. this would have caught #4977 for example)
Mid Term: Performance regression tests for core and critical plugins. If I make a change to Redis plugin, I should know if i regressed performance compared to the previous baseline.
Long term: Multi deployment test - 3 LS instance sending to 3 node ES cluster.

Agree with @andrewvc biggest bang for buck ATM is to get the acceptance tests running consistently on CI.

purbon commented 8 years ago

Thanks you all for your feedback, this is much appreciate it!! Trying to provide now a bit of feedback to each of you.

purbon commented 8 years ago

@ph ->

One thing I see missing from this plan, is the strategy behind dependent test or jobs.

I have came across some issues in the plugins that would have been catch if their test were triggered after the main core test are run. (this could be done daily) Maybe we could illustrate that in a flow chart.

This is actually a very good point, defining how this dependencies should work is important!! we should see first, which are the basic tool set of test we aim for, and then how they could interact one after the other. We should open an specific issue for this.

Coverage is a tool, not a solution,

when talking about coverage I was not seeing this as an specific tool, but seeing this a something that should show us if we're (human understandable), missing some necessary test for a given feature. Must say not clear to me still how to solve that.

Develop a way of running the acceptance test locally AND remotely. Ideally we should be able from within Logstash to do the following rake test:all

We've test that will be able to run like you say here, but things like package acceptance test will not able able to work that way in all cases, we should see in which cases it makes sense to run locally/remote and in which cases we should aim for boths. The local run should not be a blocker for us to deploy some tests, while keeping this in mind we can always iterate and move towards this when it makes sense. I basically say that because some kind of test locally need isolation, and archive this is not an easy thing to do.

All test tooling need to accept a seed and when possible be able to run a subset of the example for faster run/debugging session.

my idea is to base all our testing tooling around rspec as framework, this give us a solution for this very valid point.

The default behavior of the I18n gem is to gracefully fallback to a "missing translation message". This behavior is fine when releasing an artifact but I think when we are testing the CLI, the plugin manager or anything that involves using the yml files for user string we should change this behavior and make it *FAIL widly.

Can you elaborate this a little bit more? is interesting to know more your thinking here.

I think the next step for this would be clarify how the developper/ci workflow would be and see if there any existing solution that exist or what we need to developped?

first step for now, and the most pressure, as we don't have it properly running are acceptance test for release artifacts. But we can certainly team up and see how our CI workflow can get improved, we can open a collateral issue to discuss approaches here.

Another point to add here that we missed and Its pretty important (thinking about @talevy ) the test need to be side effect free. When I run the test locally it shouldn't change my current environment.

this is what I shared earlier, we should aim for this and iterate around the way to get it, I don't us block without test because we can not get side effect free test in local machines. Do you get what I mean?

purbon commented 8 years ago

@andrewvc ->

As far as priority goes I think acceptance testing is #1 from my perspective.

+1 on this, as soon as we've this online we can iterate and move forward with other kind of test, we for sure will use different approaches, but still reuse code also.

We also need to evaluate what we want to move to Java and what will stay in ruby

can you elaborate a bit more on this? at my understanding staying with ruby test is not bad and also would be good as will benefit faster writing of test.

purbon commented 8 years ago

@suyograo ->

It would be nice to see another avenue for testing -- We can easily collect a lot of user/customer configuration + sample data.

I like this point, this fits in the side of having nice integration testing for the pipeline level, :+1:

Also move those integration tests to the new directory organization.

:+1:

Mid Term: Performance regression tests for core and critical plugins. If I make a change to Redis plugin, I should know if i regressed performance compared to the previous baseline.

wondering how this related to the performance test we have currently? is this something to improve there might be?

Long term: Multi deployment test - 3 LS instance sending to 3 node ES cluster.

love that, and is certainly on this plan. @andrewvc I do really need/love/require your expertise (we should reuse your idea/code for this)

ph commented 8 years ago

We've test that will be able to run like you say here, but things like package acceptance test will not able able to work that way in all cases, we should see in which cases it makes sense to run locally/remote and in which cases we should aim for boths.

I think we should still have the goal to be able to run everything locally even package acceptance testing. I don't see why they should be different.

Concerning the I18N, I will make another issue so this doesn't change the current flow.

ph commented 8 years ago

@suyograo

It would be nice to see another avenue for testing -- We can easily collect a lot of user/customer configuration + sample data. We then remove the input and output parts and keep only filters section. This would provide a real world acceptance test for user configs. It'd be nice if we could drop in the config, sample data and expected output in one location. Then a test runs and validates using these fixtures. Point is, it should be easy to drop in the config and sample data to the fixture lib and run tests against it.

This is exactly what I had in mind:

Provide a source data
Provide a filter chain
Provide an expectation file which could be a simple json file.

We can later improve step 1 and 3 to include physical service.

purbon commented 8 years ago

@ph because I can not install a .deb package in a windows machine for example, and there might be commands that works one way in redhat, but not in another machine, ... I'm not excluding this forever, just saying to iterate and start having test, then we move to local (as this needs isolation, and is not easy to archive).

purbon commented 8 years ago

@ph

This is exactly what I had in mind:

Provide a source data Provide a filter chain Provide an expectation file which could be a simple json file. We can later improve step 1 and 3 to include physical service.

should be archived by improving the way we do now, the sample helper should be "improved" and let us used in more standard rspec ways.

ph commented 8 years ago

@purbon You don't like the idea of the test to be automatically generated from fixtures?

ph commented 8 years ago

@purbon @andrewvc I have created https://github.com/elastic/logstash/issues/5122 to not let the I18n idea pollute this issue.

purbon commented 8 years ago

@ph I would answer with the ambiguous it depends, but generally this is a red flag for me.

ph commented 8 years ago

@purbon would you mind elaborate on that?

Not all the tests need to be fixtures. But I see it as an excellent way of testing filter interaction and it has the benefit of having a lower entry barrier for users. Lets discuss it and see why it can or cannot be a good solution for some part of this infrastructure.

ph commented 8 years ago

Concerning package acceptance test, We should take at look at beats-tester, They use if for testing their package on different platforms, including windows. It uses Vagrant as the hosting part, ansible as the automation and assertion part.

It takes 15 minutes to do a complete run but you can run individual playbook. I find it an elegant solution for the package acceptance test.

ph commented 8 years ago

I also had something similar, for a personal experimentation at https://github.com/ph/logstash-test-environments

purbon commented 8 years ago

Hi, my thoughts on:

It uses Vagrant as the hosting part, ansible as the automation and assertion part.

Vagrant as VM manager is kinda of non brainer, we need to use a simple command that lets us spin up machines and this is the common denominator tool with all the stack, so no questions here.

for the ansible part, for the provisioning is good, however for the assertion part is somehow an abuse of the purpose of this tool. I'm not discarding this forever, but we should be sure to take small steps and not get into another trap like we had with docker, or kitchen.ci . I propose we start as simple as possible, and we can iterate.

It takes 15 minutes to do a complete run but you can run individual playbook.

Speed is always in mind, however anything that uses vagrant and VM for acceptance test will be hit with a degree of slowness, actually provisioning is kind of the slowest part here.

I find it an elegant solution for the package acceptance test.

This is argumentable, using ansible is possible but for provisioning, using it as a test runner/executor is abusing the purpose of the tool.

suyograo commented 8 years ago

This is exactly what I had in mind:

Provide a source data

Provide a filter chain

Provide an expectation file which could be a simple json file. We can later improve step 1 and 3 to include physical service.

@ph @purbon @talevy I just came across this excellent tool written by @magnusbaeck for testing filters -- https://github.com/magnusbaeck/logstash-filter-verifier. This is exactly what we need IMO to drag-and-drop filters + sample data and validate them. @magnusbaeck we would love to collaborate with you on this and use it in our testing framework.

magnusbaeck commented 8 years ago

@magnusbaeck we would love to collaborate with you on this and use it in our testing framework.

Oh, great! Feedback is certainly welcome.

(Unless Docker's Linux-only property is a no-no I'm not sure why you're talking so much about Vagrant and so little about Docker, especially if overhead is seen as an issue.)

ph commented 8 years ago

@magnusbaeck @suyograo oh nice tool, we should certainly leverage this.

purbon commented 8 years ago

yes, worth exploring and seeing how to used for our tests, thanks for this contribution @magnusbaeck !

elastic / logstash

[meta] Improvements in current Logstash testing infrastructure #5013