kernelci / kernelci-core

Core KernelCI tools
https://kernelci.org
GNU Lesser General Public License v2.1
107 stars 97 forks source link

Event filtering in pipeline #2121

Closed nuclearcat closed 8 months ago

nuclearcat commented 1 year ago

Related to: https://github.com/kernelci/kernelci-core/discussions/1907 At current moment we have very basic event filtering which we need to extend.

I reviewed existing solutions and found following:

Most of them designed to do complex operations with data, not filter matching. There is few more libraries that might be used as filters, but this is not their primary purpose, and they might change "second-class" features at any time.

What we also can do, build our own solution around lexx/yacc or pyparsing.

I propose to add following filters (PoC examples):

Where we can unroll expression to:

['data'] must be equal to 'checkout'
AND
(('branch' must match regex '/mainline/') OR ('branch' must match regex '/stable/'))
AND
('version' must be greater than '6')

So we should support following operators:

What is next? We need to decide if it cover all our use cases, and if it is good enough - we decide how to implement it. Also i dont rule out using some existing library, if someone know one that suits our needs.

gctucker commented 1 year ago

At current moment we have very basic event filtering which we need to extend.

Actually I'm hoping to not need filters with the new YAML format. The legacy system does a lot of things implicitly, and then filters are needed to disable some of that. It's very hard to maintain and debug. With the new system, we should try to positively describe all the jobs that need to be run instead to avoid this issue. One thing that needs to be implemented is some kind of pattern mechanism to generate multiple jobs rather than having every single case describe entirely.

nuclearcat commented 1 year ago

Thats what i am proposing, pattern mechanism. I call it filter, it doesn't really matter. As you said in new pipeline such filter(pattern, whatever you call it) required to match some of node data, so it will decide if it should run it or just ignore. AFAIK there is no such mechanism in place and i need it to filter only "compatible" kernels for particular models of Chromebooks.

gctucker commented 1 year ago

OK there are two different things here, one is about filtering events to not receive ones that aren't relevant and optimise the traffic and the other one is about YAML configuration so that the scheduler behaviour is more explicitly described.

nuclearcat commented 1 year ago

Ok, i will try to explain my point, as i'm seeking solution that will do following:

  - job: tast-codecs-test
    event:
      channel: node
      name: kbuild-gcc-10-x86
      result: pass
      filter:
        - (type="build")AND(branch~=/mainline/)AND(kernel.fragments~=/videodec/)AND(kernel.arch="arm64")
    runtime:
      type: lava
    platforms:
      - qemu-x86

(I know type doesn't exist, but maybe this is one of things to consider adding to schema, it's good to know what this node about, checkout, build, test result, etc.) I think it is self-explanatory why i need to do this.

gctucker commented 1 year ago

Yes that's obviously what the job configuration issue is about and it's one thing I've been trying to work on for the past few weeks. Now I'm planning to return to it next week as it's the only roadmap item showing as late on the roadmap: https://github.com/kernelci/kernelci-pipeline/issues/217

I'll start by describing a bit more what I have in mind and see whether this matches your requirements. We probably can share the implementation too.

gctucker commented 1 year ago

I don't think the filter approach here is needed though, the event entry should already be providing this. The current pipeline entries show a skewed view on how that's supposed to work since kbuild-gcc-10-x86 is just a hard-coded hack to get things bootstrapped. A more realistic entry would probably look like this:

  - job: tast-codecs-test
    event:
      channel: node
      name: kbuild
      result: pass
      revision.tree: mainline
      data.arch: arm64
      data.fragments: videocodec

This example implies the fields have to match the values exactly, in reality we would want any kernel build with the videocodec to be tested even if it also has other fragments enabled. But that's the kind of detail that needs to be worked out, at least the design already takes into account the fact that events should have all the information to explicitly cause a job to be run in the scheduler configuration.

nuclearcat commented 1 year ago

Let's assume we have 10 trees, each have 3 compiler combinations (gcc-10, clang-16, clang-17). All of them have various options to compile, including kselftest kernels. Let's assume at least two. That means at least 60 combinations. I want to test baseline tast tests on all kernels except those include kselftest (thats real use case). With filters i suggest it will be simple to define:

      filter:
        - (type="build")ANDNOT(kernel.fragments~=/kselftest/)

or

      filter:
        - (type="build")&!(kernel.fragments~=/kselftest/)

How we can do that with event entries?

gctucker commented 1 year ago

That's purely an implementation thing, I would personally prefer regex but other logic can be used too. This is not quite a "filter" like we have in the legacy system where things will implicitly match without specifying anything, here it's more clearly stating that such jobs should only be scheduled when the event matches the criteria in YAML. At least that's the intention, the main point here is to maintain flexibility and have a syntax that's easier to maintain and debug than the legacy one.

To answer your question, here's a suggestion - again, it's still all up in the air right now and fragments should be a list so just for the sake of the discussion, let's assume we have the same defconfig string format as with the legacy system:

  - job: tast-codecs-test
    event:
      channel: node
      name: kbuild
      result: pass
      revision.tree: mainline
      data.arch: arm64
      data.defconfig: '(?!.*kselftest).*'

But that's basically doing the work I was planning to do next week :) In this case it does look like a filter, as it's to match events that have all the other fields but not a particular pattern in the defconfig. I think it's the exception rather than the rule with the new scheduler config though, and arbitrary data could be used to say that the build is to be used for videocodec. For example, as a totally arbitrary and hypothetical scenario:

  - job: tast-codecs-test
    event:
      channel: node
      name: kbuild
      result: pass
      revision.tree: mainline
      data.arch: arm64
      data.features.videocodec: true
      data.features.kselftest: false

I noticed lots of people defaulted to using regex with the legacy system rather than the custom filter logic with passlist / blocklist / combinations as it's a more generic and universal way of describing things. If you want to suggest other syntax, I guess that's what the YAML discussion on GitHub is about and we can go through this in the weekly tomorrow.

nuclearcat commented 1 year ago

I like the first one, i think it will cover all needs. I will try to simulate existing filters and convert to this event matching system. If you are planning to implement this week, this is amazing news, thanks!

r-c-n commented 1 year ago

One comment about this definition:

  - job: tast-codecs-test
    event:
      channel: node
      name: kbuild
      result: pass
      revision.tree: mainline
      data.arch: arm64
      data.defconfig: '(?!.*kselftest).*'

I understand the processed event pattern will be similar to:

event :=  (channel = "node") AND (name = "kbuild") AND (result = "pass") AND (revision.tree = "mainline") AND (data.arch = "arm64") AND (data.defconfig ~= <whatever_regex>)

That is, each sub-attribute of "event" can match a regex and then the resulting event will be the one that matches all of them.

This doesn't quite match the expressiveness of a regular expression, are we sure we aren't setting premature limits and restrictions in what we can and cannot define? For instance, will it ever make sense to define a job whose "event" could be (channel = "node") AND (((data.arch = "arm64") AND (data.defconfig ~= <regex_1>)) OR (data.defconfig ~= <regex_2>)) ? I mean arbitrarily complex patterns that aren't just a concatenation of AND conditionals.

gctucker commented 1 year ago

I think we can define a simple YAML syntax with just regex matching support as that should cover most cases and be useful, then we can add other features on top of this in a backwards-compatible way so we're not limiting ourselves.

The idea is also to not have the need for complex patterns as that doesn't actually work from a community point of view, kernel developers who just want to enable their tests can't spend lots of time working out the magic incantation to make KernelCI do the right thing. So the main goal is to have a design that covers the vast majority of cases with just "key: value" simple criteria. For very special corner-cases, hopefully a couple of cryptic regex patterns can cover that if we need to - and if that's not good enough then another syntax with logic operators might work. But that should be the exception rather than the rule.

a-wai commented 8 months ago

I took a stab at this in #2413 , the idea being that we should have:

It still misses regex support but it might not be too useful, simple "glob pattern" matching could be enough. Either way, this can be added as a follow-up.