elastic / package-spec

EPR package specifications
Other
17 stars 70 forks source link

[Fleet] Add rename message processor validation #690

Closed jillguyonnet closed 7 months ago

jillguyonnet commented 8 months ago

What does this PR do?

This PR adds the following validation rule for ingest pipeline processors.

If an ingest pipeline contains a rename processor with field: message and target_field: event.original, then:

Examples

Valid pipeline definition:

processors:
    - rename:
        field: message
        target_field: event.original
        if: 'ctx.event?.original == null'
    - remove:
        field: message

Invalid pipeline definition:

processors:
    - rename:
        field: message
        target_field: event.original
        # missing `if: 'ctx.event?.original == null'`
    - remove:
        field: message

Invalid pipeline definition:

processors:
    - rename:
        field: message
        target_field: event.original
        if: 'ctx.event?.original == null'
    # missing remove processor

Skipping validation

The validation error has the JSE00001 error code that allows packages to skip it (cf. skip_pipeline_rename_validation test package).

Why is it important?

There is an existing issue for integrations using a rename processor when the target field already exists: https://github.com/elastic/integrations/issues/3451#issuecomment-1693521101. This validation aims to check that no field called event.original exists if there is a rename processor that renames message to event.original (what was done in https://github.com/elastic/integrations/pull/7026).

Additional changes

Checklist

Related issues

jillguyonnet commented 8 months ago

@jsoriano @mrodm This is a first pass at the validation rule proposed in https://github.com/elastic/package-spec/issues/583. Can you please comment on the following:

  1. Assuming this validation is enough for now:
    1. Are there any missing rules? (ignore_missing, if in remove)
    2. Can we test it against existing integrations?
    3. Would it be considered a breaking change?
    4. Should it follow a version constraint?
  2. Obviously the hard-coded field names are restrictive. I'm not sure, however, how https://github.com/elastic/package-spec/issues/583#issuecomment-1729188866 could be addressed properly: can we assume that a renamed field should always be removed? (Edit: perhaps it could be a warning)
  3. Are there other avenues we want to explore? Jaime had mentioned the following options:
    1. Static pipeline analysis
    2. Black box testing with elastic-package test pipelines
mrodm commented 8 months ago

Hi @jillguyonnet ! I'll try to answer some of these questions

  1. Are there any missing rules? (ignore_missing, if in remove)

I think it would also be needed to check that the remove processor has defined the ignore_missing and if condition (reverse condition as in rename). That would avoid that the pipelines fails for instance if message does not exist when removing it.

2. Can we test it against existing integrations?

It can be tested directly in this Pull Request, you could add the comment test integrations each time you want to test with all the packages in integrations repo, and it will be created a new PR in the integrations repository updating package-spec to the latest changeset of this PR / branch. Here there are some docs about this feature https://github.com/elastic/package-spec/blob/main/CONTRIBUTING.md#testing-with-integrations-repository

  • Would it be considered a breaking change?
  • Should it follow a version constraint?

From my POV, probably it would be a breaking change, since there could be some packages that would start failing because of this new validation (to be checked when tested with integrations repository).

I would add a version constraint to these changes, so this new validation just take effect from a specific spec version. If it is achieved using the JSON schema itself, it would be needed a JSON patch to remove before the following version.

2. Obviously the hard-coded field names are restrictive. I'm not sure, however, how Add new validation rule to avoid rename processor without checking event.original #583 (comment) could be addressed properly: can we assume that a renamed field should always be removed? (Edit: perhaps it could be a warning)

If it is needed to consider more fields apart from message, would it be better to move this check to be a semantic validation rule? Would it be easier to implement this check ? There is also the possibility to just run this validation from a specific version adding the since parameter. https://github.com/elastic/package-spec/blob/5943cc23c2e168f534ed761a7779d53dde93104f/code/go/internal/validator/spec.go#L151-L172

WDYT @jsoriano ? Should it be needed to consider any renamed field in this validation rule ?

Are there other avenues we want to explore? Jaime had mentioned the following options:

  1. Static pipeline analysis
  2. Black box testing with elastic-package test pipelines

These are good questions, because I don't know if there could be some integration removing or renaming some fields using painless scripts (script processor). Or if there is more than condition in the if field. Currently, I don't have any other suggestion for this apart from the ones mentioned by Jaime.

jsoriano commented 8 months ago

Nice, I didn't consider this could be done in the schema directly. I think this is fine this way, unless we need to handle lots of corner cases and the schema starts getting convoluted.

I think that next step would indeed be to test with current integrations, to see if this new rule actually detects valid cases. We should also take a look and try to find other cases that we should catch. Once we have results on this we can decide how to follow.

As Mario mentioned, this rule should only apply on packages using the next version of the schema, we can apply a patch in the schema for this, but lets confirm first how this works with existing packages. I think it is ok to make it a "breaking change" because it fixes things, we have made this kind of changes in the past.

Regarding black box testing, we would need to investigate if we can programatically forge events that can be used to validate this rule in a general way. It'd be nice to have these kind of checks, but not sure if possible. In this case it would imply to run the tests with events containing event.original.

jsoriano commented 8 months ago

test integrations

elasticmachine commented 8 months ago

Created or updated PR in integrations repository to test this version. Check https://github.com/elastic/integrations/pull/8887

jillguyonnet commented 8 months ago

test integrations

jillguyonnet commented 8 months ago

test integrations

elasticmachine commented 8 months ago

Created or updated PR in integrations repository to test this version. Check https://github.com/elastic/integrations/pull/8887

jillguyonnet commented 8 months ago

Hey @jsoriano - apologies, I pushed a commit after your comment yesterday (there was a needed fix).

There are many integrations that fail this validation as can be seen in https://github.com/elastic/integrations/pull/8887. The ones I've looked at seem expected: for instance, validation of 1password fails which makes sense because this pipeline doesn't meet the criteria (no matching remove processor). So there will be quite a few adjustments if we go ahead.

A couple of questions:

  1. I had added a version constraint (removed in the last commit for testing purposes), but the -next version yields some validation failures, which seems like a catch-22 as I'd need at least the good_v3 and bad_ingest_pipeline test packages to use this version to test the changes. How do I address this?
  2. I wanted to put more validation tests as there are many ways to break the validation, but I struggled to test with more than one data stream because of the test cases map format (which seems to take one package name and one file name, if I'm not mistaken). I added a bad_ingest_pipeline_v2 test package as a workaround, but would there be a way to simply add more data streams to bad_ingest_pipeline and test those?
jillguyonnet commented 8 months ago

FYI https://www.jsonschemavalidator.net/ is as always super useful to play with the schema definition, posting the test JSON here for reference.

Schema ```json { "type": "object", "properties": { "processors": { "type": "array", "items": { "type": "object", "additionalProperties": false, "properties": { "rename": { "type": "object" }, "remove": { "type": "object" } } }, "if": { "contains": { "required": [ "rename" ], "properties": { "rename": { "properties": { "field": { "const": "message" }, "target_field": { "const": "event.original" } } } } } }, "then": { "allOf": [ { "contains": { "required": [ "rename" ], "properties": { "rename": { "required": [ "if" ], "properties": { "if": { "const": "ctx.event?.original == null" } } } } } }, { "contains": { "required": [ "remove" ], "properties": { "remove": { "required": ["field", "ignore_missing", "if"], "properties": { "field": { "const": "message" }, "ignore_missing": { "const": true }, "if": { "const": "ctx.event?.original != null" } } } } } } ] } } } } ```
Test definition ```json { "processors": [ { "rename": { "field": "message", "target_field": "event.original", "if": "ctx.event?.original == null" } }, { "remove": { "field": "message", "ignore_missing": true, "if": "ctx.event?.original != null" } } ] } ```
jsoriano commented 8 months ago

There are many integrations that fail this validation as can be seen in elastic/integrations#8887.

There are indeed many of them. I wonder if any of them are doing the removal in a different way. With so many packages failing because of this we may need to think twice about the strategy to address this.

So there will be quite a few adjustments if we go ahead.

This can be fine, we will add the new validation in a new version of the spec so packages will only need to be updated when updating to this version. We can also add an error code to the error message, so developers can skip this check if they don't want to fix it at the moment.

1. I had added a version constraint (removed in the last commit for testing purposes), but the -next version yields some validation failures, which seems like a catch-22 as I'd need at least the good_v3 and bad_ingest_pipeline test packages to use this version to test the changes. How do I address this?

I would suggest to continue without the version constraint while we confirm the implementation. For testing yes, we can use the next version in good_v3 and bad_ingest_pipeline.

2. I wanted to put more validation tests as there are many ways to break the validation, but I struggled to test with more than one data stream because of the test cases map format (which seems to take one package name and one file name, if I'm not mistaken). I added a bad_ingest_pipeline_v2 test package as a workaround, but would there be a way to simply add more data streams to bad_ingest_pipeline and test those?

Yes, as this is now you would need to add one package per test case, this is similar to the issue that I plan to address in https://github.com/elastic/package-spec/issues/676.

For this case maybe you can maybe create a new test with multiple test cases, each case would reference a pipeline, and then before the execution you create a package by copying a base package to a tmp dir, and then copy the specific pipeline to this package.

jillguyonnet commented 7 months ago

Hey @jsoriano

There are indeed many of them. I wonder if any of them are doing the removal in a different way. With so many packages failing because of this we may need to think twice about the strategy to address this.

134 packages are failing this validation. By far the most common scenario seems to be that the pipeline has

  - rename:
      field: message
      target_field: event.original
      ignore_missing: true
      if: ctx.event?.original == null

but no processor to remove the message field. Instead, some have this:

  - remove:
      field: event.original
      if: "ctx?.tags == null || !(ctx.tags.contains('preserve_original_event'))"
      ignore_failure: true
      ignore_missing: true

I think adding an error code in order to make this validation skippable would be a very good approach. I was playing with it based on this example, but I'm not sure whether it could catch unwanted validation since it relies on regex. For example, the following 4 error messages could happen:

"field processors.1.rename: if is required"
"field processors.0: remove is required"
"field processors.2.remove.field: processors.2.remove.field does not match: \"message\""
"field processors.2.remove.if: processors.2.remove.if does not match: \"ctx.event?.original != null\""

WDYT?

Yes, as this is now you would need to add one package per test case, this is similar to the issue that I plan to address in https://github.com/elastic/package-spec/issues/676.

For this case maybe you can maybe create a new test with multiple test cases, each case would reference a pipeline, and then before the execution you create a package by copying a base package to a tmp dir, and then copy the specific pipeline to this package.

I've extracted the ingest pipeline validation tests in the latest commit. The only issue really was that the error message was wrong because in TestValidateFile there is only one file associated. The new test allows multiple files, so multiple pipelines per package. WDYT? Edit: CI fails but it passes locally because of path separator, will fix 👀

jillguyonnet commented 7 months ago

@jsoriano Added an error code in the last commit, let me know your thoughts 🙂 If we're ok with this, we only need to amend the version.

jillguyonnet commented 7 months ago

test integrations

elasticmachine commented 7 months ago

Created or updated PR in integrations repository to test this version. Check https://github.com/elastic/integrations/pull/8887

jillguyonnet commented 7 months ago

test integrations

elasticmachine commented 7 months ago

Created or updated PR in integrations repository to test this version. Check https://github.com/elastic/integrations/pull/8887

jillguyonnet commented 7 months ago

test integrations

elasticmachine commented 7 months ago

Created or updated PR in integrations repository to test this version. Check https://github.com/elastic/integrations/pull/8887

jillguyonnet commented 7 months ago

@jsoriano Thank you for the context and details around version change. I've changed it to 3.0.5 as requested.

Regarding the discussion around error wrapping, should we open a followup issue? Depending on the decision, it might involve other changes in the validator.

jillguyonnet commented 7 months ago

Awesome, thanks a lot @jsoriano and @mrodm for your help and comments!

jillguyonnet commented 7 months ago

Thank you @mrodm, done! Looks like the new commit dismissed your review though (also @jsoriano).

elasticmachine commented 7 months ago

:green_heart: Build Succeeded

History

cc @jillguyonnet