elastic / kibana

Your window into the Elastic Stack
https://www.elastic.co/products/kibana
Other
19.67k stars 8.1k forks source link

[Fleet] Package policy variable values may not start with * or & #184972

Open efd6 opened 2 months ago

efd6 commented 2 months ago

It is not currently possible to send a integration var to kibana that starts with a literal * (or &) character. This prevents some configurations from being possible.

For example sending the following configuration

{
    "name": "snyk-audit_logs",
    "description": "",
    "namespace": "ep",
    "policy_id": "1f986eae-3dd4-48cf-b3eb-9df439f64e0f",
    "enabled": true,
    "output_id": "",
    "inputs": [
        {
            "policy_template": "snyk",
            "type": "cel",
            "enabled": true,
            "streams": [
                {
                    "id": "cel-snyk.audit_logs",
                    "enabled": true,
                    "data_stream": {
                        "type": "logs",
                        "dataset": "snyk.audit_logs"
                    },
                    "vars": {
                        "audit_id": {
                            "value": "*", ← HERE
                            "type": "text"
                        },
[...]
                    }
                }
            ],
            "vars": {
[...]
            }
        }
    ],
    "package": {
        "name": "snyk",
        "title": "Snyk",
        "version": "1.23.0"
    }
}

results in the following error response

{
    "statusCode": 400,
    "error": "Bad Request",
    "message": "Package policy is invalid: inputs.cel.streams.snyk.audit_logs.vars.audit_id: Strings starting with special YAML characters like * or & need to be enclosed in double quotes."
}

(stack version v8.13.0, using elastic-package)

Wrapping the character in quotes does not have the desired effect since the quotes then remain in the string.

This is a bloop meets floop problem.

/cc @kpollich

elasticmachine commented 2 months ago

Pinging @elastic/fleet (Team:Fleet)

criamico commented 1 month ago

I did some investigation on this bug, I'll add my findings here.

Context

The logic that prevents adding a variable starting with a * or & was put in place a long time ago to avoid incurring in errors when the integration policy is compiled to yaml to generate the elastic agent policy. This PR is where that validation was added and it explains pretty well why this was done: the yaml specification doesn't allow some special characters, unless they're wrapped in double quotes.

On that PR, the form validation logic was added here and actual template compile was added with this function , however later the compile logic was changed again but the form validation message remained as it was, resulting unclear for the user.

Repro steps

Having learned that, I did some tests to understand what should be the correct way to add variables with special characters. I used the variable from the ticket description but it applies to any variable.

Form error Screenshot 2024-07-04 at 12 15 58

I then added the variable with the double quotes in the format "*foo", as suggested by the validation message

Adding the value from the integration policy editor:

Screenshot 2024-07-04 at 11 46 35

Corresponding api call in kibana:

Screenshot 2024-07-04 at 12 05 32

Get the generated package policy from dev tools:

Screenshot 2024-07-04 at 11 48 19

Resulting full agent policy:

Screenshot 2024-07-04 at 12 02 48

The special character appears to be escaped in slightly different way, however the really important one is the final value in the generated full agent policy yaml (it has to comply with the yaml specification). That one is escaped with single quotes so it looks correct.

Conclusion

Since the underlying logic works fine, my proposal for this is

@nchaulet as you've worked previously in this area, could you help me confirm that what I wrote above is correct?

criamico commented 1 month ago

Had a discussion with @nchaulet and he pointed out that it should be responsibility of the integration developer to escape the variables, which is already done in many integrations (some examples are in this comment).

I'll close the PR and open instead a ticket to snyk devs to fix the var on their side.

criamico commented 1 month ago

Opened a ticket with snyk integration devs: https://github.com/elastic/integrations/issues/10395

criamico commented 1 month ago

we also discussed with @nchaulet that if the integrations correctly escape vars there should be no reason to disallow &* or in front of the variable. So the validation in xpack.fleet.packagePolicyValidation.quoteStringErrorMessage could be removed.

However, since there are still many integrations that don't use consistently the string_escape utility, I don't think we can easily get rid of that validation.

jlind23 commented 1 month ago

@criamico does it mean we should close this issue as won't fix or keep it open until all integrations consistently use the string_escape utility?

efd6 commented 1 month ago

The suggested fix was tried before I filed this. It did not work.

criamico commented 1 month ago

The suggested fix was tried before I filed this. It did not work.

@efd6 What type of issue did you observe?

criamico commented 1 month ago

does it mean we should close this issue as won't fix or keep it open until all integrations consistently use the string_escape utility?

@jlind23 For now I'd keep it open for another couple of days to investigate a little bit more. However, even if the string_escape adoption should increase, I don't think there would be much sense in keeping this issue open.

efd6 commented 1 month ago

I don't remember the details, but the issue stems from the fact that the bytes are sent through yaml and json, so the semantics of yaml cannot be preserved. I will repeat the analysis next week.

efd6 commented 1 month ago

I don't really understand why this should close. It is a limitation that is preventing expected behaviour and has no work-around.

criamico commented 1 month ago

Per offline discussion with @efd6, he found the error when working at https://github.com/elastic/integrations/pull/10073 and using * instead of ALL in this config:

input: cel
 service: snyk
 vars:
   url: http://{{Hostname}}:{{Port}}/
   api_token: xxxxxx
   ssl:
     verification_mode: none
   enable_request_tracer: true
 data_stream:
   vars:
     preserve_original_event: true
     audit_type: /rest/orgs/
     audit_id: "*"
 assert:
   hit_count: 11

I asked to provide more details about the type of issue he found and at what level it happened (when deploying the yaml to the elastic agent or before, etc). I also asked to report the exact issue found when adding the variable with double quotes "*" as suggested in kibana at the moment.

I think we need to investigate this issue a little bit more as at the moment is a bit unclear to me what's the exact problem with the current logic; the validation logic can be updated to escape the strings (similar to what was done originally on this PR) but we need to be careful not to break it for other integrations.

efd6 commented 1 month ago

This is the summary of the debugging. I can provide the complete details offline.

User Change Template Change Needs CEL Change Works Error Notes
None '{{escape_string audit_id}}' No No "Strings starting with special YAML characters like * or & need to be enclosed in double quotes."
None "{{escape_string audit_id}}" No No "Strings starting with special YAML characters like * or & need to be enclosed in double quotes."
Prefix all names with "_" '{{escape_string audit_id}}' Yes, remove "_" prefix. No "can not read a block mapping entry; a multiline key may not be an implicit key at line 13, column 12:\n api_token: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX ... \n ^" Due to the helper making a ''string'' token.
Prefix all names with "p" '{{escape_string audit_id}}' Yes, remove "p" prefix. No "can not read a block mapping entry; a multiline key may not be an implicit key at line 13, column 12:\n api_token: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX ... \n ^" As for previous.
Prefix all names with "_" "{{escape_string audit_id}}" Yes, remove "_" prefix. No 404 due to audit_id value being literally single quoted, "'_*'". Similar to above, but the double quote is stripped during deser.
Prefix all names with "_" "{{escape_string audit_id}}" Yes, remove "'_" prefix and "'" suffix. Yes Unreasonable to expect users to do this and brittle in CEL.
Prefix all names with "_" "{{audit_id}}" Yes, remove "_" prefix. Yes Unreasonable to expect users to do this and brittle with non-safe identifiers.

As far as I can see, with the implementation as it exists, there is no way to get this to work without users support. I do not believe that it is reasonable to expect that our users should have to do this to work around a design issue in internal communication between components.

nchaulet commented 1 month ago

I personally think we should fix the first scenario, and remove the validation (Strings starting with special YAML characters like * or & need to be enclosed in double quotes.) in Kibana. I think that validation was introduced before we introduced the escape_string helper and it's not relevant anymore. It's also the most user friendly solution

juliaElastic commented 1 month ago

If we removed the validation, would the special characters be passed to the agent policy correctly? Can we find out somehow if other packages need to use the escape_string helper too?

nchaulet commented 1 month ago

If we removed the validation, would the special characters be passed to the agent policy correctly?

Did a quick test and it seems special characters will be passed correctly if the integration use escape_string, (only tested the kibana part)

Screenshot 2024-07-09 at 8 30 37 AM

overall I think all the package that need variables starting with *, & will need to use escape_string

criamico commented 1 month ago

Did a quick test and it seems special characters will be passed correctly if the integration use escape_string, (only tested the kibana part)

I verified the same, but I'm worried of what happens right away when we remove it. What about the variables that don't use the escape_string, won't they break?

juliaElastic commented 1 month ago

I think we don't have values currently where special characters are not surrounded by quotes, because the validation prevented them from being added. It can happen though that after removing the validation, users add special characters which are not going to be escaped. It's going to be difficult to figure out which integration variables need the escape_string helper. Could Fleet apply it on every variable?

kpollich commented 1 month ago

I'm moving this out of the current sprint because the path forward here isn't quite clear. I think removing the validation is potentially risky if other integrations rely on how it works. Escaping all strings by default is worth exploring, but I worry that other integrations might rely on unescaped strings elsewhere.

nchaulet commented 1 month ago

think we don't have values currently where special characters are not surrounded by quotes, because the validation prevented them from being added. It can happen though that after removing the validation, users add special characters which are not going to be escaped. It's going to be difficult to figure out which integration variables need the escape_string helper. Could Fleet apply it on every variable?

Not really, Fleet does not know how the variable is used if it used as a yaml value or in a string concatenation, it's why I think it should be the template owner responsability to escape or not.

I think removing the validation is not that bad, it will fail when saving the package policy if the variable is not correctly escaped (when we compile the template), it will not be the best user experience, maybe we can improve the error message, but user will know

criamico commented 1 month ago

it will fail when saving the package policy if the variable is not correctly escaped (when we compile the template)

I did a quick test removing validation as suggested by @nchaulet for a variable that doesn't use escape_string, it fails with this error: Screenshot 2024-07-10 at 16 12 59

And here's the stacktrace:

                  ^
[2024-07-10T16:12:23.377+02:00][ERROR][plugins.fleet] YAMLException: unidentified alias "foo" at line 11, column 16:
      user_id: *foo
                   ^
    at generateError (/Users/cristina/Dev/kibana/node_modules/js-yaml/lib/js-yaml/loader.js:167:10)
    at throwError (/Users/cristina/Dev/kibana/node_modules/js-yaml/lib/js-yaml/loader.js:173:9)
    at readAlias (/Users/cristina/Dev/kibana/node_modules/js-yaml/lib/js-yaml/loader.js:1276:5)
    at composeNode (/Users/cristina/Dev/kibana/node_modules/js-yaml/lib/js-yaml/loader.js:1368:20)
    at readBlockMapping (/Users/cristina/Dev/kibana/node_modules/js-yaml/lib/js-yaml/loader.js:1089:11)
    at composeNode (/Users/cristina/Dev/kibana/node_modules/js-yaml/lib/js-yaml/loader.js:1359:12)
    at readBlockMapping (/Users/cristina/Dev/kibana/node_modules/js-yaml/lib/js-yaml/loader.js:1089:11)
    at composeNode (/Users/cristina/Dev/kibana/node_modules/js-yaml/lib/js-yaml/loader.js:1359:12)
    at readDocument (/Users/cristina/Dev/kibana/node_modules/js-yaml/lib/js-yaml/loader.js:1525:3)
    at loadDocuments (/Users/cristina/Dev/kibana/node_modules/js-yaml/lib/js-yaml/loader.js:1588:5)
    at load (/Users/cristina/Dev/kibana/node_modules/js-yaml/lib/js-yaml/loader.js:1614:19)
    at safeLoad (/Users/cristina/Dev/kibana/node_modules/js-yaml/lib/js-yaml/loader.js:1637:10)
    at compileTemplate (agent.ts:31:44)
    at _compilePackageStream (package_policy.ts:2320:31)
    at processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async Promise.all (index 0)
    at _compilePackageStreams (package_policy.ts:2200:10)
    at package_policy.ts:2130:29
    at async Promise.all (index 0)
    at PackagePolicyClientWithAuthz.create (package_policy.ts:309:16)
    at createPackagePolicyHandler (handlers.ts:287:27)
    at core_versioned_route.ts:192:22
    at Router.handle (router.ts:268:30)
    at handler (router.ts:196:13)
    at exports.Manager.execute (/Users/cristina/Dev/kibana/node_modules/@hapi/hapi/lib/toolkit.js:60:28)
    at Object.internals.handler (/Users/cristina/Dev/kibana/node_modules/@hapi/hapi/lib/handler.js:46:20)
    at exports.execute (/Users/cristina/Dev/kibana/node_modules/@hapi/hapi/lib/handler.js:31:20)
    at Request._lifecycle (/Users/cristina/Dev/kibana/node_modules/@hapi/hapi/lib/request.js:371:32)
    at Request._execute (/Users/cristina/Dev/kibana/node_modules/@hapi/hapi/lib/request.js:281:9)

So it breaks at the safeLoad level, similar to what we've seen in other cases.

Maybe we could handle better the error, but it would still be a breaking change for many integrations, as there could be many legitimate cases where users are actively using the double quotes to escape their variable. So I think that this require some careful planning anyway.

juliaElastic commented 1 month ago

The error should only come if the value is not in quotes, isn't it the case? Instead of the validation error, we could add a warning to recommend users to still use quotes, but not prevent saving with special characters if they know what they are doing in cases like snyk. We could also make the safeLoad error more user friendly.