elastic / cloud-on-k8s

Elastic Cloud on Kubernetes

TestFleetCustomLogsIntegrationRecipe is failing in 7.15.2 and 8.8.0 #5105

Open thbkrkr opened 2 years ago

thbkrkr commented 2 years ago
=== RUN   TestFleetCustomLogsIntegrationRecipe/ES_data_should_pass_validations
Retries (30m0s timeout): ..............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
    step.go:43: 
            Error Trace:    utils.go:87
            Error:          Received unexpected error:
                            elasticsearch client failed for https://elasticsearch-d66t-es-http.e2e-saklj-mercury.svc:9200/_data_stream/logs-elastic_agent-default: 404 Not Found: {Status:404 Error:{CausedBy:{Reason: Type:} Reason:no such index [logs-elastic_agent-default] Type:index_not_found_exception RootCause:[{Reason:no such index [logs-elastic_agent-default] Type:index_not_found_exception}]}}
            Test:           TestFleetCustomLogsIntegrationRecipe/ES_data_should_pass_validations
{"log.level":"error","@timestamp":"2021-12-01T17:58:42.511Z","message":"stopping early","service.version":"0.0.0-SNAPSHOT+00000000","service.type":"eck","ecs.version":"1.4.0","error":"test failure","error.stack_trace":"github.com/elastic/cloud-on-k8s/test/e2e/test/helper.RunFile\n\t/go/src/github.com/elastic/cloud-on-k8s/test/e2e/test/helper/yaml.go:162\ngithub.com/elastic/cloud-on-k8s/test/e2e/agent.runAgentRecipe\n\t/go/src/github.com/elastic/cloud-on-k8s/test/e2e/agent/recipes_test.go:226\ngithub.com/elastic/cloud-on-k8s/test/e2e/agent.TestFleetCustomLogsIntegrationRecipe\n\t/go/src/github.com/elastic/cloud-on-k8s/test/e2e/agent/recipes_test.go:160\ntesting.tRunner\n\t/usr/local/go/src/testing/testing.go:1259"}
    --- FAIL: TestFleetCustomLogsIntegrationRecipe/ES_data_should_pass_validations (1800.00s)

Elastic Agent logs:

Performing setup of Fleet in Kibana                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     

21 x Kibana Fleet setup failed: http POST request to https://kibana-97f6-kb-http.e2e-mercury.svc:5601/api/fleet/setup fails: Package policy is invalid: inputs.logfile.streams.log.log.vars.paths: Log file path is required: <nil>. Response: {"statusCode":400,"error":"Bad Request","message":"Package policy is invalid: inputs.logfile.streams.log.log.vars.paths: Log file path is required"}

Error: http POST request to https://kibana-97f6-kb-http.e2e-mercury.svc:5601/api/fleet/setup fails: Package policy is invalid: inputs.logfile.streams.log.log.vars.paths: Log file path is required: <nil>. Response: {"statusCode":400,"error":"Bad Request","message":"Package policy is invalid: inputs.logfile.streams.log.log.vars.paths: Log file path is required"}
For help, please see our troubleshooting guide at https://www.elastic.co/guide/en/fleet/7.15/fleet-troubleshooting.html 

I did not see anything indicating that this failure was expected. Relates to https://github.com/elastic/cloud-on-k8s/pull/4873#issuecomment-927792858. Relates to https://github.com/elastic/kibana/issues/113400.
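
For reference, both failing calls can be reproduced outside the e2e harness. A minimal sketch (Python; the URLs, credentials, and TLS handling are placeholders, not taken from the test):

import requests

ES_URL = "https://localhost:9200"      # placeholder: port-forwarded Elasticsearch
KIBANA_URL = "https://localhost:5601"  # placeholder: port-forwarded Kibana
AUTH = ("elastic", "<password>")       # placeholder credentials

# The check the e2e test retries for 30 minutes: a 404 with
# index_not_found_exception means the agent never shipped logs,
# so the data stream was never created.
r = requests.get(
    f"{ES_URL}/_data_stream/logs-elastic_agent-default",
    auth=AUTH,
    verify=False,  # test clusters use self-signed certificates
)
print(r.status_code, r.text)

# The call the agent retries 21 times above: on 7.15.2 it returns 400
# "Package policy is invalid: ... Log file path is required".
r = requests.post(
    f"{KIBANA_URL}/api/fleet/setup",
    headers={"kbn-xsrf": "true"},  # Kibana requires this header on API writes
    auth=AUTH,
    verify=False,
)
print(r.status_code, r.text)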

thbkrkr commented 2 years ago
xpack.fleet.packages:
- name: log
  version: latest
xpack.fleet.agentPolicies:
- name: Default Fleet Server on ECK policy
  is_default_fleet_server: true
  package_policies:
  - package:
      name: fleet_server
    name: fleet_server-1
- name: Default Elastic Agent on ECK policy
  is_default: true
  unenroll_timeout: 900
  package_policies:
  - package:
      name: log
    name: log-1
    inputs:
    - type: logfile
      enabled: true
      streams:
      - data_stream:
          dataset: log.log
        enabled: true
        vars:
        - name: paths
          value:
          - '/var/log/containers/*${kubernetes.container.id}.log'
        - name: custom
          value: |
            symlinks: true
            condition: ${kubernetes.namespace} == 'default'

This works in 7.15.1 and 7.16.0-SNAPSHOT, but not in 7.15.2.

kpollich commented 2 years ago

@thbkrkr - can you clarify the impact this has on ECK? Is this a blocker in any way? With 7.16 shipping next week and this issue fixed in that release, I'm wondering if it's possible to simply punt here for now.

Also, to close the loop a bit on our discussion over in https://github.com/elastic/kibana/issues/113400 (I feel it's better to move discussion to an open issue here): we do have an issue that causes preconfigured Custom Logs integration policies to fail to create in 7.15.2. We've been unable to find a root cause so far, but are continuing to investigate. We'd like to timebox these efforts if possible, with 7.16 so close on the horizon.

For a minimal test case, add the following to kibana.yml in 7.15.2, start up Kibana, and load the /app/fleet page:

xpack.fleet.packages:
  - name: log
    version: latest
xpack.fleet.agentPolicies:
  - name: Custom Logs Policy
    id: custom-logs-123
    namespace: default
    package_policies:
      - package:
          name: log
        name: log-1
        inputs:
          - type: logfile
            enabled: true
            streams:
              - data_stream:
                  dataset: log.log
                enabled: true
                vars:
                  - name: paths
                    value:
                      - /var/log/my.log

You should receive the following error:

[image: error screenshot] https://user-images.githubusercontent.com/6766512/144645576-67b05204-4936-4205-8a67-5f9d138fb95a.png

We're installing log-0.4.6 here, which can be referenced in EPR here: https://epr.elastic.co/package/log/0.4.6/. Adding custom logs via the Fleet UI is still possible, only preconfiguration fails.

Interestingly enough, I attempted to reproduce this in 7.15.2 with the nginx integration as well and did not encounter an error. nginx also includes a paths variable (see https://epr.elastic.co/package/nginx/1.2.0/), so I'm not sure why custom logs is broken while nginx is functional, but maybe this will be a helpful clue on our way to a root cause. Sample kibana.yml below, followed by a short sketch for comparing the two package manifests.

xpack.fleet.packages:
  - name: nginx
    version: latest
xpack.fleet.agentPolicies:
  - name: Preconfigured Nginx
    id: preconfigured-nginx-123
    namespace: default
    package_policies:
      - package:
          name: nginx
        name: nginx-1
        inputs:
          - type: logfile
            enabled: true
            streams:
              - data_stream:
                  dataset: nginx.access
                enabled: true 
                vars:
                  - name: paths
                    value:
                      - /var/log/nginx/access.log*
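
To chase that clue, a minimal sketch (Python; it assumes the EPR endpoints above serve the package manifests as JSON, and the manifest layout may vary across package-spec versions) that prints how each package declares its logfile variables:

import requests

for url in (
    "https://epr.elastic.co/package/log/0.4.6/",
    "https://epr.elastic.co/package/nginx/1.2.0/",
):
    manifest = requests.get(url).json()
    print(manifest.get("name"), manifest.get("version"))
    # Walk the policy templates defensively and print each logfile input's
    # variable definitions, to compare how `paths` is declared in each package.
    for template in manifest.get("policy_templates", []):
        for item in template.get("inputs") or []:
            if item.get("type") == "logfile":
                for var in item.get("vars") or []:
                    print("  ", var.get("name"), "required:", var.get("required"))
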
thbkrkr commented 2 years ago

The impact on ECK is minor. This is not a blocker.


thbkrkr commented 2 years ago

Closing as we decided to ignore TestFleetCustomLogsIntegrationRecipe in 7.15.2 because the underlying issue will not be resolved. We have pinned the test to 7.15.1 (#5108) and will skip ahead to 7.16.0 (#5141).

barkbay commented 1 year ago

I'm reopening this issue as we have the same error with 8.8.0:

[2023-05-30T11:38:37.209+00:00][WARN ][plugins.fleet] PackagePolicyValidationError:
Package policy is invalid: inputs.logfile.streams.log.logs.vars.paths:
Log file path is required
inputs.logfile.streams.log.logs.vars.data_stream.dataset: Dataset name is required
    at validatePackagePolicyOrThrow (/usr/share/kibana/node_modules/@kbn/fleet-plugin/server/services/package_policy.js:1197:13)
    at preconfigurePackageInputs (/usr/share/kibana/node_modules/@kbn/fleet-plugin/server/services/package_policy.js:1584:3)
    at /usr/share/kibana/node_modules/@kbn/fleet-plugin/server/services/preconfiguration.js:282:212
    at addPackageToAgentPolicy (/usr/share/kibana/node_modules/@kbn/fleet-plugin/server/services/agent_policy.js:776:53)
    at addPreconfiguredPolicyPackages (/usr/share/kibana/node_modules/@kbn/fleet-plugin/server/services/preconfiguration.js:282:53)
    at runMicrotasks (<anonymous>)
    at processTicksAndRejections (node:internal/process/task_queues:96:5)
    at ensurePreconfiguredPackagesAndPolicies (/usr/share/kibana/node_modules/@kbn/fleet-plugin/server/services/preconfiguration.js:219:7)
    at createSetupSideEffects (/usr/share/kibana/node_modules/@kbn/fleet-plugin/server/services/setup.js:86:7)
    at awaitIfPending (/usr/share/kibana/node_modules/@kbn/fleet-plugin/server/services/setup_utils.js:35:20)
    at /usr/share/kibana/node_modules/@kbn/fleet-plugin/server/plugin.js:285:9

Note that we have this new message: inputs.logfile.streams.log.logs.vars.data_stream.dataset: Dataset name is required

I tested various combinations, including the one suggested here, without success:

      package_policies:
      - id: system-1
        name: system-1
        package:
          name: system
      - inputs:
        - enabled: true
          streams:
          - data_stream:
              dataset: log.log
            enabled: true
            vars:
            - name: paths
              value: /var/log/containers/*${kubernetes.container.id}.log
            - name: custom
              value: |
                symlinks: true
                condition: ${kubernetes.namespace} == 'default'
          type: logfile
        name: log-1
        package:
          name: log

@kpollich may I ask for help from you or a member of your team? Thanks 🙇

Note that this configuration works with 8.7.0; I'm a bit surprised to see a breaking change in the API between two minor versions.
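
For completeness, one untested variation suggested by the error path (vars.data_stream.dataset) would be to declare the dataset name as a stream variable next to paths. This is only a sketch guessed from the error message, not a confirmed fix:

package_policies:
- name: log-1
  package:
    name: log
  inputs:
  - type: logfile
    enabled: true
    streams:
    - data_stream:
        dataset: log.log
      enabled: true
      vars:
      - name: paths
        value:
        - '/var/log/containers/*${kubernetes.container.id}.log'
      # hypothetical: the error path suggests a data_stream.dataset variable
      - name: data_stream.dataset
        value: log.log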

aalagia90 commented 1 month ago

Hi, I have tried to install the recipe cloud-on-k8s/config/recipes/elastic-agent/fleet-custom-logs-integration.yaml (main branch of elastic/cloud-on-k8s), but the Custom Logs package is not present in the agent policy. When Kibana starts, I see this error:

[image: screenshot of the error]

The component versions I'm using are:

- Kibana, Elasticsearch, Fleet Server, Elastic Agent: 8.13.2
- ECK operator: 2.12
- log package: 2.3.1

The configuration works up to log 1.1.2; from 2.0.0 onwards it does not (see the version-pin sketch below).
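
If pinning to the last known-good package version is acceptable as a workaround, a minimal sketch for the kibana.yml packages entry (assuming 1.1.2 is still published in EPR):

xpack.fleet.packages:
- name: log
  version: 1.1.2  # last version reported to work with this preconfigured policy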

This is my YAML file:

apiVersion: kibana.k8s.elastic.co/v1
kind: Kibana
metadata:
  name: kibana
spec:
  version: 8.13.2
  count: 1
  elasticsearchRef:
    name: elasticsearch
  config:
    xpack.fleet.agents.elasticsearch.hosts: ["https://elasticsearch-es-http.default.svc:9200"]
    xpack.fleet.agents.fleet_server.hosts: ["https://fleet-server-agent-http.default.svc:8220"]
    xpack.fleet.packages:
    - name: system
      version: latest
    - name: elastic_agent
      version: latest
    - name: fleet_server
      version: latest
    - name: log
      version: latest  
    xpack.fleet.agentPolicies:
    - name: Fleet Server on ECK policy
      id: eck-fleet-server
      namespace: default
      monitoring_enabled:
      - logs
      - metrics
      unenroll_timeout: 900
      package_policies:
      - name: fleet_server-1
        id: fleet_server-1
        package:
          name: fleet_server
    - name: Elastic Agent on ECK policy
      id: eck-agent
      namespace: default
      monitoring_enabled:
      - logs
      - metrics
      unenroll_timeout: 900
      package_policies:
      - name: system-1
        id: system-1
        package:
          name: system
      - package:
          name: log
        name: log-1
        inputs:
        - type: logfile
          enabled: true
          streams:
          - data_stream:
              dataset: log.log
            enabled: true
            vars:
            - name: paths
              value:
              - '/var/log/containers/*${kubernetes.container.id}.log'
            - name: custom
              value: |
                symlinks: true
                condition: ${kubernetes.namespace} == 'default'
---

Thank you very much

kpollich commented 1 month ago

Hey @barkbay, sorry for not getting to this sooner. This came through while I was just starting parental leave last year, so unfortunately it never got to me. I took a quick look, though, since this got bumped.

In your example above, I see you're adding the dataset value to the system package policy, while the error is about the log package policy for custom logs. I wasn't able to come up with the exact right structure of the config here after a few attempts, though, so it's entirely possible there's a bug when configuring input packages via preconfiguration. If this is still an issue today, I can get this onto the team's board; I'll wait to hear back in case this has been resolved elsewhere.
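
In the meantime, a quick way to check whether the preconfigured policies were actually created is to list them through the Fleet API. A sketch (Python; the URL and credentials are placeholders, and it assumes the standard Fleet package policies listing endpoint):

import requests

resp = requests.get(
    "https://localhost:5601/api/fleet/package_policies",  # placeholder Kibana URL
    headers={"kbn-xsrf": "true"},
    auth=("elastic", "<password>"),  # placeholder credentials
    verify=False,
)
# Print each package policy with its package name; log-1 should appear here
# if preconfiguration succeeded.
for item in resp.json().get("items", []):
    print(item.get("name"), item.get("package", {}).get("name"))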