Field from the document for the data_stream.namespace

b2ronn commented 2 years ago

If I specify "${kubernetes.namespace}" in the namespace field of the integration in Kibana, I get this error:

{"log.level":"error","@timestamp":"2022-03-15T14:43:26.432Z","log.origin":{"file.name":"fleet/fleet_gateway.go","file.line":181},"message":"failed to dispatch actions, error: could not convert the configuration from the policy: missing field accessing 'output_permissions'","ecs.version":"1.6.0"}

How do I specify a variable for the namespace?

PS: OpenShift version 4.9.10; Elasticsearch (ECK) Operator 2.1.0 provided by Elastic; elastic/kibana/fleet/agent 8.1

b2ronn commented 2 years ago

If I specify the namespace as $${kubernetes.namespace}:

inputs:
- apm-server:
  data_stream:
    namespace: $${kubernetes.namespace}
...
output_permissions:
  default:
    _elastic_agent_checks:
      cluster:
      - monitor
    apm-1:
      cluster:
      - cluster:monitor/main
      indices:
      - names:
        - logs-apm.app-${kubernetes.namespace}
        privileges:
        - auto_configure
        - create_doc
      - names:
        - metrics-apm.app.*-${kubernetes.namespace}
        privileges:
        - auto_configure
        - create_doc

But the data does not arrive that way.

ph commented 2 years ago

The problem you are experiencing is that Fleet and Fleet Server cannot generate the appropriate permissions for the targeted data stream: with ${kubernetes.namespace}, the number of data streams could explode with the number of namespaces you are running. The permission should target metrics-apm.app.*-*

Before we evaluate whether this is a feature we should support, I would like to know your current use case. How many namespaces are you using? Would processing and tagging work instead?

@ruflin I don't remember if we ever discussed the possibility of having namespace as a purely dynamic value.

b2ronn commented 2 years ago

I have 3 agents with an apm-server in one namespace, and applications from other namespaces write to it. To assign access rights and to know how much space each data stream takes up, I want each namespace's data written to its own data stream. Currently about 10 applications from different namespaces write to APM; sometimes applications share the same name, so they write to the same data stream.

ruflin commented 2 years ago

I think what @b2ronn is trying to do is a very good example of how to use namespaces. We should figure out how best to solve this on the permission side. A few ideas:

Have a checkbox on a policy to give permissions for all data streams (metrics-*-*, logs-*-*, etc.) instead of the locked-down ones.
Detect that a variable is set as the namespace and make the user aware that * is now used to grant the right permissions.
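
As a rough illustration of the second idea, here is a minimal Python sketch. It is not Fleet's actual code: the function name permission_pattern and the ${...} detection regex are assumptions made for this example. It builds the index pattern from <type>-<dataset>-<namespace> and widens the namespace part to * when the namespace contains a variable, so the caller can warn the user.

```python
import re

# Assumption for illustration: a namespace counts as "dynamic" if it contains ${...}.
VARIABLE = re.compile(r"\$\{[^}]+\}")


def permission_pattern(ds_type: str, dataset: str, namespace: str):
    """Return (index pattern, widened) for one data stream.

    widened is True when the namespace held a variable and was replaced
    by "*", which is the case where the user should be warned.
    """
    widened = bool(VARIABLE.search(namespace))
    ns = "*" if widened else namespace
    return f"{ds_type}-{dataset}-{ns}", widened


if __name__ == "__main__":
    for ns in ("default", "${kubernetes.namespace}"):
        pattern, widened = permission_pattern("logs", "apm.app", ns)
        print(pattern, "(widened, warn the user)" if widened else "")
```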

As the permissions are set by Fleet, the solution most likely lies with the Fleet team. Pulling in @joshdover / @jen-huang.

b2ronn commented 2 years ago

In the meantime, for Beats (and APM Server) I do this through Logstash:

filter {
    mutate {
        replace  => {
        "[data_stream][type]" => "%{[data_stream.type]}"
        "[data_stream][dataset]" => "%{[data_stream.dataset]}"
        }
    }
    if [kubernetes][namespace] and [data_stream.namespace] == "default" {
      mutate {
        replace  => {
            "[data_stream][namespace]" => "%{[kubernetes][namespace]}"
        }
      }
    } else if [data_stream.namespace] != "default" {
      mutate {
        replace  => {
          "[data_stream][namespace]" => "%{[data_stream.namespace]}"
        }
      }
    } else {
      mutate {
        replace  => {
          "[data_stream][namespace]" => "default"
        }
      }
    }

    mutate {
      remove_field => [ "data_stream.type", "data_stream.dataset", "data_stream.namespace" ]
    }
}
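
For readers who do not use Logstash, the namespace decision in the filter above can be sketched in Python roughly as follows. This is only an illustration: the function name pick_namespace is hypothetical, and treating a missing "data_stream.namespace" field as "default" is an assumption of the sketch, not something the filter itself does.

```python
def pick_namespace(event: dict) -> str:
    """Choose the data stream namespace for one event.

    "data_stream.namespace" is the flat, dotted field on the incoming event.
    Assumption of this sketch: a missing value is treated as "default".
    """
    current = event.get("data_stream.namespace", "default")
    k8s_ns = (event.get("kubernetes") or {}).get("namespace")
    # Only override the namespace when it was left at "default".
    if k8s_ns and current == "default":
        return k8s_ns
    # Otherwise keep an explicitly set namespace (or "default").
    return current


if __name__ == "__main__":
    print(pick_namespace({"kubernetes": {"namespace": "payments"},
                          "data_stream.namespace": "default"}))
    print(pick_namespace({"data_stream.namespace": "staging"}))
```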

joshdover commented 2 years ago

Right, we don't yet support variables in the namespace field. We'd need to adjust the permissions requested for the Agents' API keys so they can write to logs-*-* etc., and offer some UX warning the user about this.

I think we should tackle this as part of the input variables and conditions effort. @mostlyjason do we have a better tracking issue for this effort? All I could find is https://github.com/elastic/integrations/issues/1867

ruflin commented 2 years ago

I expect most input variables will not have an effect on the data stream names and, with them, the permissions. Because of this, I would treat this as a special case to be covered.

mostlyjason commented 2 years ago

@joshdover Yes, that is the only tracking issue for conditions support in Fleet. @akshay-saraswat and I are preparing a Google Doc proposal, but it's still WIP. I'll mention this as a use case in that proposal.

uvNikita commented 5 months ago

Is there any update/progress on this issue? We can't really migrate to Elastic Agents until we have a way to dynamically specify the namespace field.

All our access controls for indices are based on the Agent's proposed naming scheme (<type>-<dataset>-<namespace>), where users are given access to indices matching the *-*-<kubernetes_namespace> pattern.
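
To make the access scheme concrete, here is a small Python sketch of the pattern check. It is a local illustration only, not how Elasticsearch evaluates index privileges; the function name can_access and the example index names are made up.

```python
from fnmatch import fnmatchcase


def can_access(index: str, kubernetes_namespace: str) -> bool:
    """True if the index name matches *-*-<kubernetes_namespace>."""
    return fnmatchcase(index, f"*-*-{kubernetes_namespace}")


if __name__ == "__main__":
    # Hypothetical index names following <type>-<dataset>-<namespace>.
    for index in ("logs-apm.app-payments", "logs-apm.app-default"):
        print(index, can_access(index, "payments"))
```

With a static namespace such as default, every team's data lands in indices that none of these per-namespace patterns match, which is why the namespace has to be set per document.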

felixbarny commented 5 months ago

Did you have a look at the Elasticsearch reroute processor? You can use it to route documents to a namespace, based on other properties of the document.

Example:

{
  "reroute": {
    "tag": "k8s_namespace",
    "if" : "ctx.kubernetes?.namespace != null",
    "namespace": "{{kubernetes.namespace}}"
  }
}

You can add this processor to the ...@custom ingest pipeline of the k8s integration.
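
To show what the example processor is meant to do, here is a simplified local simulation in Python. It is a sketch under stated assumptions, not Elasticsearch's implementation: the function name reroute_target is hypothetical, only the condition and namespace template from the example are modeled, and no validation or sanitizing of the resulting name is attempted.

```python
def reroute_target(doc: dict) -> str:
    """Return the data stream name a document would be routed to.

    Models only the example processor: when kubernetes.namespace is present,
    it replaces the namespace part of <type>-<dataset>-<namespace>.
    """
    ds = doc["data_stream"]
    namespace = ds["namespace"]
    k8s_ns = (doc.get("kubernetes") or {}).get("namespace")
    if k8s_ns is not None:      # mirrors: ctx.kubernetes?.namespace != null
        namespace = k8s_ns      # mirrors: "namespace": "{{kubernetes.namespace}}"
    return f'{ds["type"]}-{ds["dataset"]}-{namespace}'


if __name__ == "__main__":
    # Hypothetical document; the dataset name is made up for the example.
    doc = {
        "data_stream": {"type": "logs", "dataset": "myapp", "namespace": "default"},
        "kubernetes": {"namespace": "payments"},
    }
    print(reroute_target(doc))
```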

uvNikita commented 5 months ago

@felixbarny I'll check the reroute processor out, thanks for the suggestion!

b2ronn commented 5 months ago

How do I set up a reroute for the RUM (elastic-js) agents? The problem is that they cannot specify kubernetes.namespace; also, as of some version, global labels were disabled for RUM agents.

elastic / fleet-server

Field from the document for the data_stream.namespace #1227