elastic / kibana

Your window into the Elastic Stack
https://www.elastic.co/products/kibana

[Fleet] Provide license info in telemetry #182150

Closed jillguyonnet closed 4 days ago

jillguyonnet commented 2 weeks ago

Summary

Relates https://github.com/elastic/ingest-dev/issues/2866

This PR adds license information to Fleet telemetry events stored in the fleet-usages* index. The license.issued_to field provides the customer name.
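As a rough sketch of the idea, the snippet below attaches the license holder to a usage payload. The `FleetUsage` and `LicenseLike` shapes and the `withLicenseInfo` helper are hypothetical stand-ins for illustration, not the actual Fleet code; only the `license_issued_to` field name comes from this PR.

```typescript
// Hypothetical, simplified shapes; the real Fleet payload carries many more fields.
interface FleetUsage {
  agents_enabled: boolean;
  agents: { total_enrolled: number; healthy: number };
}

// Assumed minimal view of the license document: only the field this PR reads.
interface LicenseLike {
  issued_to?: string;
}

type FleetUsageWithLicense = FleetUsage & { license_issued_to?: string };

// Returns a copy of the usage payload with the license holder attached.
// The field is left out entirely when no license information is available.
function withLicenseInfo(usage: FleetUsage, license?: LicenseLike): FleetUsageWithLicense {
  if (license === undefined || license.issued_to === undefined) {
    return { ...usage };
  }
  return { ...usage, license_issued_to: license.issued_to };
}

const baseUsage: FleetUsage = {
  agents_enabled: true,
  agents: { total_enrolled: 3, healthy: 3 },
};

// With a license, the payload gains license_issued_to; without one it is unchanged.
const enriched = withLicenseInfo(baseUsage, { issued_to: 'elasticsearch' });
const plain = withLicenseInfo(baseUsage);
```

Returning a copy rather than mutating the input keeps the helper safe to call from several senders that share the same usage object.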

Testing locally

In a local setup, only the new license_issued_to field is defined.

Fleet usage logger:

  1. Change the interval of the FleetUsageLogger to e.g. 1m.
  2. Wait for the log line to show up:
    Fleet Usage: {"agents_enabled":true,"agents":{"total_enrolled":3,"healthy":3,"unhealthy":0,"offline":0,"inactive":0,"unenrolled":0,"total_all_statuses":3,"updating":0},"fleet_server":{"total_enrolled":1,"healthy":1,"unhealthy":0,"offline":0,"updating":0,"total_all_statuses":1,"num_host_urls":1},"license_issued_to":"elasticsearch"}

Fleet usage sender:

  1. Change the interval of the FleetUsageSender to e.g. 1m.
  2. Wait for the following log line:
    [2024-05-03T15:24:11.843+02:00][DEBUG][plugins.fleet] Fleet usage telemetry: {"agents_enabled":true,"agents":{"total_enrolled":3,"healthy":3,"unhealthy":0,"offline":0,"inactive":0,"unenrolled":0,"total_all_statuses":3,"updating":0},"fleet_server":{"total_enrolled":1,"healthy":1,"unhealthy":0,"offline":0,"updating":0,"total_all_statuses":1,"num_host_urls":1},"packages":[{"name":"system","version":"1.55.2","enabled":true},{"name":"synthetics","version":"1.2.1","enabled":false},{"name":"fleet_server","version":"1.5.0","enabled":true},{"name":"elastic_agent","version":"1.18.0","enabled":false},{"name":"nginx","version":"1.20.0","enabled":false}],"agent_checkin_status":{"error":0,"degraded":0},"agents_per_policy":[2,1],"agents_per_os":[{"name":"Ubuntu","version":"20.04.6 LTS (Focal Fossa)","count":3}],"fleet_server_config":{"policies":[{"input_config":{}}]},"agent_policies":{"count":3,"output_types":["elasticsearch"]},"agent_logs_panics_last_hour":[],"agent_logs_top_errors":[],"fleet_server_logs_top_errors":[],"license_issued_to":"elasticsearch"}

Upgrade sender:

  1. Change the interval of the FleetUsageSender to e.g. 1m.
  2. (Re)install a package and wait for the telemetry log (DEBUG level), which should contain license information:
    [2024-05-03T15:20:41.087+02:00][DEBUG][plugins.fleet.telemetry_events] [{"packageName":"nginx","currentVersion":"1.20.0","newVersion":"1.20.0","status":"success","dryRun":false,"eventType":"package-install","installType":"reinstall","errorMessage":[],"license_issued_to":"elasticsearch"}]
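The manual checks above could also be scripted. The following is a minimal sketch, assuming the log lines keep the shape shown in the steps (a known prefix followed by a JSON object or array); `parseTelemetryLog` and `hasLicenseInfo` are hypothetical helper names, not part of Kibana.

```typescript
// Extracts the JSON payload that follows a known prefix in a log line.
// Returns undefined if the prefix is missing or the remainder is not valid JSON.
function parseTelemetryLog(line: string, prefix: string): unknown {
  const start = line.indexOf(prefix);
  if (start === -1) {
    return undefined;
  }
  try {
    return JSON.parse(line.slice(start + prefix.length).trim());
  } catch (e) {
    return undefined;
  }
}

// True when the payload (or every element of an array payload) is an object
// that carries a string license_issued_to field.
function hasLicenseInfo(payload: unknown): boolean {
  const items: unknown[] = Array.isArray(payload) ? payload : [payload];
  return (
    items.length > 0 &&
    items.every(
      (item) =>
        typeof item === 'object' &&
        item !== null &&
        typeof (item as Record<string, unknown>).license_issued_to === 'string'
    )
  );
}

// Shortened versions of the log lines from the steps above.
const usageLine = 'Fleet Usage: {"agents_enabled":true,"license_issued_to":"elasticsearch"}';
const upgradeLine =
  '[DEBUG][plugins.fleet.telemetry_events] [{"packageName":"nginx","license_issued_to":"elasticsearch"}]';

const usageOk = hasLicenseInfo(parseTelemetryLog(usageLine, 'Fleet Usage:'));
const upgradeOk = hasLicenseInfo(parseTelemetryLog(upgradeLine, '[plugins.fleet.telemetry_events]'));
```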

apmmachine commented 2 weeks ago

:robot: GitHub comments

Just comment with:

- `/oblt-deploy` : Deploy a Kibana instance using the Observability test environments.
- `run` `docs-build` : Re-trigger the docs validation. (use unformatted text in the comment!)

jillguyonnet commented 2 weeks ago

/ci

jillguyonnet commented 2 weeks ago

@elasticmachine merge upstream

jillguyonnet commented 2 weeks ago

/ci

jillguyonnet commented 2 weeks ago

@elasticmachine merge upstream

elasticmachine commented 2 weeks ago

Pinging @elastic/fleet (Team:Fleet)

juliaElastic commented 2 weeks ago

Is it possible to add “Organization ID” and “Deployment ID” too as requested here? https://github.com/elastic/ingest-dev/issues/2866#issuecomment-2014029639

jillguyonnet commented 2 weeks ago

> Is it possible to add “Organization ID” and “Deployment ID” too as requested here? https://github.com/elastic/ingest-dev/issues/2866#issuecomment-2014029639

Hey @juliaElastic I've been struggling to get those via the internal clients, know if I might be missing anything? I've provisionally added the cluster info to the usage sender.

Also, I've checked for 3 things as detailed in the description (fleet usage sender, fleet usage logger, upgrade sender). I'm puzzled why changing the interval in FleetUsageSender doesn't seem to affect the actual running of the task though. Do you know what I'm doing wrong?

juliaElastic commented 2 weeks ago

> I've been struggling to get those via the internal clients, know if I might be missing anything? I've provisionally added the cluster info to the usage sender.

I'll take a look.

> I'm puzzled why changing the interval in FleetUsageSender doesn't seem to affect the actual running of the task though. Do you know what I'm doing wrong?

I think you have to increase the task version to let kibana pick up the change: https://github.com/elastic/kibana/blob/82f6ff093bd6d0928a789aa0d45dec33faf54d06/x-pack/plugins/fleet/server/services/telemetry/fleet_usage_sender.ts#L27
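A toy model of why a version bump might be needed, assuming the task id is derived from the task type plus its version and that an already scheduled task is left untouched. `ToyTaskStore`, `TASK_TYPE`, and the version strings are illustrative only; this is not Kibana's Task Manager API.

```typescript
interface ScheduledTask {
  id: string;
  interval: string;
}

// Toy store that never overwrites a task that is already scheduled under the same id.
class ToyTaskStore {
  private tasks: Record<string, ScheduledTask> = {};

  // Schedules the task only if no task with this id exists yet;
  // otherwise returns the stored task unchanged.
  ensureScheduled(task: ScheduledTask): ScheduledTask {
    const existing = this.tasks[task.id];
    if (existing !== undefined) {
      return existing;
    }
    this.tasks[task.id] = task;
    return task;
  }
}

// Assumed naming scheme: the id combines a fixed type with a version string.
const TASK_TYPE = 'Fleet-Usage-Sender';
const taskId = (version: string): string => `${TASK_TYPE}:${version}`;

const store = new ToyTaskStore();
store.ensureScheduled({ id: taskId('1.0.0'), interval: '1h' });

// Same version: the id matches, so the stored task with the old interval wins.
const unchanged = store.ensureScheduled({ id: taskId('1.0.0'), interval: '1m' });

// Bumped version: the id is new, so the task is scheduled with the new interval.
const bumped = store.ensureScheduled({ id: taskId('1.0.1'), interval: '1m' });
```

Under these assumptions, editing only the interval has no visible effect, which matches the behavior described in the question.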

When I run your PR locally, I see cluster_info and license_info logged. You could test in cloud by adding the ci:cloud-deploy label to see what is there. Locally I don't see org id or deployment id.

[2024-05-03T13:38:15.602+02:00][DEBUG][plugins.fleet] Fleet usage telemetry: {"agents_enabled":true,"agents":{"total_enrolled":10,"healthy":2,"unhealthy":0,"offline":8,"inactive":0,"unenrolled":0,"total_all_statuses":10,"updating":0},"fleet_server":{"total_enrolled":7,"healthy":1,"unhealthy":0,"offline":6,"updating":0,"total_all_statuses":7,"num_host_urls":1},"packages":[{"name":"synthetics","version":"1.2.1","enabled":false},{"name":"system","version":"1.55.2","enabled":true},{"name":"fleet_server","version":"1.5.0","enabled":true},{"name":"apm","version":"8.13.0-SNAPSHOT","enabled":false},{"name":"elastic_agent","version":"1.19.0","enabled":false}],"agent_checkin_status":{"error":1,"degraded":0},"agents_per_policy":[7,3],"agents_per_os":[{"name":"macOS","version":"14.4.1","count":8},{"name":"Ubuntu","version":"20.04.6 LTS (Focal Fossa)","count":2}],"fleet_server_config":{"policies":[{"input_config":{}}]},"agent_policies":{"count":2,"output_types":["elasticsearch"]},"agent_logs_panics_last_hour":[],"agent_logs_top_errors":[],"fleet_server_logs_top_errors":[],"license_info":{"license":{"status":"active","uid":"15bf348c-01e9-40eb-9037-3e170b73d961","type":"trial","issue_date":"2024-05-02T08:39:17.624Z","issue_date_in_millis":1714639157624,"expiry_date":"2024-06-01T08:39:17.624Z","expiry_date_in_millis":1717231157624,"max_nodes":1000,"max_resource_units":null,"issued_to":"elasticsearch","issuer":"elasticsearch","start_date_in_millis":-1}},"cluster_info":{"name":"Julias-MacBook-Pro.local","cluster_name":"elasticsearch","cluster_uuid":"gn-gw9hdRp-PJzyECDL0Vg","version":{"number":"8.15.0-SNAPSHOT","build_flavor":"default","build_type":"tar","build_hash":"3e6df2630e40f0083b4ac68bbd932de2ce7e272f","build_date":"2024-04-25T13:05:24.400349590Z","build_snapshot":true,"lucene_version":"9.10.0","minimum_wire_compatibility_version":"7.17.0","minimum_index_compatibility_version":"7.0.0"},"tagline":"You Know, for Search"}}
[2024-05-03T13:38:15.603+02:00][ERROR][plugins.fleet] Error occurred while sending Fleet Usage telemetry: Error: Failed to validate payload coming from "Event Type 'fleet_usage'":
        - []: excess key 'license_info' found
        - []: excess key 'cluster_info' found

You also have to add the new fields to fleet_usages_schema.ts.
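The "excess key" errors above come from a payload carrying keys the event schema does not declare. Here is a minimal sketch of that kind of strict validation, using a hypothetical flat schema format and a made-up `validatePayload` helper rather than the real schema types used in fleet_usages_schema.ts.

```typescript
// Hypothetical flat schema: maps each allowed key to the expected typeof result.
type FlatSchema = Record<string, 'string' | 'number' | 'boolean' | 'object'>;

// Returns one message per problem; an empty array means the payload is valid.
function validatePayload(schema: FlatSchema, payload: Record<string, unknown>): string[] {
  const errors: string[] = [];
  for (const key of Object.keys(payload)) {
    if (!Object.prototype.hasOwnProperty.call(schema, key)) {
      // Keys that the schema does not declare are rejected outright.
      errors.push(`excess key '${key}' found`);
    } else if (typeof payload[key] !== schema[key]) {
      errors.push(`key '${key}' should be of type ${schema[key]}`);
    }
  }
  return errors;
}

const oldSchema: FlatSchema = { agents_enabled: 'boolean' };
// Declaring the new field is what makes the enriched payload pass validation.
const newSchema: FlatSchema = { ...oldSchema, license_issued_to: 'string' };

const payload = { agents_enabled: true, license_issued_to: 'elasticsearch' };

const errorsBefore = validatePayload(oldSchema, payload);
const errorsAfter = validatePayload(newSchema, payload);
```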

jillguyonnet commented 2 weeks ago

> I think you have to increase the task version to let kibana pick up the change:

Thank you, it works! Should we increase the version for this change?

juliaElastic commented 2 weeks ago

> > I think you have to increase the task version to let kibana pick up the change:
>
> Thank you, it works! Should we increase the version for this change?

Yes.

juliaElastic commented 2 weeks ago

I found a way to add deploymentId to telemetry by using the cloud plugin like here: https://github.com/elastic/kibana/blob/75c7f1190df93502944e683e9a65c66cd8bf9294/x-pack/plugins/fleet/server/services/preconfiguration/fleet_server_host.ts#L31-L40

I didn't find organization id anywhere in kibana code.

I think we probably don't need the cluster_info fields, and from license_info it might be enough to send issued_to, so that we only add fields that will actually be used. @nimarezainia can confirm.

example

```
"license_info": {
    "license": {
        "status": "active",
        "uid": "15bf348c-01e9-40eb-9037-3e170b73d961",
        "type": "trial",
        "issue_date": "2024-05-02T08:39:17.624Z",
        "issue_date_in_millis": 1714639157624,
        "expiry_date": "2024-06-01T08:39:17.624Z",
        "expiry_date_in_millis": 1717231157624,
        "max_nodes": 1000,
        "max_resource_units": null,
        "issued_to": "elasticsearch",
        "issuer": "elasticsearch",
        "start_date_in_millis": -1
    }
},
"cluster_info": {
    "name": "Julias-MacBook-Pro.local",
    "cluster_name": "elasticsearch",
    "cluster_uuid": "gn-gw9hdRp-PJzyECDL0Vg",
    "version": {
        "number": "8.15.0-SNAPSHOT",
        "build_flavor": "default",
        "build_type": "tar",
        "build_hash": "3e6df2630e40f0083b4ac68bbd932de2ce7e272f",
        "build_date": "2024-04-25T13:05:24.400349590Z",
        "build_snapshot": true,
        "lucene_version": "9.10.0",
        "minimum_wire_compatibility_version": "7.17.0",
        "minimum_index_compatibility_version": "7.0.0"
    },
    "tagline": "You Know, for Search"
}
```

There are a few more fields in this CloudSetup type that might be useful, e.g. isElasticStaffOwned and serverless.projectId.
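A minimal sketch of reading the deployment id from the cloud plugin's setup contract. `CloudSetupLike` is a hypothetical subset that lists only the fields named in this thread, and `getCloudTelemetryFields` is an illustrative helper, not the code in this PR.

```typescript
// Hypothetical subset of the cloud setup contract; every field is optional
// because none of them is expected to be present in a local, non-cloud setup.
interface CloudSetupLike {
  deploymentId?: string;
  isElasticStaffOwned?: boolean;
  serverless?: { projectId?: string };
}

interface CloudTelemetryFields {
  deployment_id?: string;
}

// Returns the cloud-derived telemetry fields, or an empty object when the
// cloud plugin is absent or does not report a deployment id.
function getCloudTelemetryFields(cloud?: CloudSetupLike): CloudTelemetryFields {
  if (cloud === undefined || !cloud.deploymentId) {
    return {};
  }
  return { deployment_id: cloud.deploymentId };
}

// On cloud, the deployment id is passed through; locally nothing is added.
const onCloud = getCloudTelemetryFields({ deploymentId: 'd91086e05fdb50ff76f6e2be522f539d' });
const local = getCloudTelemetryFields(undefined);
```

Omitting the key instead of sending an empty string keeps local payloads identical to what they were before the change.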

jillguyonnet commented 2 weeks ago

@elasticmachine merge upstream

nimarezainia commented 1 week ago

thank you @jillguyonnet and @juliaElastic for your efforts on this. The main goal of that issue is so that we are able to identify the end customer.

If deploymentID is available in the telemetry then, for ESS customers, the only way I know of to get to the cluster is to look up the deploymentID via admin.found.no. If we have the clusterID, the same mechanism is generally successful in mapping it. Both will yield a cluster, and from there we can navigate and find who owns it.

License issued to (which I think already exists in our telemetry) is unfortunately less reliable. It may resolve to "cloud" in cases where the license was issued via API (which seems to happen often).
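One way to express that preference in code, purely as an illustration: try the deployment id first and fall back to the license holder only when it is not a generic placeholder. The `pickCustomerLookup` helper and the list of placeholder values are assumptions made for this sketch, not an agreed-upon rule.

```typescript
interface IdentityFields {
  deployment_id?: string;
  license_issued_to?: string;
}

type CustomerLookup =
  | { via: 'deployment_id'; value: string }
  | { via: 'license_issued_to'; value: string }
  | { via: 'none' };

// Assumed placeholder values that do not identify a customer: "cloud" is the
// case described above, "elasticsearch" is what the local setups in this PR report.
const GENERIC_LICENSE_HOLDERS = ['cloud', 'elasticsearch'];

// Prefers the deployment id; uses the license holder only if it looks specific.
function pickCustomerLookup(fields: IdentityFields): CustomerLookup {
  if (fields.deployment_id) {
    return { via: 'deployment_id', value: fields.deployment_id };
  }
  const holder = fields.license_issued_to;
  if (holder && GENERIC_LICENSE_HOLDERS.indexOf(holder) === -1) {
    return { via: 'license_issued_to', value: holder };
  }
  return { via: 'none' };
}

const fromDeployment = pickCustomerLookup({ deployment_id: 'abc123', license_issued_to: 'cloud' });
const fromLicense = pickCustomerLookup({ license_issued_to: 'Example Corp' });
const unresolved = pickCustomerLookup({ license_issued_to: 'cloud' });
```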

I will follow up over email with some other information that may help.

jillguyonnet commented 1 week ago

@nimarezainia In the context of the recent discussions, would there be any value in adding any of the already available fields? e.g. deployment_id, license.issued_to or any of the ones from license_info and cluster_info as listed in https://github.com/elastic/kibana/pull/182150#issuecomment-2092914189. If not, I will close this PR to allow focus on the longer term solution.

nimarezainia commented 1 week ago

> @nimarezainia In the context of the recent discussions, would there be any value in adding any of the already available fields? e.g. deployment_id, license.issued_to or any of the ones from license_info and cluster_info as listed in #182150 (comment). If not, I will close this PR to allow focus on the longer term solution.

Yes, please add those and we can perhaps call this issue closed for now. I believe that the deploymentID in particular would at least give us the opportunity to correctly identify the user/customer and take it from there. Thank you again.

jillguyonnet commented 6 days ago

@elasticmachine merge upstream

jillguyonnet commented 6 days ago

Tested this on a cloud deployment:

Fleet Usage: {"agents_enabled":true,"agents":{"total_enrolled":2,"healthy":1,"unhealthy":0,"offline":1,"inactive":0,"unenrolled":0,"total_all_statuses":2,"updating":0},"fleet_server":{"total_enrolled":2,"healthy":1,"unhealthy":0,"offline":1,"updating":0,"total_all_statuses":2,"num_host_urls":1},"license_issued_to":"5abb01410b1a483e8bf3fa42bcd9e78c","deployment_id":"d91086e05fdb50ff76f6e2be522f539d"}

and from a package install:

[{"packageName":"elastic_agent","currentVersion":"not_installed","newVersion":"1.19.0","status":"success","dryRun":false,"eventType":"package-install","installType":"install","errorMessage":[],"license_issued_to":"5abb01410b1a483e8bf3fa42bcd9e78c","deployment_id":"d91086e05fdb50ff76f6e2be522f539d"}]

Does license.issued_to look right? @nimarezainia

nimarezainia commented 5 days ago

@jillguyonnet the deploymentID maps to the cluster kibana-pr-182150. Does this sound right to you? Unfortunately, license_issued_to is not yielding anything from the licensing portal or Salesforce.

@jlind23 I'm wondering if you see it differently.

jlind23 commented 5 days ago

@nimarezainia I found the same result. FWIW, it is an internal deployment and a non-paying organization under trial status, which might explain why the license portal does not yield any result.

jillguyonnet commented 5 days ago

@nimarezainia kibana-pr-182150 is correct 👍 @jlind23 made a good point about the license. Shall we keep both then?

jlind23 commented 5 days ago

> Shall we keep both then?

From my perspective, yes.

jillguyonnet commented 4 days ago

@elasticmachine merge upstream

kibana-ci commented 4 days ago

:green_heart: Build Succeeded

Metrics [docs]

✅ unchanged

History

To update your PR or re-run it, just comment with: @elasticmachine merge upstream

cc @jillguyonnet