elastic / apm

Elastic Application Performance Monitoring - resources and general issue tracking for Elastic APM.
https://www.elastic.co/apm
Apache License 2.0

Change 'metadata.cloud.project.*' for GCP, clarify other fields #822

Closed trentm closed 1 year ago

trentm commented 1 year ago

This PR changes the spec for two cloud.project.* metadata fields for GCP to be consistent with how Beats collect the same metadata. It was motivated by a user who was unable to correlate documents from APM agents and Beats on cloud.project.id.

It also clarifies a couple of things about GCP metadata collection. Finally, there is an open question about adding a new field.

details

A Google Cloud project has three identifiers. The CLI lists them as:

% gcloud projects list
PROJECT_ID    NAME          PROJECT_NUMBER
my-project    My Project    123456789012

and the GCP docs describe them here: https://cloud.google.com/resource-manager/docs/creating-managing-projects#before_you_begin

proposed changes

  1. Change cloud.project.id to use GCE metadata project.projectId, instead of project.numericProjectId (aka "project number"). This will match what Beats do. The projectId field is also a more visible identifier for the user -- it is more prominently shown in the Google Cloud console.
  2. Stop collecting cloud.project.name. What Google describes as the "Project name" is not available in the GCE metadata response. The APM agents should not be attempting to make GCP API requests to get the project name.
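  3. Note: The instance.id from the GCE metadata is a big int that can be too large for a double-precision float to represent exactly. A JSON parser that reads numbers as doubles will silently round it -- JavaScript's JSON.parse gets it wrong, Python's json module is fine. Please check your agent's parser.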
    > JSON.parse('{"id": 7774572792595385001}')
    { id: 7774572792595385000 } // oops
  4. Note: The previous description for how to get the cloud.region was wrong. (The Python implementation linked to earlier in the metadata.md doc is correct.)
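
To illustrate items 1-3, here is a minimal Python sketch of the mapping, not the implementation of any agent. The sample response is made up and trimmed to the fields named above. The helper names build_cloud_metadata and loses_precision_as_double are hypothetical, and converting instance.id to a string is an assumption about what the intake side expects.

```python
import json

# Hypothetical, trimmed-down GCE metadata response. The field names come
# from the PR text above; the values are invented for illustration.
SAMPLE_RESPONSE = """
{
  "instance": {"id": 7774572792595385001},
  "project": {
    "numericProjectId": 123456789012,
    "projectId": "my-project"
  }
}
"""


def loses_precision_as_double(n: int) -> bool:
    """Return True if n cannot round-trip through a 64-bit float.

    This mimics what happens in a JSON parser that stores every number
    as a double, as JavaScript's JSON.parse does.
    """
    return int(float(n)) != n


def build_cloud_metadata(raw_json: str) -> dict:
    """Map a GCE metadata response to the proposed cloud.* fields (sketch)."""
    # Python's json module parses integers as arbitrary-precision ints,
    # so the large instance id survives intact.
    data = json.loads(raw_json)
    return {
        # Assumption: the id is sent as a string to avoid any later rounding.
        "instance": {"id": str(data["instance"]["id"])},
        # Item 1: use projectId, not numericProjectId.
        # Item 2: no "name" key, since the project name is not in the response.
        "project": {"id": data["project"]["projectId"]},
    }


if __name__ == "__main__":
    print(build_cloud_metadata(SAMPLE_RESPONSE))
    print(loses_precision_as_double(7774572792595385001))
```

In a language whose JSON parser reads numbers as doubles, the same check would need a parser option or a string-based workaround to keep the id exact.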

open question

(This is for discussion. I have not proposed changing this in this PR.)

Should we also set cloud.account.id to the <GCE metadata>.project.projectId?

Beats do.

ECS says:

The cloud account or organization id used to identify different entities in a multi-tenant environment.

Examples: AWS account id, Google Cloud ORG Id, or other unique identifier.

example: 666777888999

The "Examples: Google Cloud ORG Id" seems to indicate using projectId would be inappropriate. However, I think GCP uses the project (not some concept of the account) to "identify" resources. Some examples from the GCE metadata (the globally unique identifier there is the numericProjectId):

  machineType: 'projects/123456789012/machineTypes/e2-micro',
  network: 'projects/123456789012/networks/default',
  email: '123456789012-compute@developer.gserviceaccount.com',

For comparison, the OTel cloud semconv do not have cloud.project.* (perhaps because they are concepts particular to GCP and Azure). They do have cloud.account.id, and at least the OTel JS SDK sets that to the GCE metadata project-id value.

I am curious if/where the ECS + OTel semconv merge lands on this.

checklist

/schedule 2023-09-05

trentm commented 1 year ago

See also this Java APM agent issue: https://github.com/elastic/apm-agent-java/issues/3250