This PR changes the spec for two cloud.project.* metadata fields for GCP to be consistent with how Beats collect the same metadata. It was motivated by a user who is unable to correlate documents from APM agents and Beats on cloud.project.id.
It also clarifies a couple of things about GCP metadata collection. Finally, there is an open question about adding a new field.
details
A Google Cloud project has three identifiers. The CLI lists them as:
% gcloud projects list
PROJECT_ID NAME PROJECT_NUMBER
my-project My Project 123456789012
proposed changes
Change cloud.project.id to use GCE metadata project.projectId, instead of project.numericProjectId (aka "project number"). This will match what Beats do. The projectId field is also a more visible identifier for the user -- it is more prominently shown in the Google Cloud console.
Stop collecting cloud.project.name. What Google describes as the "Project name" is not available in the GCE metadata response. The APM agents should not be attempting to make GCP API requests to get the project name.
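Note: The instance.id from the GCE metadata is a big int that can exceed 2^53, the limit up to which a 64-bit float represents every integer exactly. Your JSON parser could be surprised by it -- the JavaScript one gets it wrong, the Python one is fine. Please check the parser your agent uses.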
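The precision problem can be reproduced in a few lines. This is only an illustrative sketch: the instance id value is made up, and parsing integers as `float` stands in for what a parser that stores every number as an IEEE 754 double would produce. The helper name `instance_id_as_string` is hypothetical.

```python
import json

# Hypothetical metadata fragment; the id value is invented for illustration.
raw = '{"instance": {"id": 1234567890123456789}}'

# Python's json module parses integers with arbitrary precision.
exact_id = json.loads(raw)["instance"]["id"]

# Simulate a parser that stores every number as a 64-bit float.
lossy_id = int(json.loads(raw, parse_int=float)["instance"]["id"])

# The float round trip cannot hold this value, so the two ids differ.
ids_match = exact_id == lossy_id


def instance_id_as_string(metadata: dict) -> str:
    """Return the instance id as a string so later layers cannot round it."""
    return str(metadata["instance"]["id"])
```

An agent whose JSON parser is affected would need another way to keep the digits intact, for example by reading the id as text before any numeric conversion.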
Note: The previous description for how to get the cloud.region was wrong. (The Python implementation linked to earlier in the metadata.md doc is correct.)
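As a rough sketch of the mapping these changes imply, the function below takes an already-parsed GCE metadata response and builds the cloud.* fields. The function name, the nested output shape, and the sample values are assumptions for illustration. The zone format ("projects/<number>/zones/<zone>") and the rule that the region is the zone name minus its final suffix reflect the commonly seen shape of the metadata response, not text from this spec. The Python implementation linked from metadata.md remains the reference.

```python
def cloud_metadata_from_gce(metadata: dict) -> dict:
    """Map a parsed GCE metadata response to cloud.* fields (sketch)."""
    instance = metadata["instance"]
    project = metadata["project"]

    # Assumption: zone looks like "projects/<numericProjectId>/zones/<zone>".
    availability_zone = instance["zone"].rsplit("/", 1)[-1]
    # Assumption: the region is the zone name without its final "-<suffix>".
    region = availability_zone.rsplit("-", 1)[0]

    return {
        "provider": "gcp",
        # Use projectId, not numericProjectId; cloud.project.name is not set.
        "project": {"id": project["projectId"]},
        # Keep the id as a string so it is never rounded downstream.
        "instance": {"id": str(instance["id"]), "name": instance["name"]},
        "availability_zone": availability_zone,
        "region": region,
    }


# Invented sample input, shaped like a recursive metadata response.
sample = {
    "instance": {
        "id": 1234567890123456789,
        "name": "my-instance",
        "zone": "projects/123456789012/zones/us-central1-a",
    },
    "project": {"projectId": "my-project", "numericProjectId": 123456789012},
}
result = cloud_metadata_from_gce(sample)
```

With the sample above, `result["project"]` carries only the `id` key, which is the point of dropping cloud.project.name.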
open question
(This is for discussion. I have not proposed changing this in this PR.)
Should we also set cloud.account.id to the <GCE metadata>.project.projectId?
The cloud account or organization id used to identify different entities in a multi-tenant environment.
Examples: AWS account id, Google Cloud ORG Id, or other unique identifier.
example: 666777888999
The "Examples: Google Cloud ORG Id" seems to indicate using projectId would be inappropriate. However, I think GCP uses the project (not some concept of the account) to "identify" resources.
Some examples from the GCE metadata (the globally unique identifier there is the numericProjectId):
For context on this question: Beats do set cloud.account.id to the projectId, and the field description quoted earlier is what ECS says about cloud.account.id. The GCP docs describe the three project identifiers here: https://cloud.google.com/resource-manager/docs/creating-managing-projects#before_you_begin
For comparison, the OTel cloud semconv do not have cloud.project.* (perhaps because they are concepts particular to GCP and Azure). They do have cloud.account.id, and at least the OTel JS SDK sets that to the GCE metadata project-id value. I am curious if/where the ECS + OTel semconv merge lands on this.
checklist
/schedule 2023-09-05