Open tonglil opened 3 years ago
Here's more anomalies:

AZs

The concept of zones uses different tags for each cloud provider:

| cloud | name |
|---|---|
| azure | `availability_zone` |
| aws | `availability-zone` |
| gcp | `zone` |

Regions

The `region` tag is collected from Azure and AWS, but not from all GCP integrations. Some have it and some don't, like GAE, Cloud NAT, etc. For select integrations it's named `location` instead, like Cloud Run, Spanner, Memcache, Cloud Tasks, DNS, etc.
Tap, tap, tap... is this thing on? Just wondering if there are any plans to address this at all?
It makes it /really/ hard to build dashboards if the tagging isn't consistent.
@edwardaux Thanks for the feedback. Have you raised this issue or opened a ticket through the support channels? If not, I would advise that you do, because most of the tagging issues you mentioned are not caused by the agent (except maybe the cluster name in a k8s env); there are different teams involved with Crawler/Web/Cloud-based integrations that have no dependencies on the Datadog Agent (this repository). That said, a support ticket might be a better way for this issue to get more traction and be routed to the teams responsible.
@ian28223 I'm sorry but Datadog is a complete product, and as customers of this product it is surprising to me that this kind of interweaving and crosscutting issue is not of interest to be addressed.
Furthermore, asking customers to open tickets when employees can do so much more easily is just shocking to me. Tickets often end up closed with a "thanks, we'll file this as a feature request" and no accountability. Sharing here gives other customers visibility that they're not the only ones having issues with that process and the product itself.
Lastly, it's unfortunate to see the responsibility and involvement of the Datadog agent being deflected. As I clearly linked to in the original comment, there are multiple places where tags are set or used by the agent.
The "different teams involved with Crawler/Web/Cloud-based integrations that have no dependencies to the Datadog Agent" together make up the user experience for this product, and the issues outlined result in a poor one.
I recommend that Datadog (or the Datadog agent team) reevaluate the "somebody else's problem" approach it takes to this kind of tech debt.
I'd like to add that this is very annoying because there are interfaces in Datadog that assume one type of tag. For instance, loading up the infrastructure map with gcp and gke looks like this by default:
Some of the group dropdown options also don't work because they are named differently in GCP:
I think an acceptable stopgap would be to allow aliasing tags, so if you were running in GCP and had a tag like `cluster-location:us-east1` you could alias it to create `region:us-east1`.
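To make the aliasing idea concrete, here is a minimal sketch of what such a rule could do on a list of `key:value` tags. The function `apply_aliases` and the `ALIASES` table are hypothetical names made up for illustration; nothing in this thread suggests Datadog exposes a setting like this.

```python
def apply_aliases(tags, aliases):
    """Return tags plus an aliased copy of every tag whose key is in aliases.

    tags: list of "key:value" strings.
    aliases: dict mapping a source key to the key it should also appear under.
    The original tag is kept, so existing dashboards keep working.
    """
    result = list(tags)
    for tag in tags:
        key, sep, value = tag.partition(":")
        target = aliases.get(key)
        if sep and target:
            aliased = f"{target}:{value}"
            if aliased not in result:
                result.append(aliased)
    return result


# Hypothetical alias table (an assumption, not a real Datadog option).
ALIASES = {"cluster-location": "region"}

print(apply_aliases(["cluster-location:us-east1", "env:prod"], ALIASES))
```

Keeping the source tag and adding the alias next to it means nothing that already groups by `cluster-location` breaks, at the cost of one extra tag per aliased key.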
I understand and I agree. I have raised this to the relevant team's PM. Proper tracking would still be via official support channels.
Note that for GCP, location is not the same as region. They're two separate attributes that often have the same value, but they're not the same.
See Cloud Storage Locations for example https://cloud.google.com/storage/docs/locations
For compute, it's still zones and regions.
https://cloud.google.com/compute/docs/regions-zones
https://cloud.google.com/docs/geography-and-regions
> location is not the same as region
Correct, they are supplementary to each other. While GCS doesn't allow you to choose a specific zone, they still have regions (which is conceptually shared across other products like compute) in addition to multi-region codes (aka locations).
For cloud run, you can only specify region (no zone or multi region).
gcloud storage buckets create gs://BUCKET_NAME --location=US --placement=US-CENTRAL1,US-EAST1
DD should collect tags for GCP's zone, region, and location as applicable - and unify them (or allow us to rename them so) to deal with multicloud setups.
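As a rough illustration of "collect zone, region, and location as applicable", the sketch below emits whichever of the three a resource actually has, and derives `region` from `zone` when only the zone is known. It relies on GCP zone names being the region name plus a `-<letter>` suffix (for example `us-east1-b` in `us-east1`). The function name `gcp_placement_tags` is hypothetical and this is not how the agent or the integrations work today.

```python
def gcp_placement_tags(zone=None, region=None, location=None):
    """Build placement tags for a GCP resource from whatever it reports.

    Any argument may be None; only the attributes that exist are tagged.
    """
    tags = []
    if zone:
        tags.append(f"zone:{zone}")
        if not region:
            # GCP zone names are "<region>-<letter>", so strip the last part.
            region = zone.rsplit("-", 1)[0]
    if region:
        tags.append(f"region:{region}")
    if location:
        # Kept separate from region: a location can be a multi-region code.
        tags.append(f"location:{location}")
    return tags


print(gcp_placement_tags(zone="us-east1-b"))   # a compute instance
print(gcp_placement_tags(region="us-central1"))  # a Cloud Run service
print(gcp_placement_tags(location="US"))       # a multi-region bucket
```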
Describe what happened:
Currently, the tags Datadog collects and applies to cloud resources are 1) inconsistent and 2) varied.
1. For example, there is no convention or standardization between `-` and `_`
Please unify them on one or the other.
For GCP, these are the tags collected:
For AWS:
In this case, there is:
- `zone` (GCP) and `availability-zone` (AWS), causing duplication & multiplication of tags (and therefore custom metrics) if you want to unify them across clouds (i.e. one app deployed in both)
- `_` and `-` within the same category

2. Duplicated tags from integrations
Along with the above, different integration levels collect the same tags with different keys, resulting in examples like this (GCP & K8s):
I have no idea which one did what because `cluster_name` based on `DD_CLUSTER_NAME`
3. Another naming oddity
4. `cloud_provider` tag is inconsistently automatically applied
For agents running in GCP, the `cloud_provider:gcp` tag is automatically added to all things. However, based on a chat with a support agent, the `cloud_provider:aws` tag is not automatically added for AWS.
This is inconsistent behavior.
5. `aws_account` tag is inconsistently applied to AWS metrics
Some AWS metrics collected by the AWS integration are automatically tagged with the account number, while others lack this tag. For example, ELB metrics have this tag, but EC2 metrics do not.
This makes filtering on this tag value difficult when building a multi-account dashboard.
This is not a problem with GCP as all metrics are tagged with `project_id`.

Describe what you expected:
I expect:
- `_` or `-` (probably `_` since you automatically convert camelCase to underscores already)
- (`volume_id`, `instance_id`)
- (`volume_name`, `instance_name`, `cluster_name`, `service_name`)
- (`zone`)

This could be easier if I could actually configure how tags are formatted or configured.
It doesn't seem like the code allows for that right now since it is hardcoded: https://github.com/DataDog/datadog-agent/blob/eb4fa9ce5edc05fd2ac61df19ce5b98b9f727b35/pkg/util/gce/gce_tags.go#L73-L92
It would also be helpful if I could pick and choose which tags are collected, but that's not possible either.
I ask for this because building multi/cross-cloud queries & dashboards is not trivial.
This seems like a sensible thing to do (review tag names and make them conform to some "datadog-internal" standard) so users don't have a poor experience when trying to correlate data from multiple sources.
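For illustration, a single key convention could be as small as the sketch below, which rewrites tag keys to snake_case and leaves values untouched. The helper names `normalize_key` and `normalize_tag` are hypothetical, and the exact rules Datadog uses for its camelCase conversion are not shown in this thread, so treat the regex as an assumption.

```python
import re


def normalize_key(key):
    """Rewrite a tag key to lower snake_case."""
    # Insert "_" at each lowercase/digit-to-uppercase boundary (camelCase).
    key = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", key)
    # Unify the separator on "_" and lowercase the result.
    return key.replace("-", "_").lower()


def normalize_tag(tag):
    """Normalize only the key of a "key:value" tag; the value is kept as-is."""
    key, sep, value = tag.partition(":")
    return f"{normalize_key(key)}{sep}{value}"


print(normalize_tag("availability-zone:us-east-1a"))
print(normalize_tag("availabilityZone:us-east-1a"))
print(normalize_tag("cluster_name:prod"))
```

With a rule like this, `availability-zone` (AWS) and `availability_zone` (Azure) collapse into one key, although `zone` (GCP) would still need an explicit mapping.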