openzipkin / zipkin-support

repository for support questions raised as issues

zipkin makes tags object disabled (enable = false) when creating indices in ES #46

Open shiyishuoshuo opened 3 years ago

shiyishuoshuo commented 3 years ago

Describe the Bug

I set up a Zipkin server to write spans to ES (our firm's own setup) for both QA and PROD. Oddly, the index mappings created in QA and PROD are different: in PROD the tags object field has enabled set to false, which prevents that field from being indexed when creating an index pattern on the ES Kibana dashboard. I double-checked the config and the Zipkin server jar, and they are exactly the same; only the environment is different. Would you mind advising what might have gone wrong?

Attached are the mapping definitions for both QA and PROD.

QA:

{
  "zipkin:span-2020-12-10": {
    "mappings": {
      "span": {
        "properties": {
          "_q": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "annotations": {
            "type": "object"
          },
          "duration": {
            "type": "long"
          },
          "id": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "kind": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "localEndpoint": {
            "properties": {
              "serviceName": {
                "type": "text",
                "fields": {
                  "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                  }
                }
              }
            }
          },
          "name": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "remoteEndpoint": {
            "properties": {
              "ipv4": {
                "type": "text",
                "fields": {
                  "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                  }
                }
              },
              "port": {
                "type": "long"
              }
            }
          },
          "tags": {
            "properties": {
              "env": {
                "type": "text",
                "fields": {
                  "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                  }
                }
              },
              "error": {
                "type": "text",
                "fields": {
                  "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                  }
                }
              },
              "http": {
                "properties": {
                  "status_code": {
                    "type": "text",
                    "fields": {
                      "keyword": {
                        "type": "keyword",
                        "ignore_above": 256
                      }
                    }
                  }
                }
              },
              "mvc": {
                "properties": {
                  "controller": {
                    "properties": {
                      "class": {
                        "type": "text",
                        "fields": {
                          "keyword": {
                            "type": "keyword",
                            "ignore_above": 256
                          }
                        }
                      },
                      "method": {
                        "type": "text",
                        "fields": {
                          "keyword": {
                            "type": "keyword",
                            "ignore_above": 256
                          }
                        }
                      }
                    }
                  }
                }
              },
              "userName": {
                "type": "text",
                "fields": {
                  "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                  }
                }
              }
            }
          },
          "timestamp": {
            "type": "long"
          },
          "timestamp_millis": {
            "type": "long"
          },
          "traceId": {
            "type": "keyword"
          }
        }
      }
    }
  }
}

Prod:

{
  "zipkin:span-2020-12-10": {
    "mappings": {
      "span": {
        "_source": {
          "excludes": [
            "_q"
          ]
        },
        "dynamic_templates": [
          {
            "strings": {
              "match": "*",
              "match_mapping_type": "string",
              "mapping": {
                "ignore_above": 256,
                "norms": false,
                "type": "keyword"
              }
            }
          }
        ],
        "properties": {
          "_q": {
            "type": "keyword"
          },
          "annotations": {
            "type": "object",
            "enabled": false
          },
          "duration": {
            "type": "long"
          },
          "id": {
            "type": "keyword",
            "ignore_above": 256
          },
          "kind": {
            "type": "keyword",
            "ignore_above": 256
          },
          "localEndpoint": {
            "dynamic": "false",
            "properties": {
              "serviceName": {
                "type": "keyword"
              }
            }
          },
          "name": {
            "type": "keyword"
          },
          "remoteEndpoint": {
            "dynamic": "false",
            "properties": {
              "serviceName": {
                "type": "keyword"
              }
            }
          },
          "tags": {
            "type": "object",
            "enabled": false
          },
          "timestamp": {
            "type": "long"
          },
          "timestamp_millis": {
            "type": "date",
            "format": "epoch_millis"
          },
          "traceId": {
            "type": "keyword"
          }
        }
      }
    }
  }
}

Also attached is the PROD version (the failing one) of the template definition:

{
    "zipkin:span_template": {
        "order": 0,
        "index_patterns": ["zipkin:span-*"],
        "settings": {
            "index": {
                "mapper": {
                    "dynamic": "false"
                },
                "requests": {
                    "cache": {
                        "enable": "true"
                    }
                },
                "number_of_shards": "5",
                "number_of_replicas": "1"
            }
        },
        "mappings": {
            "span": {
                "_source": {
                    "excludes": ["_q"]
                },
                "dynamic_templates": [{
                    "strings": {
                        "mapping": {
                            "type": "keyword",
                            "norms": false,
                            "ignore_above": 256
                        },
                        "match_mapping_type": "string",
                        "match": "*"
                    }
                }],
                "properties": {
                    "traceId": {
                        "type": "keyword",
                        "norms": false
                    },
                    "name": {
                        "type": "keyword",
                        "norms": false
                    },
                    "localEndpoint": {
                        "type": "object",
                        "dynamic": false,
                        "properties": {
                            "serviceName": {
                                "type": "keyword",
                                "norms": false
                            }
                        }
                    },
                    "remoteEndpoint": {
                        "type": "object",
                        "dynamic": false,
                        "properties": {
                            "serviceName": {
                                "type": "keyword",
                                "norms": false
                            }
                        }
                    },
                    "timestamp_millis": {
                        "type": "date",
                        "format": "epoch_millis"
                    },
                    "duration": {
                        "type": "long"
                    },
                    "annotations": {
                        "enabled": false
                    },
                    "tags": {
                        "enabled": false
                    },
                    "_q": {
                        "type": "keyword",
                        "norms": false
                    }
                }
            }
        },
        "aliases": {}
    }
}

QA version

{
    "zipkin:span_template": {
        "order": 0,
        "index_patterns": ["zipkin:span-*"],
        "settings": {
            "index": {
                "mapper": {
                    "dynamic": "false"
                },
                "requests": {
                    "cache": {
                        "enable": "true"
                    }
                },
                "number_of_shards": "5",
                "number_of_replicas": "1"
            }
        },
        "mappings": {
            "span": {
                "properties": {
                    "traceId": {
                        "type": "keyword",
                        "norms": false
                    },
                    "annotations": {
                        "enabled": true
                    },
                    "tags": {
                        "enabled": true
                    }
                }
            }
        },
        "aliases": {}
    }
}

QA and PROD use different ES instances, but both run ES 6.3. We just want to understand how to get the contents of the tags object indexed. We have some customized fields defined under tags (e.g. who hit the endpoint and which env), and we would like to use those fields as filter conditions when we create a Kibana dashboard.

codefromthecrypt commented 3 years ago

Our index template disables the normal tag indexing for reasons including restrictions on dots. We use an alternative approach here: https://github.com/openzipkin/zipkin/tree/master/zipkin-storage/elasticsearch#query-indexing
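To make the alternative approach more concrete, here is a minimal Python sketch of the idea as I understand it from the linked README: tag keys and "key=value" pairs are written into the single keyword field _q, so a filter can target _q instead of tags.*. The function names query_terms and tag_filter are hypothetical, and the exact term format should be checked against the README before relying on it.

```python
def query_terms(tags, annotations=()):
    """Build _q-style terms: annotation values, tag keys and 'key=value' pairs.

    Assumption: this mirrors the format described in the Zipkin README;
    it is not Zipkin's own code.
    """
    terms = set(annotations)
    for key, value in tags.items():
        terms.add(key)                  # lets you filter on "tag exists"
        terms.add(f"{key}={value}")     # lets you filter on an exact tag value
    return sorted(terms)


def tag_filter(key, value=None):
    """Build an Elasticsearch term query body against the _q keyword field."""
    term = key if value is None else f"{key}={value}"
    return {"query": {"term": {"_q": term}}}


# Example with the custom tags mentioned in this issue (values are made up).
terms = query_terms({"env": "qa", "userName": "alice"})
body = tag_filter("env", "qa")
print(terms)
print(body)
```

Under that assumption, a Kibana filter on _q with the value env=qa would play the role that a filter on tags.env plays in the QA mapping.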

We have an example of another indexing approach here with "catch-all" used to layer a base policy. It seems to work on Elasticsearch 7.8 and 7.9, but currently breaks in 7.10 (for reasons unknown) https://github.com/openzipkin/zipkin/blob/master/zipkin-storage/elasticsearch/src/test/java/zipkin2/elasticsearch/integration/ITEnsureIndexTemplate.java#L50

You can look at https://github.com/openzipkin/zipkin/pull/3185 for background. cc @ccharnkij and @xeraa in case you know how or why ES 7.10 would break on this. I will raise a PR to show that in a bit (it is just annoying, as we have to build a temporary image with 7.10 to show it).

shiyishuoshuo commented 3 years ago

@adriancole thank you so much for your reply. What is the reason for disabling the normal tag indexing? It seems the workaround/alternative approach only works for ES 7.8 and above, and our version is still pretty old (ES 6.3). Do you know of any workaround for lower versions? Also, do you know where the QA version of the index pattern came from? That version seems to have worked (tags were indexed).

Again thank you so much for your help

codefromthecrypt commented 3 years ago

The reason for special casing is dotted name constraints. Tags are stored as a dictionary. Some keys include an inconsistent number of dots (ex "error" and "error.message"). Elasticsearch cannot index these, as it interprets them as fields, and dots in fields imply an object path.
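The conflict can be illustrated outside Elasticsearch. The Python sketch below is only an analogy: it expands dotted tag keys into nested dictionaries, the way an object mapping would view them, and fails when one key needs a name to be both a plain value and an object. The function name expand_dotted is hypothetical, and Elasticsearch's real error messages differ.

```python
def expand_dotted(tags):
    """Expand dotted keys into nested dicts; raise if a name must be both value and object."""
    root = {}
    for key, value in tags.items():
        node = root
        parts = key.split(".")
        for part in parts[:-1]:
            child = node.setdefault(part, {})
            if not isinstance(child, dict):
                # e.g. "error" already holds a value, but "error.message" needs an object
                raise ValueError(f"'{key}' needs '{part}' to be an object, but it holds a value")
            node = child
        leaf = parts[-1]
        if isinstance(node.get(leaf), dict):
            # e.g. "error.message" came first, so "error" is already an object
            raise ValueError(f"'{key}' is a value, but it is already an object")
        node[leaf] = value
    return root


# Keys with consistent dot depth expand cleanly (tag names taken from the QA mapping).
ok = expand_dotted({"http.status_code": "200", "mvc.controller.class": "Foo"})

# "error" and "error.message" cannot coexist as fields.
try:
    expand_dotted({"error": "true", "error.message": "boom"})
    conflict = None
except ValueError as exc:
    conflict = str(exc)

print(ok)
print(conflict)
```

This is why the QA mapping, where tags are mapped dynamically, can work for a while and then start rejecting spans once both kinds of key show up.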

^^ is the summary of the why

Here is an example (no warranty) of using a secondary index template. I tested it just now https://gist.github.com/adriancole/1af1259102e7a2da1b3c9103565165d7

shiyishuoshuo commented 3 years ago

@adriancole thanks so much for your help, I really appreciate it! But I guess we are kind of blocked, as we are still using ES 6.3. I just tried to create a secondary index template, but had no luck. One last thing I don't understand: we first tested in QA, which seems to be using a different index template version:

{
    "zipkin:span_template": {
        "order": 0,
        "index_patterns": ["zipkin:span-*"],
        "settings": {
            "index": {
                "mapper": {
                    "dynamic": "false"
                },
                "requests": {
                    "cache": {
                        "enable": "true"
                    }
                },
                "number_of_shards": "5",
                "number_of_replicas": "1"
            }
        },
        "mappings": {
            "span": {
                "properties": {
                    "traceId": {
                        "type": "keyword",
                        "norms": false
                    },
                    "annotations": {
                        "enabled": true
                    },
                    "tags": {
                        "enabled": true
                    }
                }
            }
        },
        "aliases": {}
    }
}

and the template looks like the one below after hitting the template endpoint of our team's ES instance: http://es.dev.controls-runtime.site.gs.com:12000/_template/zipkin:span_template

{
    "zipkin:span_template": {
        "order": 0,
        "index_patterns": ["zipkin:span-*"],
        "settings": {
            "index": {
                "mapper": {
                    "dynamic": "false"
                },
                "requests": {
                    "cache": {
                        "enable": "true"
                    }
                },
                "number_of_shards": "5",
                "number_of_replicas": "1"
            }
        },
        "mappings": {
            "span": {
                "properties": {
                    "traceId": {
                        "type": "keyword",
                        "norms": false
                    },
                    "annotations": {
                        "enabled": true
                    },
                    "tags": {
                        "enabled": true
                    }
                }
            }
        },
        "aliases": {}
    }
}

Does this ring a bell? Why can't I see this version in PROD? Thanks.

codefromthecrypt commented 3 years ago

The trick here is that it is a secondary template, so yeah, don't edit the primary one unless you replace the whole thing. We have an ES 6 image, and I switched the example to use it. It worked. I'm studying for a job interview so can't really dig more. I would suggest making sure the example works for you locally, as it works 100% for me. Also, upgrade your ES to a later version of 6, as ES won't support the version you are on anyway.

ex.

curl -s localhost:9200
{
  "name" : "xqWzkCi",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "qBO1LoybRvWw7SYwDJDNqA",
  "version" : {
    "number" : "6.8.13",
    "build_flavor" : "unknown",
    "build_type" : "unknown",
    "build_hash" : "be13c69",
    "build_date" : "2020-10-16T09:09:46.555371Z",
    "build_snapshot" : false,
    "lucene_version" : "7.7.3",
    "minimum_wire_compatibility_version" : "5.6.0",
    "minimum_index_compatibility_version" : "5.0.0"
  },
  "tagline" : "You Know, for Search"
}

Then compare your host output with localhost:9200/_template from docker. Also remember you must set up the templates before spans are in your index. Good luck, and if this doesn't work please talk to Elasticsearch support, as this is not something we have any control over.
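For the comparison step, a small script can make the differences easier to spot than reading two JSON dumps side by side. The following is a minimal Python sketch, not part of Zipkin or Elasticsearch: diff_paths is a hypothetical helper, and the two dictionaries are trimmed excerpts of the QA and PROD templates posted in this issue. In practice you would load the saved _template responses with json.load instead.

```python
_MISSING = object()  # sentinel for "key absent on this side"


def diff_paths(left, right, prefix=""):
    """Return dotted paths whose values differ between two nested dicts."""
    paths = []
    for key in sorted(set(left) | set(right)):
        path = f"{prefix}.{key}" if prefix else key
        a = left.get(key, _MISSING)
        b = right.get(key, _MISSING)
        if isinstance(a, dict) and isinstance(b, dict):
            paths.extend(diff_paths(a, b, path))  # descend into shared objects
        elif a != b:
            paths.append(path)                    # differing value or missing key
    return paths


# Trimmed excerpts of the templates shown earlier in this thread.
qa = {"mappings": {"span": {"properties": {
    "annotations": {"enabled": True},
    "tags": {"enabled": True},
}}}}
prod = {"mappings": {"span": {
    "_source": {"excludes": ["_q"]},
    "properties": {
        "annotations": {"enabled": False},
        "tags": {"enabled": False},
    },
}}}

differences = diff_paths(qa, prod)
for path in differences:
    print(path)
```

Running the same function over the full template output from both hosts should show whether QA is really receiving a different template or an older copy of the same one.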

xeraa commented 3 years ago

Trying to follow along here:

  1. The problem with 7.10 was a new deprecation notice (in https://github.com/openzipkin/zipkin/blob/master/zipkin-storage/elasticsearch/src/test/java/zipkin2/elasticsearch/integration/ITEnsureIndexTemplate.java#L50) and not something actually breaking, right?
  2. Seeing index_patterns: [ "zipkin:span-*" ] this must be for 6.x, since 7.0 and up don't support colons in index names.
  3. If it's a problem with Elasticsearch itself, discuss.elastic.co is a good place to discuss those.

codefromthecrypt commented 3 years ago

Thanks @xeraa, I pulled the ES 7.10 change into a separate PR. I don't have time to diagnose it, but I put an hour towards making it diagnosable (hopefully): https://github.com/openzipkin/zipkin/pull/3329

shiyishuoshuo commented 3 years ago

Hey @adriancole, how are you doing? Just following up on the index template issue above. In our QA env we didn't set up any secondary template, yet it seems to be using a secondary index template by default, whereas PROD is using the primary index template, which is really odd. We upgraded ES to a later 6.x version (6.8), but still can't manually create a secondary index template. I have also tested the ES 6 image locally, and its template is consistent with the PROD version, so I don't know why we see a secondary version in QA.