openzipkin / zipkin-support

repository for support questions raised as issues

Java, C#, and Go services: Zipkin Dependencies graph shows an error? #14

Closed cell-data-x closed 4 years ago

cell-data-x commented 4 years ago

Microservice invocation link: [image] [image]

I'm not sure why the dependency graph shows a link between the two services.

codefromthecrypt commented 4 years ago

I will need the trace JSON to answer this question. Please remove any private information from it.

cell-data-x commented 4 years ago

[{
    "traceId": "3327bbbcb0b75585",
    "parentId": "16dbaef206e1a325",
    "id": "1fffe453ee0cc741",
    "kind": "CLIENT",
    "name": "get",
    "timestamp": 1591597126907157,
    "localEndpoint": {
        "serviceName": "dal",
        "ipv4": "127.0.0.1",
        "ipv6": "::1"
    },
    "tags": {
        "http.method": "GET",
        "http.path": "/api/getPrivateColumns",
        "http.response.size": "1070",
        "http.status_code": "200"
    }
}, {
    "traceId": "3327bbbcb0b75585",
    "parentId": "16dbaef206e1a325",
    "id": "6138ab9e80192595",
    "kind": "CLIENT",
    "name": "post",
    "timestamp": 1591597128497981,
    "localEndpoint": {
        "serviceName": "dal",
        "ipv4": "127.0.0.1",
        "ipv6": "::1"
    },
    "tags": {
        "http.method": "POST",
        "http.path": "/api/v3/",
        "http.request.size": "402",
        "http.status_code": "307"
    }
}, {
    "traceId": "3327bbbcb0b75585",
    "parentId": "16dbaef206e1a325",
    "id": "1bfebb08a1e6413f",
    "kind": "CLIENT",
    "name": "post",
    "timestamp": 1591597128509918,
    "duration": 2992,
    "localEndpoint": {
        "serviceName": "dal",
        "ipv4": "127.0.0.1",
        "ipv6": "::1"
    },
    "tags": {
        "http.method": "POST",
        "http.path": "/api/v3",
        "http.request.size": "402",
        "http.response.size": "2092",
        "http.status_code": "200"
    }
}, {
    "traceId": "3327bbbcb0b75585",
    "parentId": "1bfebb08a1e6413f",
    "id": "421a3920708883a2",
    "kind": "CLIENT",
    "name": "mysql",
    "timestamp": 1591597128509918,
    "duration": 2992,
    "localEndpoint": {
        "serviceName": "rdbms",
        "ipv4": "127.0.0.1",
        "ipv6": "::1"
    },
    "tags": {
        "operate": "query",
        "result": "{\"ok\":true,\"err\":null,\"changes\":25,\"duration\":0,\"sql_duration\":2992100,\"data\":\"[......]\",\"ok_message\":\"\",\"trans_list\":null,\"private_cols\":{}}",
        "table": "dic_credential_type",
        "v3 mode": "{\"operate\":\"query\",\"table\":\"dic_credential_type\",\"alias\":\"d\",\"select\":[\"d.credential_type_id  as id\",\"d.credential_type_name as name\",\"d.credential_type_code code\",\"d.credential_type_id as d_credential_type_id\"],\"where\":{\"and\":[{\"eq\":{\"d.enable_mark\":1}},{\"eq\":{\"d.delete_mark\":0}}]},\"wherestring\":\" ( d.enable_mark = 1 and d.delete_mark = 0 ) \",\"wherebitmaptags\":{},\"object\":{},\"user_info\":{},\"operator_id\":0,\"Operates\":null,\"update_time\":\"0001-01-01T00:00:00Z\"}"
    }
}, {
    "traceId": "3327bbbcb0b75585",
    "parentId": "3327bbbcb0b75585",
    "id": "16dbaef206e1a325",
    "kind": "SERVER",
    "name": "get",
    "timestamp": 1591597117210994,
    "duration": 4908592,
    "localEndpoint": {
        "serviceName": "api",
        "ipv4": "192.168.6.225"
    },
    "tags": {
        "http.host": "192.168.6.225:9004",
        "http.path": "/api/Dictionary/GetCredentialDic",
        "http.uri": "http://192.168.6.225:9004/api/Dictionary/GetCredentialDic"
    },
    "shared": true
}, {
    "traceId": "3327bbbcb0b75585",
    "parentId": "3327bbbcb0b75585",
    "id": "16dbaef206e1a325",
    "kind": "CLIENT",
    "name": "get",
    "timestamp": 1591597120261000,
    "duration": 4976000,
    "localEndpoint": {
        "serviceName": "nethall-service",
        "ipv4": "192.168.255.6"
    },
    "tags": {
        "http.url": "/api/Dictionary/GetCredentialDic"
    }
}, {
    "traceId": "3327bbbcb0b75585",
    "id": "3327bbbcb0b75585",
    "kind": "SERVER",
    "name": "get",
    "timestamp": 1591597120242000,
    "duration": 5014000,
    "localEndpoint": {
        "serviceName": "nethall-service",
        "ipv4": "192.168.255.6"
    },
    "tags": {
        "http.status_code": "200",
        "http.url": "/api/Dictionary/GetCredentialDic"
    }
}]

[image]

Microservice invocation link: [image]

Dependencies: [image]

I'm not sure why the dependency graph shows a link between the two services.

codefromthecrypt commented 4 years ago

It looks like you have some incorrect configuration. The first call, from nethall-service -> api service, seems correct. The calls labeled "dal" seem incorrect (as if someone made a mistake in the service name configuration). It is also odd that the mysql client span shows up as a child of the POST span.

I would look at what's creating the trace data labeled "dal" and attempt to stop that from setting the service name. The localEndpoint.serviceName is the name of the current process, not the name of the process you are calling.

cell-data-x commented 4 years ago

The correct call chain is nethall-service -> api service -> dal service -> rdbms. Judging by traceId, SpanId, and ParentSpanId, why does the topology show nethall-service -> dal service?

cell-data-x commented 4 years ago

[{
    "traceId": "47c24c9f4fe5c2f3",
    "parentId": "851e95437dd17bb6",
    "id": "684458875cb81a7f",
    "kind": "SERVER",
    "name": "post",
    "timestamp": 1591603713455764,
    "localEndpoint": {
        "serviceName": "dal",
        "ipv4": "127.0.0.1",
        "ipv6": "::1"
    },
    "tags": {
        "http.method": "POST",
        "http.path": "/api/v3/",
        "http.request.size": "402",
        "http.status_code": "307"
    }
}, {
    "traceId": "47c24c9f4fe5c2f3",
    "parentId": "851e95437dd17bb6",
    "id": "3f88c27398719070",
    "kind": "SERVER",
    "name": "post",
    "timestamp": 1591603713464750,
    "duration": 2983,
    "localEndpoint": {
        "serviceName": "dal",
        "ipv4": "127.0.0.1",
        "ipv6": "::1"
    },
    "tags": {
        "http.method": "POST",
        "http.path": "/api/v3",
        "http.request.size": "402",
        "http.response.size": "2092",
        "http.status_code": "200"
    }
}, {
    "traceId": "47c24c9f4fe5c2f3",
    "parentId": "47c24c9f4fe5c2f3",
    "id": "851e95437dd17bb6",
    "kind": "SERVER",
    "name": "get",
    "timestamp": 1591603704570230,
    "duration": 3540074,
    "localEndpoint": {
        "serviceName": "api",
        "ipv4": "192.168.6.225"
    },
    "tags": {
        "http.host": "192.168.6.225:9004",
        "http.path": "/api/Dictionary/GetCredentialDic",
        "http.uri": "http://192.168.6.225:9004/api/Dictionary/GetCredentialDic"
    },
    "shared": true
}, {
    "traceId": "47c24c9f4fe5c2f3",
    "parentId": "47c24c9f4fe5c2f3",
    "id": "851e95437dd17bb6",
    "kind": "CLIENT",
    "name": "get",
    "timestamp": 1591603707671000,
    "duration": 3559000,
    "localEndpoint": {
        "serviceName": "nethall-service",
        "ipv4": "192.168.255.6"
    },
    "tags": {
        "http.url": "/api/Dictionary/GetCredentialDic"
    }
}, {
    "traceId": "47c24c9f4fe5c2f3",
    "id": "47c24c9f4fe5c2f3",
    "kind": "SERVER",
    "name": "get",
    "timestamp": 1591603707667000,
    "duration": 3568000,
    "localEndpoint": {
        "serviceName": "nethall-service",
        "ipv4": "192.168.255.6"
    },
    "tags": {
        "http.status_code": "200",
        "http.url": "/api/Dictionary/GetCredentialDic"
    }
}]

The topology shows calls from nethall-service -> dal service and from nethall-service -> api service. We configured link tracking between services according to X-B3:

        Client Span                                                Server Span
    ┌──────────────────┐                                       ┌──────────────────┐
    │                  │                                       │                  │
    │   TraceContext   │          Http Request Headers         │   TraceContext   │
    │ ┌──────────────┐ │          ┌───────────────────┐        │ ┌──────────────┐ │
    │ │ TraceId      │ │          │ X-B3-TraceId      │        │ │ TraceId      │ │
    │ │              │ │          │                   │        │ │              │ │
    │ │ ParentSpanId │ │ Inject   │ X-B3-ParentSpanId │Extract │ │ ParentSpanId │ │
    │ │              ├─┼─────────>│                   ├────────┼>│              │ │
    │ │ SpanId       │ │          │ X-B3-SpanId       │        │ │ SpanId       │ │
    │ │              │ │          │                   │        │ │              │ │
    │ │ Sampled      │ │          │ X-B3-Sampled      │        │ │ Sampled      │ │
    │ └──────────────┘ │          └───────────────────┘        │ └──────────────┘ │
    │                  │                                       │                  │
    └──────────────────┘                                       └──────────────────┘
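The inject/extract flow in the diagram can be sketched in a few lines of Python. This is only an illustration, not code from any Zipkin library: the `inject` and `extract` helpers and the dictionary layout of the trace context are hypothetical, and only the `X-B3-*` header names come from the diagram. The IDs are taken from the trace posted above.

```python
def inject(context, headers):
    """Client side: copy the trace context into outgoing HTTP headers."""
    headers["X-B3-TraceId"] = context["traceId"]
    headers["X-B3-SpanId"] = context["spanId"]
    if context.get("parentSpanId"):
        headers["X-B3-ParentSpanId"] = context["parentSpanId"]
    if context.get("sampled") is not None:
        headers["X-B3-Sampled"] = "1" if context["sampled"] else "0"
    return headers


def extract(headers):
    """Server side: rebuild the trace context from incoming HTTP headers."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    trace_id = lowered.get("x-b3-traceid")
    span_id = lowered.get("x-b3-spanid")
    if trace_id is None or span_id is None:
        return None  # no usable context: the server would start a new trace
    sampled = lowered.get("x-b3-sampled")
    return {
        "traceId": trace_id,
        "spanId": span_id,
        "parentSpanId": lowered.get("x-b3-parentspanid"),
        "sampled": None if sampled is None else sampled == "1",
    }


client_context = {
    "traceId": "47c24c9f4fe5c2f3",
    "spanId": "851e95437dd17bb6",
    "parentSpanId": "47c24c9f4fe5c2f3",
    "sampled": True,
}
outgoing = inject(client_context, {})
server_context = extract(outgoing)
```

Note that nothing in these headers carries a service name. Each process reports its own `localEndpoint.serviceName`, so correct IDs alone do not guarantee a correct dependency graph.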

codefromthecrypt commented 4 years ago

"dal" is wrong; that's why. It isn't a service.

[server -> client] is the same service.
[client -> server] is a valid place to change service name.

In your trace, a SERVER span for "api" is the parent of a CLIENT span for "dal". This is an invalid transition.

The CLIENT span should be "api", or you are missing the SERVER span for when the call moved from api to the host "dal".
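The rule above can be checked mechanically against a trace in Zipkin v2 JSON. The following Python sketch is only an illustration: `find_invalid_transitions` is a hypothetical helper, not part of Zipkin, and the sample spans are trimmed copies of four spans from the first trace in this thread.

```python
def find_invalid_transitions(spans):
    """Return (parent_service, child_service, child_id) for every CLIENT span
    whose SERVER parent reports a different localEndpoint.serviceName."""
    # With shared span IDs one ID can appear twice (CLIENT side and SERVER
    # side), so index only the SERVER-side service name per span ID.
    server_service = {}
    for span in spans:
        if span.get("kind") == "SERVER":
            server_service[span["id"]] = span["localEndpoint"]["serviceName"]

    problems = []
    for span in spans:
        if span.get("kind") != "CLIENT":
            continue
        parent_service = server_service.get(span.get("parentId"))
        child_service = span["localEndpoint"]["serviceName"]
        if parent_service is not None and parent_service != child_service:
            problems.append((parent_service, child_service, span["id"]))
    return problems


trace = [
    {"id": "3327bbbcb0b75585", "kind": "SERVER",
     "localEndpoint": {"serviceName": "nethall-service"}},
    {"id": "16dbaef206e1a325", "parentId": "3327bbbcb0b75585", "kind": "CLIENT",
     "localEndpoint": {"serviceName": "nethall-service"}},
    {"id": "16dbaef206e1a325", "parentId": "3327bbbcb0b75585", "kind": "SERVER",
     "shared": True, "localEndpoint": {"serviceName": "api"}},
    {"id": "1fffe453ee0cc741", "parentId": "16dbaef206e1a325", "kind": "CLIENT",
     "localEndpoint": {"serviceName": "dal"}},
]

print(find_invalid_transitions(trace))
# [('api', 'dal', '1fffe453ee0cc741')]
```

The nethall-service SERVER -> CLIENT pair passes because both spans report the same service name. The "dal" CLIENT span is flagged because its parent is the "api" SERVER span.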

codefromthecrypt commented 4 years ago

I am asking about your instrumentation because it is almost surely wrong, and fixing that is the best option.

For example, in Brave (Java) you can't make a mistake like this, because serviceName is constant for the tracer object. That means SERVER -> CLIENT cannot switch service name the way your trace does, though CLIENT -> SERVER can.

cell-data-x commented 4 years ago

From your description [image], do you mean that when api sends TraceId, ParentSpanId, and SpanId to the dal service, the parentId is wrong?

codefromthecrypt commented 4 years ago

I think your IDs are fine. I think the localEndpoint.serviceName is incorrect in the first trace you posted (with the kind = CLIENT spans with serviceName dal)

you can look at a more typical trace by running an example like these: https://github.com/openzipkin?q=example&type=&language= https://github.com/openzipkin-contrib?q=example&type=&language=

or POST some sample data like these: https://github.com/openzipkin/zipkin/tree/master/zipkin-lens/testdata

The main thing is that a trace like this should have only 2 distinct localEndpoint.serviceName values. When a server receives a request and then makes a client call, both the SERVER span and the CLIENT span have the same localEndpoint.serviceName.

4 spans ([server, client] -> [server, client]) means 2 services, not 4 ([service A] -> [service B])
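As a rough illustration of the "4 spans, 2 services" point, the Python sketch below builds a well-formed four-span trace and collects service-to-service links only where a CLIENT span and a SERVER span share a span ID. It assumes shared span IDs and is not Zipkin's actual dependency-linking algorithm. The `service_links` helper, the service names, and the span IDs are all made up for the example.

```python
def service_links(spans):
    """Collect (caller, callee) pairs where a CLIENT span and a SERVER span
    share one span ID, the only hop where serviceName is expected to change."""
    client_service = {
        span["id"]: span["localEndpoint"]["serviceName"]
        for span in spans
        if span["kind"] == "CLIENT"
    }
    links = set()
    for span in spans:
        if span["kind"] == "SERVER" and span["id"] in client_service:
            callee = span["localEndpoint"]["serviceName"]
            links.add((client_service[span["id"]], callee))
    return sorted(links)


# [server, client] -> [server, client]: four spans, two service names.
well_formed = [
    {"id": "0000000000000001", "kind": "SERVER",
     "localEndpoint": {"serviceName": "service-a"}},
    {"id": "0000000000000002", "parentId": "0000000000000001", "kind": "CLIENT",
     "localEndpoint": {"serviceName": "service-a"}},
    {"id": "0000000000000002", "parentId": "0000000000000001", "kind": "SERVER",
     "shared": True, "localEndpoint": {"serviceName": "service-b"}},
    {"id": "0000000000000003", "parentId": "0000000000000002", "kind": "CLIENT",
     "localEndpoint": {"serviceName": "service-b"}},
]

services = {span["localEndpoint"]["serviceName"] for span in well_formed}
print(len(services), service_links(well_formed))
# 2 [('service-a', 'service-b')]
```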

cell-data-x commented 4 years ago

I have a question. With microservices in the same language, for example .NET Core:
service a => service b => service c
service a (traceId, parentId) = service b (traceId, parentId) = service c (traceId, parentId)
Across languages, for example api service (.NET Core) --> dal service (Go): [image]

image

In my opinion, each service is both a client and a server: after it sends a request to a downstream service, the downstream service sends a response back, so the downstream service should have the same parentId as mine.

The api service uses .NET Core: [image]

image

codefromthecrypt commented 4 years ago

In the future, please use chat (https://gitter.im/openzipkin/zipkin) for exploratory questions, or open a separate issue here when the question is of a different type.

On the topic of service name, it is best to discuss this separately from trace/span IDs. Let's focus on the trace/span IDs, as I have already discussed the proper service name convention.

There are two ways to handle client+server: share the same span ID (described at the link below, and usually the default in Zipkin), or choose not to share it. Regardless of which choice is made, the client should always have a different serviceName than the server, unless the same service is calling itself (loopback): https://github.com/openzipkin/brave/blob/c87f7b89027f75acf7986e7f8065ca8a23ed2873/brave/README.md#sharing-span-ids-between-client-and-server
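The two options can be sketched as follows. This Python snippet is a simplified illustration, not Brave or zipkin4net code: `server_span_ids`, `new_span_id`, and the `share_span_id` flag are hypothetical names. The client IDs are copied from the nethall-service CLIENT span in the first trace above.

```python
import secrets


def new_span_id():
    # 8 random bytes rendered as 16 lower-hex characters, like the IDs above.
    return secrets.token_hex(8)


def server_span_ids(client, share_span_id=True):
    """Derive the IDs a SERVER span reports for an incoming client context."""
    if share_span_id:
        # Shared: the server reuses the client's span ID and parent ID, and
        # marks its half of the span with "shared": true.
        return {
            "traceId": client["traceId"],
            "parentId": client["parentId"],
            "id": client["id"],
            "shared": True,
        }
    # Not shared: the server creates a new span whose parent is the client span.
    return {
        "traceId": client["traceId"],
        "parentId": client["id"],
        "id": new_span_id(),
    }


client = {
    "traceId": "3327bbbcb0b75585",
    "parentId": "3327bbbcb0b75585",
    "id": "16dbaef206e1a325",
}
shared = server_span_ids(client, share_span_id=True)
separate = server_span_ids(client, share_span_id=False)
```

In the shared case the result matches the IDs of the "api" SERVER span in the first trace: same `id` and `parentId` as the client span, plus `"shared": true`. In both cases the serviceName is reported independently by each process and is not derived from these IDs.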

cell-data-x commented 4 years ago

[{
    "traceId": "4091899dba8a8140",
    "parentId": "b84e933e3b39d48e",
    "id": "575267086f5c1cab",
    "kind": "CLIENT",
    "name": "get",
    "timestamp": 1591693686510693,
    "duration": 61546,
    "localEndpoint": {
        "serviceName": "unknownservice",
        "ipv4": "192.168.6.225"
    },
    "tags": {
        "http.method": "GET",
        "http.path": "/api/getPrivateColumns"
    }
}, {
    "traceId": "4091899dba8a8140",
    "parentId": "b84e933e3b39d48e",
    "id": "6e8a19147799bcfe",
    "kind": "CLIENT",
    "name": "post",
    "timestamp": 1591693686655892,
    "duration": 43554,
    "localEndpoint": {
        "serviceName": "unknownservice",
        "ipv4": "192.168.6.225"
    },
    "tags": {
        "http.method": "POST",
        "http.path": "/api/v3"
    }
}, {
    "traceId": "4091899dba8a8140",
    "id": "b84e933e3b39d48e",
    "kind": "SERVER",
    "name": "get",
    "timestamp": 1591693686472463,
    "duration": 367077,
    "localEndpoint": {
        "serviceName": "api",
        "ipv4": "192.168.6.225"
    },
    "tags": {
        "http.host": "localhost:9004",
        "http.path": "/api/Config/GetSystemConfigs",
        "http.uri": "http://localhost:9004/api/Config/GetSystemConfigs"
    }
}, {
    "traceId": "4091899dba8a8140",
    "parentId": "b84e933e3b39d48e",
    "id": "575267086f5c1cab",
    "kind": "SERVER",
    "name": "get",
    "timestamp": 1591693690459335,
    "duration": 997,
    "localEndpoint": {
        "serviceName": "dal",
        "ipv4": "192.168.6.12"
    },
    "tags": {
        "http.method": "GET",
        "http.path": "/api/getPrivateColumns",
        "http.response.size": "1075",
        "http.status_code": "200"
    },
    "shared": true
}, {
    "traceId": "4091899dba8a8140",
    "parentId": "b84e933e3b39d48e",
    "id": "6e8a19147799bcfe",
    "kind": "SERVER",
    "name": "post",
    "timestamp": 1591693690607937,
    "localEndpoint": {
        "serviceName": "dal",
        "ipv4": "192.168.6.12"
    },
    "tags": {
        "http.method": "POST",
        "http.path": "/api/v3/",
        "http.request.size": "263",
        "http.status_code": "307"
    },
    "shared": true
}, {
    "traceId": "4091899dba8a8140",
    "parentId": "6e8a19147799bcfe",
    "id": "4c8686b1ffe3f615",
    "kind": "CLIENT",
    "name": "mysql",
    "timestamp": 1591693690625890,
    "duration": 4987,
    "localEndpoint": {
        "serviceName": "dal",
        "ipv4": "192.168.6.12"
    },
    "tags": {
        "operate": "query",
        "result": "{\"ok\":true,\"err\":null,\"changes\":34,\"duration\":0,\"sql_duration\":4986500,\"data\":\"[......]\",\"ok_message\":\"\",\"trans_list\":null,\"private_cols\":{}}",
        "table": "tab_system_config",
        "v3 mode": "{\"operate\":\"query\",\"in_dup_engine\":1,\"table\":\"tab_system_config\",\"alias\":\"s\",\"select\":[\"s.*\",\"s.config_id as s_config_id\"],\"where\":{\"eq\":{\"s.delete_mark\":0}},\"wherestring\":\"s.delete_mark = 0\",\"wherebitmaptags\":{},\"object\":{},\"user_info\":{},\"operator_id\":0,\"data_language\":\"zh-hant\",\"Operates\":null,\"update_time\":\"0001-01-01T00:00:00Z\"}"
    }
}, {
    "traceId": "4091899dba8a8140",
    "parentId": "6e8a19147799bcfe",
    "id": "4c8686b1ffe3f615",
    "kind": "SERVER",
    "name": "mysql",
    "timestamp": 1591693690625890,
    "duration": 4987,
    "localEndpoint": {
        "serviceName": "rdbms",
        "ipv4": "192.168.6.12"
    },
    "shared": true
}, {
    "traceId": "4091899dba8a8140",
    "parentId": "b84e933e3b39d48e",
    "id": "6e8a19147799bcfe",
    "kind": "SERVER",
    "name": "post",
    "timestamp": 1591693690624893,
    "duration": 7979,
    "localEndpoint": {
        "serviceName": "dal",
        "ipv4": "192.168.6.12"
    },
    "tags": {
        "http.method": "POST",
        "http.path": "/api/v3",
        "http.request.size": "263",
        "http.response.size": "26417",
        "http.status_code": "200"
    },
    "shared": true
}]

Thank you for your kind answer. We changed our api service to:

    public void RegisterZipkinTrace(IApplicationBuilder app, ILoggerFactory loggerFactory,
        IHostApplicationLifetime lifetime)
    {
        if (Program.SystemConfig != null && !string.IsNullOrEmpty(Program.SystemConfig.ZipkinUrl))
        {
            app.UseDeveloperExceptionPage();
            lifetime.ApplicationStarted.Register(() =>
            {
                TraceManager.SamplingRate = 1.0f;
                var logger = new TracingLogger(loggerFactory, "zipkin4net");
                // Visit https://tracing-analysis.console.aliyun.com to get the Zipkin endpoint.
                // Note that the endpoint does not include "/api/v2/spans".
                var httpSender = new HttpZipkinSender($"http://{Program.SystemConfig.ZipkinUrl}", "application/json");

                var tracer = new ZipkinTracer(httpSender, new JSONSpanSerializer());
                var consoleTracer = new zipkin4net.Tracers.ConsoleTracer();
                TraceManager.RegisterTracer(tracer);
                TraceManager.RegisterTracer(consoleTracer); // console does not output
                TraceManager.Start(logger);
            });
            lifetime.ApplicationStopped.Register(() => TraceManager.Stop());
            app.UseTracing("api");
        }
    }

    public override void ConfigureServices(IServiceCollection services)
    {
        services.AddHttpClient("Tracer").AddHttpMessageHandler(provider =>
            TracingHandler.WithoutInnerHandler(
                provider.GetService<IConfiguration>()["applicationName"]));
    }

Before, we manually sent X-B3-TraceId to the downstream service. Now there is an "unknownservice": [image]

The same is true for the official sample code that we tested. Can you help me find the reason?

codefromthecrypt commented 4 years ago

The argument to TracingHandler.WithoutInnerHandler() is the serviceName, which becomes unknown if it is not found.

Is it possible that provider.GetService<IConfiguration>()["applicationName"] is not returning a value?

cell-data-x commented 4 years ago

This is my code snippet. It is written like the official example, and the official example has the same problem.

    public void ConfigureServices(IServiceCollection services)
    {
        services.AddHttpClient("Tracer").AddHttpMessageHandler(provider =>
            TracingHandler.WithoutInnerHandler(provider.GetService<IConfiguration>()["api"]));
    }

codefromthecrypt commented 4 years ago

I won't be able to continue troubleshooting like this: I have other things I have to do, I don't know zipkin4net, dotnet, etc., and I don't have time to try it right now.

Replace provider.GetService<IConfiguration>()["api"] with a hard-coded name to test,

e.g. TracingHandler.WithoutInnerHandler("my-service")

If that works, then you know there's a problem with the configuration provider, and that particular aspect can be solved without knowing anything about tracing.

cell-data-x commented 4 years ago

Thank you. I have already solved it by hard-coding the service name.