openzipkin / zipkin-support

repository for support questions raised as issues
4 stars 2 forks source link

Client annotation address is unknown in lens #16

Closed rmannibucau closed 4 years ago

rmannibucau commented 4 years ago

Describe the Bug + Steps to reproduce

Generate a client span with a remote endpoint correctly setup - with a parent server span indeed. Go in lens to see the span. Open the annotations of the client span, Address is "unknown" instead of the IP of the remote endpoint.

Expected Behaviour

Adress should be the one of the remote endpoint.

It seems it comes from https://github.com/openzipkin/zipkin/blob/master/zipkin-lens/src/zipkin/span-row.js#L58, line should likely be:

const localFormatted = formatEndpoint(span.localEndpoint) || formatEndpoint(span.remoteEndpoint) || undefined;

(indeed you can add validation on "kind" if you feel it better)

codefromthecrypt commented 4 years ago

thanks for the description. Can you include trace json (format with ```json or use https://fluffy.cc/ or a gist). Also screen-shot the problem.

This is more likely to guarantee fix vs probably fix (also easier for the person to fix)

rmannibucau commented 4 years ago

Here are the spans:

[
    [
        {
            "annotations": [
                {
                    "timestamp": 1592546218881000,
                    "value": "cr"
                },
                {
                    "timestamp": 1592546218882000,
                    "value": "cs"
                }
            ],
            "binaryAnnotations": [
                {
                    "key": "http.status_code",
                    "type": 6,
                    "value": "210"
                },
                {
                    "key": "component",
                    "type": 6,
                    "value": "test"
                },
                {
                    "key": "span.kind",
                    "type": 6,
                    "value": "CLIENT"
                },
                {
                    "key": "http.url",
                    "type": 6,
                    "value": "http://here"
                },
                {
                    "key": "http.method",
                    "type": 6,
                    "value": "GET"
                }
            ],
            "duration": 1000,
            "id": "812dcc1ed67e9f64",
            "kind": "CLIENT",
            "name": "https://foo.test:1234/trace?junit5=true",
            "parentId": "4a4a92895c4ca53d",
            "remoteEndpoint": {
                "ipv4": "1.2.4.5",
                "port": 1234,
                "serviceName": "test"
            },
            "timestamp": 1592546218881000,
            "traceId": "47cec41e2dfe8b03"
        }
    ],
    [
        {
            "annotations": [
                {
                    "timestamp": 1592546218881000,
                    "value": "sr"
                },
                {
                    "timestamp": 1592546222108000,
                    "value": "ss"
                }
            ],
            "binaryAnnotations": [
                {
                    "key": "http.status_code",
                    "type": 6,
                    "value": "210"
                },
                {
                    "key": "component",
                    "type": 6,
                    "value": "test"
                },
                {
                    "key": "span.kind",
                    "type": 6,
                    "value": "SERVER"
                },
                {
                    "key": "http.url",
                    "type": 6,
                    "value": "https://foo.test:1234/trace?junit5=true"
                },
                {
                    "key": "http.method",
                    "type": 6,
                    "value": "GET"
                }
            ],
            "duration": 3227000,
            "id": "4a4a92895c4ca53d",
            "kind": "SERVER",
            "localEndpoint": {
                "ipv4": "1.2.3.4",
                "port": 1234,
                "serviceName": "test"
            },
            "name": "https://foo.test:1234/trace?junit5=true",
            "timestamp": 1592546218881000,
            "traceId": "47cec41e2dfe8b03"
        }
    ]
]

here is a screenshot (not this exact span - timestamps/names are different but it is the same code, just used the test to capture the span cause it was esier/already setup)

image

codefromthecrypt commented 4 years ago

this trace is very strange. why is it mixing v1 format annotations (cs cr etc) with v2 fields like remoteEndpoint? what's creating this? in v2 format "cs" "cr" etc are replaced by span.kind

https://zipkin.io/zipkin-api/#/default/post_spans

rmannibucau commented 4 years ago

@adriancole yes but it does not help much the lens issue - it could probbaly be custom annotations too. But here is another span - cleaner hopefully - which does not have annotations but leads to the same bug:

[
    {
        "duration": 15000,
        "id": "0377054e997861e9",
        "kind": "CLIENT",
        "name": "http://localhost:9000/mw-cx/customer-api",
        "parentId": "c30abf4e21ac0923",
        "remoteEndpoint": {
            "ipv4": "172.19.0.6",
            "port": 8080,
            "serviceName": "meecrowave"
        },
        "tags": {
            "http.status_code": "200",
            "component": "test",
            "span.kind": "CLIENT",
            "http.url": "HTTP://test-demo-api-customer-1:8080/test-demo/customer",
            "http.method": "GET"
        },
        "timestamp": 1592552741510000,
        "traceId": "40de0b6657964582"
    },
    {
        "duration": 2646000,
        "id": "c30abf4e21ac0923",
        "kind": "SERVER",
        "localEndpoint": {
            "ipv4": "172.19.0.6",
            "port": 8080,
            "serviceName": "meecrowave"
        },
        "name": "http://localhost:9000/mw-cx/customer-api",
        "tags": {
            "http.status_code": "200",
            "component": "test",
            "span.kind": "SERVER",
            "http.url": "http://localhost:9000/mw-cx/customer-api",
            "http.method": "GET"
        },
        "timestamp": 1592552741510000,
        "traceId": "40de0b6657964582"
    }
]

with this rendering:

image

codefromthecrypt commented 4 years ago

thanks but the data is still incorrect. the local endpoint defines the field you are asking about.

you really should inquire what's making this data and try to get them to map correctly.

On Fri, Jun 19, 2020, 3:48 PM Romain Manni-Bucau notifications@github.com wrote:

@adriancole https://github.com/adriancole yes but it does not help much the lens issue - it could probbaly be custom annotations too. But here is another span - cleaner hopefully - which does not have annotations but leads to the same bug:

[ { "duration": 15000, "id": "0377054e997861e9", "kind": "CLIENT", "name": "http://localhost:9000/mw-cx/customer-api", "parentId": "c30abf4e21ac0923", "remoteEndpoint": { "ipv4": "172.19.0.6", "port": 8080, "serviceName": "meecrowave" }, "tags": { "http.status_code": "200", "component": "test", "span.kind": "CLIENT", "http.url": "HTTP://test-demo-api-customer-1:8080/test-demo/customer", "http.method": "GET" }, "timestamp": 1592552741510000, "traceId": "40de0b6657964582" }, { "duration": 2646000, "id": "c30abf4e21ac0923", "kind": "SERVER", "localEndpoint": { "ipv4": "172.19.0.6", "port": 8080, "serviceName": "meecrowave" }, "name": "http://localhost:9000/mw-cx/customer-api", "tags": { "http.status_code": "200", "component": "test", "span.kind": "SERVER", "http.url": "http://localhost:9000/mw-cx/customer-api", "http.method": "GET" }, "timestamp": 1592552741510000, "traceId": "40de0b6657964582" } ]

with this rendering:

[image: image] https://user-images.githubusercontent.com/1249546/85109597-db339380-b211-11ea-8dd5-915f8f33e684.png

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/openzipkin/zipkin/issues/3122#issuecomment-646493344, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAPVV4WYJSMZ7MLULOOROLRXMJ55ANCNFSM4OCCSIEQ .

rmannibucau commented 4 years ago

@adriancole hmm, think I'm not following you, in all cases lens use local endpoint but SERVER and CLIENT spans don't use localEndpoint so in one case it is wrong anyway whatever until the data you send to zipkin are always using localendpoint which is quite unlikely, no?

codefromthecrypt commented 4 years ago

localEndpoint is the thing that recorded the data. it is only optional in the case of patching an existing span.

remoteEndpoint is the other side and optional. usually only the IP:port but service name also when known.

tracers typically initialize a constant value for localEndpoint and use it for all spans. that is why I am curious what is creating this data as it looks strange.

you can look at this if helps https://github.com/openzipkin/zipkin-api/blob/master/zipkin2-api.yaml#L483

On Fri, Jun 19, 2020, 5:53 PM Romain Manni-Bucau notifications@github.com wrote:

@adriancole https://github.com/adriancole hmm, think I'm not following you, in all cases lens use local endpoint but SERVER and CLIENT spans don't use localEndpoint so in one case it is wrong anyway whatever until the data you send to zipkin are always using localendpoint which is quite unlikely, no?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/openzipkin/zipkin-support/issues/16#issuecomment-646546938, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAPVV6CS7BSLCZVCSMG4YDRXMYTFANCNFSM4OCSHXBA .

rmannibucau commented 4 years ago

@adriancole the generator is a custom tracer but having local and remote endpoints at the same time is weird and created issues at some points so not sure we should have both anyway. Also it is weird since the UI gives you the local endpoint with the server span (which should always exist for a client one) so it just looks duplicated and eating network bandwidth, no?

codefromthecrypt commented 4 years ago

indexing happens on a span by span basis. traces are reported by different nodes. messages including things can have mixed order etc. most instrumentation dont have a concept of local root to know when something starts and stops. spans can be between server and client also.

the UI doesnt currently guess what the endpoints should be when missing but even if it did it would have to consistently guess.. the input data would need to be correct.

I can understand you being curious just that this load generator is not representative of correct data uses. also we have a lot of work to do and it is volunteer run project. I hope these notes I have given can help you but I am a bit over budget on this topic. good luck!

On Fri, Jun 19, 2020, 6:07 PM Romain Manni-Bucau notifications@github.com wrote:

@adriancole https://github.com/adriancole the generator is a custom tracer but having local and remote endpoints at the same time is weird and created issues at some points so not sure we should have both anyway. Also it is weird since the UI gives you the local endpoint with the server span (which should always exist for a client one) so it just looks duplicated and eating network bandwidth, no?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/openzipkin/zipkin-support/issues/16#issuecomment-646552869, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAPVV5DN5DS43YCS3WOHW3RXM2F3ANCNFSM4OCSHXBA .

rmannibucau commented 4 years ago

@adriancole this is actually not related to endpoints. Indexing happens based on id as in most solution of that category (parent id, tx id) so endpoints are purely metadata (annotations), just promoted as more important than others. In such context, requiring local endpoint on both server and client is very inconsistent and should be fixed anyway IMHO. I can see a client span local endpoint being used while its parent server span is not there but it is a very fragile asumption in the case of a cluster so sticking to id is the only relevant way to show understable data. Endpoints are really only meaningful as metadata and should be represented as such.

codefromthecrypt commented 4 years ago

service name is a primary index field. it has been since 2012. we have to render the existing community and properly written tools. we have service aggregations all over zipkin. when people implement incorrectly as sometimes happen we try to steer them in the right place. aggregation requires a stable field and a stable meaning of it. if we randomly broke this meaning due to a conversation about the confused data you have pasted so far it would be a disservice to thousands of people.

you can feel free to fork since you seem to think you know better. you might run into similar problems though and when you do, I hope notes we have made help

rmannibucau commented 4 years ago

Hmm, this cant break the community since it enhances current support, works in grafana and works in several proprietary systems so I think your last statement emphasis a bit too much history - and yes we alread had this kind of impl at least 3 times and it works well. What would be ok too is to hide unknown values, would be saner than showing that it is unknown when field is not fully required.

codefromthecrypt commented 4 years ago

The reason unknown is set is because most people except you would see this as a problem. It is historical, but it is also current. Also, you are asking about our UI and wanting us to change it. I don't think you would be able to assess what most zipkin users need better than a primary maintainer of the project for the last 5 years. Please correct your data or agree not to and stop bickering as it will lower motivations to help you when things are unnecessarily arduous.