Closed miyakz1192 closed 4 years ago
I believe this might have been fixed already. Here's what I see when running Jaeger from the current tip of the master branch:
Your reproducer:
$ ruby -I . sample.rb
Traceback (most recent call last):
7: from sample.rb:13:in `<main>'
6: from sample.rb:10:in `test'
5: from sample.rb:10:in `map'
4: from sample.rb:10:in `each'
3: from /usr/local/share/gems/gems/grpc-1.30.2-x86_64-linux/src/ruby/lib/grpc/generic/active_call.rb:337:in `each_remote_read_then_finish'
2: from /usr/local/share/gems/gems/grpc-1.30.2-x86_64-linux/src/ruby/lib/grpc/generic/active_call.rb:169:in `receive_and_check_status'
1: from /usr/local/share/gems/gems/grpc-1.30.2-x86_64-linux/src/ruby/lib/grpc/generic/active_call.rb:180:in `attach_status_results_and_complete_call'
/usr/local/share/gems/gems/grpc-1.30.2-x86_64-linux/src/ruby/lib/grpc/generic/active_call.rb:29:in `check_status': 5:trace not found: trace not found. debug_error_string:{"created":"@1595834670.748965007","description":"Error received from peer ipv6:[::1]:16686","file":"src/core/lib/surface/call.cc","file_line":1055,"grpc_message":"trace not found: trace not found","grpc_status":5} (GRPC::NotFound)
Jaeger Query:
{"level":"error","ts":1595834670.7487974,"caller":"app/grpc_handler.go:59","msg":"trace not found","error":"trace not found","stacktrace":"github.com/jaegertracing/jaeger/cmd/query/app.(*GRPCHandler).GetTrace\n\t/home/jpkroehling/Projects/src/github.com/jaegertracing/jaeger/cmd/query/app/grpc_handler.go:59\ngithub.com/jaegertracing/jaeger/proto-gen/api_v2._QueryService_GetTrace_Handler\n\t/home/jpkroehling/Projects/src/github.com/jaegertracing/jaeger/proto-gen/api_v2/query.pb.go:955\ngoogle.golang.org/grpc.(*Server).processStreamingRPC\n\t/home/jpkroehling/go/pkg/mod/google.golang.org/grpc@v1.27.1/server.go:1237\ngoogle.golang.org/grpc.(*Server).handleStream\n\t/home/jpkroehling/go/pkg/mod/google.golang.org/grpc@v1.27.1/server.go:1317\ngoogle.golang.org/grpc.(*Server).serveStreams.func1.1\n\t/home/jpkroehling/go/pkg/mod/google.golang.org/grpc@v1.27.1/server.go:722"}
My Ruby versions, installed fresh on an up to date Fedora 32:
$ ruby -v
ruby 2.7.1p83 (2020-03-31 revision a0c7c23c9c) [x86_64-linux]
In fact, I wasn't able to reproduce it even on an ancient Jaeger version (1.12), which makes me wonder the actual version you are using. Would you be able to get the container SHA that you are running? Or perhaps could you please try setting a fixed, recent version (1.18.1, for instance)?
@jkandasa just pointed out that there's a SHA in your original report, so I went ahead and tried to reproduce it without success:
$ podman run -p 16686 jaegertracing/all-in-one@sha256:13646d75e8c33ab1c9e12a5c7269e3fe4f7b122a9f79d82d2b6f9cf25aad2d98
{"level":"info","ts":1595835573.0093844,"caller":"flags/service.go:116","msg":"Mounting metrics handler on admin server","route":"/metrics"}
{"level":"info","ts":1595835573.0098934,"caller":"flags/admin.go:120","msg":"Mounting health check on admin server","route":"/"}
{"level":"info","ts":1595835573.0099263,"caller":"flags/admin.go:126","msg":"Starting admin HTTP server","http-addr":":14269"}
{"level":"info","ts":1595835573.0099432,"caller":"flags/admin.go:112","msg":"Admin server started","http.host-port":"[::]:14269","health-status":"unavailable"}
{"level":"info","ts":1595835573.0429375,"caller":"memory/factory.go:61","msg":"Memory storage initialized","configuration":{"MaxTraces":0}}
{"level":"info","ts":1595835573.05047,"caller":"server/grpc.go:76","msg":"Starting jaeger-collector gRPC server","grpc.host-port":":14250"}
{"level":"info","ts":1595835573.0504975,"caller":"server/http.go:44","msg":"Starting jaeger-collector HTTP server","http host-port":":14268"}
{"level":"info","ts":1595835573.0506194,"caller":"server/zipkin.go:48","msg":"Listening for Zipkin HTTP traffic","zipkin host-port":":0"}
{"level":"info","ts":1595835573.0677428,"caller":"grpc/builder.go:66","msg":"Agent requested insecure grpc connection to collector(s)"}
{"level":"info","ts":1595835573.067788,"caller":"grpc@v1.27.1/clientconn.go:106","msg":"parsed scheme: \"\"","system":"grpc","grpc_log":true}
{"level":"info","ts":1595835573.068015,"caller":"grpc@v1.27.1/clientconn.go:106","msg":"scheme \"\" not registered, fallback to default scheme","system":"grpc","grpc_log":true}
{"level":"info","ts":1595835573.0680413,"caller":"passthrough/passthrough.go:48","msg":"ccResolverWrapper: sending update to cc: {[{:14250 <nil> 0 <nil>}] <nil> <nil>}","system":"grpc","grpc_log":true}
{"level":"info","ts":1595835573.06813,"caller":"grpc@v1.27.1/clientconn.go:948","msg":"ClientConn switching balancer to \"round_robin\"","system":"grpc","grpc_log":true}
{"level":"info","ts":1595835573.0688906,"caller":"command-line-arguments/main.go:206","msg":"Starting agent"}
{"level":"info","ts":1595835573.0688972,"caller":"base/balancer.go:196","msg":"roundrobinPicker: newPicker called with info: {map[0xc00020bc30:{{:14250 <nil> 0 <nil>}}]}","system":"grpc","grpc_log":true}
{"level":"info","ts":1595835573.0689435,"caller":"querysvc/query_service.go:137","msg":"Archive storage not created","reason":"archive storage not supported"}
{"level":"info","ts":1595835573.0689585,"caller":"app/flags.go:113","msg":"Archive storage not initialized"}
{"level":"info","ts":1595835573.0689614,"caller":"app/agent.go:69","msg":"Starting jaeger-agent HTTP server","http-port":5778}
{"level":"info","ts":1595835573.0693355,"caller":"app/server.go:133","msg":"Query server started","port":16686,"addr":":16686"}
{"level":"info","ts":1595835573.06936,"caller":"healthcheck/handler.go:128","msg":"Health Check state change","status":"ready"}
{"level":"info","ts":1595835573.0693824,"caller":"app/server.go:171","msg":"Starting CMUX server","port":16686,"addr":":16686"}
{"level":"info","ts":1595835573.069387,"caller":"app/server.go:161","msg":"Starting GRPC server","port":16686,"addr":":16686"}
{"level":"info","ts":1595835573.0693805,"caller":"app/server.go:148","msg":"Starting HTTP server","port":16686,"addr":":16686"}
{"level":"error","ts":1595835575.6054332,"caller":"app/grpc_handler.go:59","msg":"trace not found","error":"trace not found","stacktrace":"github.com/jaegertracing/jaeger/cmd/query/app.(*GRPCHandler).GetTrace\n\tgithub.com/jaegertracing/jaeger/cmd/query/app/grpc_handler.go:59\ngithub.com/jaegertracing/jaeger/proto-gen/api_v2._QueryService_GetTrace_Handler\n\tgithub.com/jaegertracing/jaeger/proto-gen/api_v2/query.pb.go:955\ngoogle.golang.org/grpc.(*Server).processStreamingRPC\n\tgoogle.golang.org/grpc@v1.27.1/server.go:1237\ngoogle.golang.org/grpc.(*Server).handleStream\n\tgoogle.golang.org/grpc@v1.27.1/server.go:1317\ngoogle.golang.org/grpc.(*Server).serveStreams.func1.1\n\tgoogle.golang.org/grpc@v1.27.1/server.go:722"}
Thank you for your reply and sorry for my late reply. I tried image(sha256:13646d75e8c33ab1c9e12a5c7269e3fe4f7b122a9f79d82d2b6f9cf25aad2d98) but situation was not changed.
[jaeger side logs]
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x10e7586]
goroutine 288 [running]:
github.com/jaegertracing/jaeger/cmd/query/app.(*GRPCHandler).GetTrace(0xc005fb87e0, 0xc005abab10, 0x1c94a80, 0xc000118b60, 0xc005fb87e0, 0x20)
github.com/jaegertracing/jaeger/cmd/query/app/grpc_handler.go:66 +0x3d6
github.com/jaegertracing/jaeger/proto-gen/api_v2._QueryService_GetTrace_Handler(0x1336600, 0xc005fb87e0, 0x1c91c60, 0xc000370240, 0x26995e8, 0xc0002
5cf00)
github.com/jaegertracing/jaeger/proto-gen/api_v2/query.pb.go:955 +0x109
google.golang.org/grpc.(*Server).processStreamingRPC(0xc005fb2300, 0x1c988c0, 0xc0006dd080, 0xc00025cf00, 0xc005fbacc0, 0x2653f20, 0x0, 0x0, 0x0)
google.golang.org/grpc@v1.27.1/server.go:1237 +0xcd1
google.golang.org/grpc.(*Server).handleStream(0xc005fb2300, 0x1c988c0, 0xc0006dd080, 0xc00025cf00, 0x0)
google.golang.org/grpc@v1.27.1/server.go:1317 +0xcd6
google.golang.org/grpc.(*Server).serveStreams.func1.1(0xc0002880c0, 0xc005fb2300, 0x1c988c0, 0xc0006dd080, 0xc00025cf00)
google.golang.org/grpc@v1.27.1/server.go:722 +0xa1
created by google.golang.org/grpc.(*Server).serveStreams.func1
google.golang.org/grpc@v1.27.1/server.go:720 +0xa1
root@kubecon1:/home/miyakz#
[client side output]
root@eng-app-app-57cf6f7bb9-2tq66:/jaeger-idl/proto-gen-ruby# ruby -I . sample.rb
Traceback (most recent call last):
7: from sample.rb:13:in `<main>'
6: from sample.rb:10:in `test'
5: from sample.rb:10:in `map'
4: from sample.rb:10:in `each'
3: from /var/lib/gems/2.5.0/gems/grpc-1.30.2-x86_64-linux/src/ruby/lib/grpc/generic/active_call.rb:337:in `each_remote_read_then_finish'
2: from /var/lib/gems/2.5.0/gems/grpc-1.30.2-x86_64-linux/src/ruby/lib/grpc/generic/active_call.rb:169:in `receive_and_check_status'
1: from /var/lib/gems/2.5.0/gems/grpc-1.30.2-x86_64-linux/src/ruby/lib/grpc/generic/active_call.rb:180:in `attach_status_results_and_complete_call'
/var/lib/gems/2.5.0/gems/grpc-1.30.2-x86_64-linux/src/ruby/lib/grpc/generic/active_call.rb:29:in `check_status': 14:Socket closed. debug_error_string:{"created":"@1597501484.472667844","description":"Error received from peer ipv4:192.168.100.2:30858","file":"src/core/lib/surface/call.cc","file_line":1055,"grpc_message":"Socket closed","grpc_status":14} (GRPC::Unavailable)
root@eng-app-app-57cf6f7bb9-2tq66:/jaeger-idl/proto-gen-ruby#
I have two worker nodes.
root@kubecon1:/home/miyakz# kubectl get nodes
NAME STATUS ROLES AGE VERSION
kubecon1 Ready master 308d v1.16.1
kubenode11 Ready <none> 308d v1.16.1
root@kubecon1:/home/miyakz#
And jaeger images on each worker nodes are following.
root@kubecon1:/home/miyakz# docker image list | grep jaeger
jaegertracing/protobuf 0.2.0 8817b24a4c9e 7 weeks ago 153MB
jaegertracing/thrift 0.13 46013df3cf70 3 months ago 126MB
root@kubecon1:/home/miyakz#
root@kubenode11:/home/miyakz# docker image list | grep jaeger
jaegertracing/all-in-one <none> 5ef5c2d8e8d1 5 weeks ago 47.4MB
root@kubenode11:/home/miyakz#
root@kubenode11:/home/miyakz# docker inspect --format='{{.RepoDigests}}' 5ef5c2d8e8d1
[jaegertracing/all-in-one@sha256:13646d75e8c33ab1c9e12a5c7269e3fe4f7b122a9f79d82d2b6f9cf25aad2d98]
root@kubenode11:/home/miyakz#
Needless to say, jaeger tracing container runs on kubenode11.
istio-system istio-tracing-6dd48c455-2snw5 1/1 Running 1 19m 10.244.1.119 kubenode11 <none> <none>
My Intuition is CPU misconfiguration or bug. My CPU environment is the following.
root@kubenode11:/home/miyakz# lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 40 bits physical, 48 bits virtual
CPU(s): 7
On-line CPU(s) list: 0-6
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s): 7
NUMA node(s): 1
Vendor ID: AuthenticAMD
CPU family: 6
Model: 6
Model name: QEMU Virtual CPU version 2.5+
Stepping: 3
CPU MHz: 3593.254
BogoMIPS: 7186.50
Virtualization: AMD-V
L1d cache: 64K
L1i cache: 64K
L2 cache: 512K
L3 cache: 16384K
NUMA node0 CPU(s): 0-6
Flags: fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx lm nopl cpuid pni cx16 hypervisor lahf_lm svm 3dnowprefetch vmmcall
root@kubenode11:/home/miyakz#
Could you any comments?
Additional information.
I found that jaeger-all-in-one works on VM doesn't crash.
I pulled jaeger source from github.
root@dockerbuild:/home/miyakz/jaeger/cmd/all-in-one# git remote -v
origin https://github.com/jaegertracing/jaeger.git (fetch)
origin https://github.com/jaegertracing/jaeger.git (push)
root@dockerbuild:/home/miyakz/jaeger/cmd/all-in-one# git log | head
commit 5f4fc8da4d4e5d80cff7f8e06bad9c8e271abcf1
Author: Sreevani K V S <sreevani@freshdesk.com>
Date: Fri Aug 14 22:12:18 2020 +0530
Fix panic in collector when Zipkin server is shutdown (#2392)
Signed-off-by: Sreevani871 <sreevani@freshdesk.com>
commit 5daae75911e36b842b45279d8a599bcc5d56eabf
Author: rjs211 <srivatsa211@gmail.com>
root@dockerbuild:/home/miyakz/jaeger/cmd/all-in-one#
and I built all-in-one binary and try to reproduce the problem but the problem didn't appear.
[server side log]
{"level":"error","ts":1597551056.6664784,"caller":"app/grpc_handler.go:59","msg":"trace not found","error":"trace not found","stacktrace":"github.com/jaegertracing/jaeger/cmd/query/app.(*GRPCHandler).GetTrace\n\tgithub.com/jaegertracing/jaeger/cmd/query/app/grpc_handler.go:59\ngithub.com/jaegertracing/jaeger/proto-gen/api_v2._QueryService_GetTrace_Handler\n\tgithub.com/jaegertracing/jaeger/proto-gen/api_v2/query.pb.go:955\ngoogle.golang.org/grpc.(*Server).processStreamingRPC\n\tgoogle.golang.org/grpc@v1.31.0/server.go:1427\ngoogle.golang.org/grpc.(*Server).handleStream\n\tgoogle.golang.org/grpc@v1.31.0/server.go:1507\ngoogle.golang.org/grpc.(*Server).serveStreams.func1.2\n\tgoogle.golang.org/grpc@v1.31.0/server.go:843"}
[client side log]
root@dockerbuild:/home/miyakz/jaeger-idl/proto-gen-ruby# ruby -I . sample.rb
Traceback (most recent call last):
7: from sample.rb:15:in `<main>'
6: from sample.rb:12:in `test'
5: from sample.rb:12:in `map'
4: from sample.rb:12:in `each'
3: from /var/lib/gems/2.5.0/gems/grpc-1.30.2-x86_64-linux/src/ruby/lib/grpc/generic/active_call.rb:337:in `each_remote_read_then_finish'
2: from /var/lib/gems/2.5.0/gems/grpc-1.30.2-x86_64-linux/src/ruby/lib/grpc/generic/active_call.rb:169:in `receive_and_check_status'
1: from /var/lib/gems/2.5.0/gems/grpc-1.30.2-x86_64-linux/src/ruby/lib/grpc/generic/active_call.rb:180:in `attach_status_results_and_complete_call'
/var/lib/gems/2.5.0/gems/grpc-1.30.2-x86_64-linux/src/ruby/lib/grpc/generic/active_call.rb:29:in `check_status': 5:trace not found: trace not found. debug_error_string:{"created":"@1597551056.685024015","description":"Error received from peer ipv4:127.0.0.1:16686","file":"src/core/lib/surface/call.cc","file_line":1055,"grpc_message":"trace not found: trace not found","grpc_status":5} (GRPC::NotFound)
root@dockerbuild:/home/miyakz/jaeger-idl/proto-gen-ruby#
I think there is something wrong with container environment. anyone knows a similar problem?
Additional information.
Just in case, I created jaeger all-in-one with scripts/travis/build-all-in-one-image.sh and push to my dockerhub repo(https://hub.docker.com/repository/docker/miyakz1192/jaeger-all-in-one)
the container created from that image crashed with SIGSEGV.
[note1] when I exec build-all-in-one-image.sh, I set BRANCH environment valuable with "master".
[note2] I run cmd/all-in-one/all-in-one-linux-amd64 in VM, this program(process) didn't crash.
Would you have detailed instructions on how to reproduce this? Based on your last messages, I assumed that it would happen with Jaeger deployed on Kubernetes, so, I started minikube, deployed the Jaeger Operator, deployed a Jaeger instance, did a port-forward for the 16686 port, but still couldn't hit the problem:
$ minikube start --vm-driver kvm2 --cpus 6 --memory 20480 --container-runtime=crio
$ kubectl create namespace observability
$ kubectl apply -f deploy/crds/jaegertracing.io_jaegers_crd.yaml
$ kubectl apply -f deploy/service_account.yaml
$ kubectl apply -f deploy/role.yaml
$ kubectl apply -f deploy/role_binding.yaml
$ kubectl apply -f deploy/operator.yaml
$ kubectl apply -f deploy/examples/simplest.yaml
$ kubectl port-forward service/simplest-query 16686
Your reproducer:
$ ruby -I . sample.rb
Traceback (most recent call last):
7: from sample.rb:13:in `<main>'
6: from sample.rb:10:in `test'
5: from sample.rb:10:in `map'
4: from sample.rb:10:in `each'
3: from /home/jpkroehling/.gem/ruby/gems/grpc-1.30.2-x86_64-linux/src/ruby/lib/grpc/generic/active_call.rb:337:in `each_remote_read_then_finish'
2: from /home/jpkroehling/.gem/ruby/gems/grpc-1.30.2-x86_64-linux/src/ruby/lib/grpc/generic/active_call.rb:169:in `receive_and_check_status'
1: from /home/jpkroehling/.gem/ruby/gems/grpc-1.30.2-x86_64-linux/src/ruby/lib/grpc/generic/active_call.rb:180:in `attach_status_results_and_complete_call'
/home/jpkroehling/.gem/ruby/gems/grpc-1.30.2-x86_64-linux/src/ruby/lib/grpc/generic/active_call.rb:29:in `check_status': 5:trace not found: trace not found. debug_error_string:{"created":"@1597658073.663623997","description":"Error received from peer ipv6:[::1]:16686","file":"src/core/lib/surface/call.cc","file_line":1055,"grpc_message":"trace not found: trace not found","grpc_status":5} (GRPC::NotFound)
The Jaeger logs:
{"level":"error","ts":1597658068.1243339,"caller":"app/grpc_handler.go:59","msg":"trace not found","error":"trace not found","stacktrace":"github.com/jaegertracing/jaeger/cmd/query/app.(*GRPCHandler).GetTrace\n\tgithub.com/jaegertracing/jaeger/cmd/query/app/grpc_handler.go:59\ngithub.com/jaegertracing/jaeger/proto-gen/api_v2._QueryService_GetTrace_Handler\n\tgithub.com/jaegertracing/jaeger/proto-gen/api_v2/query.pb.go:955\ngoogle.golang.org/grpc.(*Server).processStreamingRPC\n\tgoogle.golang.org/grpc@v1.27.1/server.go:1237\ngoogle.golang.org/grpc.(*Server).handleStream\n\tgoogle.golang.org/grpc@v1.27.1/server.go:1317\ngoogle.golang.org/grpc.(*Server).serveStreams.func1.1\n\tgoogle.golang.org/grpc@v1.27.1/server.go:722"}
hi. thank you for your reply.
My vmhost has not been configured with KVM enabled, I enabled that. But after this, any Virtual Machines would not boot. Maybe my environment was destroyed. So, I will rebuild my vmhost with 20.04 LTS and report later.
Do let us know. At this point, this is a mystery to me :-)
Thank you for your response. I try to re-install ubuntu as 20.04LTS and ran minikube with your instruction.
I didn't encounter odd symptoms.
miyakz@lily2:~/source/jaeger-idl/proto-gen-ruby$ ruby -I . sample.rb
Traceback (most recent call last):
7: from sample.rb:13:in `<main>'
6: from sample.rb:10:in `test'
5: from sample.rb:10:in `map'
4: from sample.rb:10:in `each'
3: from /var/lib/gems/2.7.0/gems/grpc-1.31.1-x86_64-linux/src/ruby/lib/grpc/generic/active_call.rb:337:in `each_remote_read_then_finish'
2: from /var/lib/gems/2.7.0/gems/grpc-1.31.1-x86_64-linux/src/ruby/lib/grpc/generic/active_call.rb:169:in `receive_and_check_status'
1: from /var/lib/gems/2.7.0/gems/grpc-1.31.1-x86_64-linux/src/ruby/lib/grpc/generic/active_call.rb:180:in `attach_status_results_and_complete_call'
/var/lib/gems/2.7.0/gems/grpc-1.31.1-x86_64-linux/src/ruby/lib/grpc/generic/active_call.rb:29:in `check_status': 5:trace not found: trace not found. debug_error_string:{"created":"@1598681485.188116118","description":"Error received from peer ipv4:127.0.0.1:16686","file":"src/core/lib/surface/call.cc","file_line":1062,"grpc_message":"trace not found: trace not found","grpc_status":5} (GRPC::NotFound)
miyakz@lily2:~/source/jaeger-idl/proto-gen-ruby$
I think my environment had something wrong with KVM/qemu. After re-install it seems to everything fine.
Feel free to reopen if needed.
Describe the bug all-in-one jaeger crashes with SIGSEGV
To Reproduce Steps to reproduce the behavior: 1.set up ruby's gRPC environment 1-1) install ruby (my environment is "ruby 2.5.5p157 (2019-03-15 revision 67260) [x86_64-linux-gnu]") 1-2) installing related gem(for gRPC)
2.check jaeger's logfile execute in controller node.
3.execute this code(do this procedure)
4.you will get panic error log like that
Expected behavior Jaeger will return trace data to ruby client.
Screenshots No screenshots, because of above reproduction procedure is.
Version (please complete the following information):
What troubleshooting steps did you try? Try to follow https://www.jaegertracing.io/docs/latest/troubleshooting/ and describe how far you were able to progress and/or which steps did not work. I read those document but I couldn't found solution.