pinpoint-apm / pinpoint

APM (Application Performance Management) tool for large-scale distributed systems.
https://pinpoint-apm.gitbook.io/
Apache License 2.0

Can v2.3.3 & v3.0.x use the same HBase environment? #11133

Open toushi1st opened 4 weeks ago

toushi1st commented 4 weeks ago

Hello, I currently use v2.3.3 with HBase 2.4.10 and hope to upgrade smoothly to v3.0.x while reusing the existing servers. I have noticed some differences in hbase-create.hbase between Pinpoint v2.3.3 and v3.0.0. Is v2.3.3 compatible with the v3.0.0 table structure? Can I alter the current table structure to match v3.0.0 so that both versions can use the same HBase environment? Thank you.
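
For reference, roughly how the two schema scripts can be compared between the release tags. This is only a sketch; the path below is where the script sits in the 2.x source tree and is an assumption for the v3.0.0 tag, so adjust it if the file has moved.

```sh
# Sketch: diff the HBase schema scripts of the two releases.
# Assumption: hbase/scripts/hbase-create.hbase is the script path in both tags;
# locate the file in the v3.0.0 tree if it has been moved.
git clone https://github.com/pinpoint-apm/pinpoint.git
cd pinpoint
git diff v2.3.3 v3.0.0 -- hbase/scripts/hbase-create.hbase
```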

ga-ram commented 4 weeks ago

It is recommended to use the v3.0.0 HBase schema with the v3.0.0 collector and web (HBase 2.4.10 is supported). That said, the v2.3.3 agent appears to be compatible with the v3.0.0 collector (plus the v3.0.0 HBase schema) for basic functionality.
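
If you do reuse the existing cluster, it may help to first check in the HBase shell what the v2.3.3 script has already created before applying anything from the v3.0.0 script. A rough sketch (table names are just examples taken from hbase-create.hbase):

```
# Illustrative only -- inspect what the v2.3.3 script already created before
# applying anything from the v3.0.0 hbase-create.hbase.
hbase shell                       # then, inside the shell:
list                              # tables created by the v2.3.3 script
describe 'AgentInfo'              # column families / TTL of an existing table
describe 'ApplicationTraceIndex'
# 'create' statements for tables that already exist fail with TableExistsException,
# so only the statements that are new in v3.0.0 need to be applied.
```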

toushi1st commented 4 weeks ago

Thank you for your reply. I would like to keep the v2.3.3 collector running temporarily while also running the v3.0.x collector. This would give us enough time to gradually upgrade the v2.3.3 agents, or to connect the v2.3.3 agents directly to the v3.0.0 collector, which may involve restarting production applications and also takes time. If the two collectors can run against the same HBase environment, it saves HBase servers, and transaction tracing between agents of different versions will not be affected during the upgrade. So, I would like to know whether all of the v2.3.3 components, especially the collector, are compatible with the v3.0.0 table structure.

Additionally, I have a question related to the upgrade. If we change the IP (or load balancer VIP) of the v3.0.0 collector to match that of the v2.3.3 collector while the applications with the v2.3.3 agent are running, will the connection between the v2.3.3 agent and the new collector work normally? Restarting many production applications takes time and carries some risk, so a seamless switch would be ideal. I encountered errors when I tested this with v2.5.x.
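
For context, the agents only know the collector through the address configured in pinpoint.config, which is why we want to swap the VIP target instead of touching every application. A minimal sketch of the relevant transport settings (key names as in the 2.x default pinpoint.config, values are placeholders; please verify against your agent version):

```properties
# gRPC transport settings in pinpoint.config (2.x key names; values are placeholders).
# The IP below stands for the VIP the agents point at.
profiler.transport.module=GRPC
profiler.transport.grpc.collector.ip=10.100.40.214
profiler.transport.grpc.agent.collector.port=9991
profiler.transport.grpc.metadata.collector.port=9991
profiler.transport.grpc.stat.collector.port=9992
profiler.transport.grpc.span.collector.port=9993
```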

ga-ram commented 4 weeks ago

The HBase schema changes between v2.3.3 and v3.0.0 are backward compatible, so it is fine to gradually deploy v3.0.0 collectors while keeping some v2.3.3 collectors connected to the altered HBase.

Given that all v2.3.3 and v3.0.0 collectors are connected to the same load balancer and you update each collector one by one, the applications with the v2.3.3 agent will reconnect to the remaining v3.0.0 collectors once all of the v2.3.3 collectors are shut down.
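
One rough way to confirm that the altered schema is still being written to while both collector versions run against it is a quick check from the HBase shell (the table name is just an example from hbase-create.hbase; any actively written table works):

```
# Rough sanity check while v2.3.3 and v3.0.0 collectors share the cluster:
# recent cell timestamps indicate that data is still being written.
hbase shell                       # then, inside the shell:
scan 'ApplicationTraceIndex', {LIMIT => 1}
```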

Could you explain further what errors you encountered with v2.5.x?

toushi1st commented 3 weeks ago

Thank you for your reply; this answer is helpful. The test of switching versions by changing the collector IP happened a while ago, but I kept some of the agent error logs from that time. The new collector version was v2.5.1, and all v2.3.3 agents connected to a VIP that pointed to three v2.3.3 collector servers. The switch consisted of repointing this VIP to two v2.5.1 collectors. After the switchover, the v2.3.3 agents logged the errors below, and no new monitoring data appeared in the web UI. After the VIP configuration was restored, the connection returned to normal.

------------------ error logs ------------------

```text
05-12 15:45:10.010 [el-Worker(14-0)] DEBUG i.n.h.c.h.Http2ConnectionHandler -- [id: 0x9a9d34dd, L:/10.100.24.42:59857 ! R:10.100.40.214/10.100.40.214:9992] Sending GOAWAY failed: lastStreamId '2147483647', errorCode '2', debugData 'readAddress(..) failed: Connection reset by peer'. Forcing shutdown of the connection.
io.netty.channel.unix.Errors$NativeIoException: writevAddresses(..) failed: Broken pipe
05-12 15:45:10.010 [-Executor(15-0)] INFO c.n.p.p.s.g.ResponseStreamObserver -- Failed to stream, name=DefaultStreamEventListener{StatStream-3}, cause=UNAVAILABLE: io exception
io.grpc.StatusRuntimeException: UNAVAILABLE: io exception
	at io.grpc.Status.asRuntimeException(Status.java:535) ~[grpc-api-1.36.2.jar:1.36.2]
	at io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onClose(ClientCalls.java:478) [grpc-stub-1.36.2.jar:1.36.2]
	at io.grpc.internal.DelayedClientCall$DelayedListener$3.run(DelayedClientCall.java:464) [grpc-core-1.36.2.jar:2.3.3]
	at io.grpc.internal.DelayedClientCall$DelayedListener.delayOrExecute(DelayedClientCall.java:428) [grpc-core-1.36.2.jar:2.3.3]
	at io.grpc.internal.DelayedClientCall$DelayedListener.onClose(DelayedClientCall.java:461) [grpc-core-1.36.2.jar:2.3.3]
	at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:553) [grpc-core-1.36.2.jar:2.3.3]
	at io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:68) [grpc-core-1.36.2.jar:2.3.3]
	at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:739) [grpc-core-1.36.2.jar:2.3.3]
	at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:718) [grpc-core-1.36.2.jar:2.3.3]
	at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37) [grpc-core-1.36.2.jar:2.3.3]
	at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123) [grpc-core-1.36.2.jar:2.3.3]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_152]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_152]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_152]
Caused by: io.netty.channel.unix.Errors$NativeIoException: readAddress(..) failed: Connection reset by peer
05-12 15:45:10.010 [-Executor(16-0)] INFO c.n.p.p.s.g.s.DefaultStreamTask -- dispatch thread end status:INTERRUPTED name='StatStream-3
05-12 15:45:11.011 [ect-thread(4-0)] INFO c.n.p.p.s.g.StatGrpcDataSender -- newStream StatStream
05-12 15:45:11.011 [ect-thread(4-0)] INFO c.n.p.p.s.g.ResponseStreamObserver -- beforeStart DefaultStreamEventListener{StatStream-4}
05-12 15:45:11.011 [-Executor(15-0)] INFO c.n.p.p.s.g.StatGrpcDataSender -- ConnectivityState changed before:IDLE, change:CONNECTING
05-12 15:45:11.011 [-Executor(15-0)] INFO c.n.p.p.s.g.StatGrpcDataSender -- ConnectivityState changed before:CONNECTING, change:READY
05-12 15:45:11.011 [-Executor(15-0)] INFO c.n.p.p.s.g.ResponseStreamObserver -- onReadyHandler DefaultStreamEventListener{StatStream-4} isReadyCount:1
05-12 15:45:11.011 [-Executor(15-0)] INFO c.n.p.p.s.g.s.StreamExecutor -- stream execute name='StatStream-4
05-12 15:45:11.011 [-Executor(16-0)] INFO c.n.p.p.s.g.s.DefaultStreamTask -- dispatch start name='StatStream-4
05-12 15:45:13.013 [-Executor(12-0)] INFO c.n.p.p.s.g.AgentGrpcDataSender -- ConnectivityState changed before:READY, change:IDLE
05-12 15:45:13.013 [el-Worker(11-0)] DEBUG i.n.h.c.h.Http2ConnectionHandler -- [id: 0x9f0e8573, L:/10.100.24.42:61355 ! R:10.100.40.214/10.100.40.214:9991] Sending GOAWAY failed: lastStreamId '2147483647', errorCode '2', debugData 'readAddress(..) failed: Connection reset by peer'. Forcing shutdown of the connection.
io.netty.channel.unix.Errors$NativeIoException: writevAddresses(..) failed: Broken pipe
05-12 15:45:13.013 [-Executor(12-0)] WARN c.n.p.p.r.g.GrpcCommandService -- Failed to command stream, cause=UNAVAILABLE: io exception
io.grpc.StatusRuntimeException: UNAVAILABLE: io exception
	at io.grpc.Status.asRuntimeException(Status.java:535) ~[grpc-api-1.36.2.jar:1.36.2]
	at io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onClose(ClientCalls.java:478) [grpc-stub-1.36.2.jar:1.36.2]
	at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:553) [grpc-core-1.36.2.jar:2.3.3]
	at io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:68) [grpc-core-1.36.2.jar:2.3.3]
	at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:739) [grpc-core-1.36.2.jar:2.3.3]
	at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:718) [grpc-core-1.36.2.jar:2.3.3]
	at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37) [grpc-core-1.36.2.jar:2.3.3]
	at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123) [grpc-core-1.36.2.jar:2.3.3]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_152]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_152]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_152]
Caused by: io.netty.channel.unix.Errors$NativeIoException: readAddress(..) failed: Connection reset by peer
05-12 15:45:13.013 [-Executor(12-0)] INFO c.n.p.p.s.g.PingStreamContext -- Failed to ping stream, streamId=PingStream-2, cause=UNAVAILABLE: io exception
io.grpc.StatusRuntimeException: UNAVAILABLE: io exception
	at io.grpc.Status.asRuntimeException(Status.java:535) ~[grpc-api-1.36.2.jar:1.36.2]
	at io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onClose(ClientCalls.java:478) [grpc-stub-1.36.2.jar:1.36.2]
	at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:553) [grpc-core-1.36.2.jar:2.3.3]
	at io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:68) [grpc-core-1.36.2.jar:2.3.3]
	at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:739) [grpc-core-1.36.2.jar:2.3.3]
	at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:718) [grpc-core-1.36.2.jar:2.3.3]
	at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37) [grpc-core-1.36.2.jar:2.3.3]
	at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123) [grpc-core-1.36.2.jar:2.3.3]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_152]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_152]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_152]
Caused by: io.netty.channel.unix.Errors$NativeIoException: readAddress(..) failed: Connection reset by peer
```