apache / skywalking

APM, Application Performance Monitoring System
https://skywalking.apache.org/
Apache License 2.0

[Agent] Failing to connect to the collector may cause a memory leak #2190

Closed xuet0ng closed 5 years ago

xuet0ng commented 5 years ago

Please answer these questions before submitting your issue.


Bug

wu-sheng commented 5 years ago

Could you try shutting down the old channel before creating a new one? Does that fix the leak?

Looking forward to your feedback and updates.

xuet0ng commented 5 years ago

The fix is... when reconnecting, call channel.shutdownNow() on the original channel.
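A minimal sketch of the idea above, not the actual SkyWalking code. `FakeChannel`, `ChannelManagerSketch`, and `ReconnectSketch` are hypothetical stand-ins so the example runs without any gRPC dependency; in grpc-java the corresponding call would be `ManagedChannel.shutdownNow()`. It only illustrates why replacing the channel reference without shutting the old one down leaves instances behind.

```java
// Hypothetical stand-in for a gRPC channel (NOT a grpc-java class).
// It only counts how many channels are open and not yet shut down.
final class FakeChannel {
    static int openCount = 0;      // channels created but never shut down
    private boolean shutdown = false;

    FakeChannel() {
        openCount++;
    }

    // Mirrors the idea of ManagedChannel.shutdownNow(): release the channel.
    void shutdownNow() {
        if (!shutdown) {
            shutdown = true;
            openCount--;
        }
    }
}

// Hypothetical manager that is supposed to own exactly one live channel.
final class ChannelManagerSketch {
    private FakeChannel current;

    // Leaky variant: the old channel is dropped without being shut down.
    void reconnectLeaky() {
        current = new FakeChannel();
    }

    // Fixed variant: shut the old channel down before creating a new one.
    void reconnect() {
        if (current != null) {
            current.shutdownNow();
        }
        current = new FakeChannel();
    }
}

final class ReconnectSketch {
    // Runs the given number of reconnect attempts and reports open channels.
    static int openAfter(int attempts, boolean shutdownOld) {
        FakeChannel.openCount = 0;
        ChannelManagerSketch manager = new ChannelManagerSketch();
        for (int i = 0; i < attempts; i++) {
            if (shutdownOld) {
                manager.reconnect();
            } else {
                manager.reconnectLeaky();
            }
        }
        return FakeChannel.openCount;
    }

    public static void main(String[] args) {
        System.out.println("leaky: " + openAfter(5, false));
        System.out.println("fixed: " + openAfter(5, true));
    }
}
```

In this toy model the leaky variant ends with one open channel per reconnect attempt, while the fixed variant always ends with exactly one. A real channel holds threads and buffers, so the effect in the agent is more than a counter.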

wu-sheng commented 5 years ago

@xuet0ng Do you want to raise a pull request to fix this? You are the one who found the problem and the solution, so you should take the credit.

xuet0ng commented 5 years ago

@wu-sheng Thanks for the invitation, but unfortunately I'm afraid I can't do this. I'm not familiar with SkyWalking; I first started looking at it the day before yesterday, when I downloaded the code to track down the memory overflow problem. My company's project schedule is also too tight right now, so I can't guarantee I'd have the energy to complete this.

wu-sheng commented 5 years ago

Sure. I can submit a PR to fix this. Thank you for finding the issue and rechecking the possible fix.

xuet0ng commented 5 years ago

By the way, I tried and tested calling shutdownNow on the old channel when reconnecting...
1st, start and connect to the collector normally.
2nd, disconnect the network and wait for 10 mins.
3rd, reconnect the network, reconnect to the collector, and wait for 10 mins.

In the 2nd step, the number of leaked instances increased very slowly. In the 3rd step, the instances dropped to 2; I had assumed they would drop to 1. Interesting...

It works, but it does not solve all the problems. Maybe that is acceptable.

wu-sheng commented 5 years ago

There may still be some uncertain scenarios. I will try to improve that code. For the agent-to-backend connection, I prefer to put a load balancer between them; the load on the collector should be better that way, and we could leave reconnecting to gRPC itself.

wu-sheng commented 5 years ago

Fixed.