energywebfoundation / volta-system-tests

GNU General Public License v3.0

Signer freeze at block #668484 #1

Open mclrch opened 5 years ago

mclrch commented 5 years ago

After block #668484, all signer instances stopped sending data to telemetry. Restarting the container fixed it.

validator-00

2019-06-08T04:24:00.785611746Z Gas Used: 0
2019-06-08T04:24:00.785613809Z Peers: 13
2019-06-08T04:24:00.987309616Z Real Time Telemetry Block data sent to Ingress Block # 668484
2019-06-08T04:24:10.338068706Z Flushing 50 to ingress. 27 still in queue.
2019-06-08T04:24:20.338294015Z Flushing 27 to ingress. 0 still in queue.
2019-06-08T04:24:30.336518054Z Not flushing: 0 Queued - 9.9698306 seconds since flush
2019-06-08T04:24:40.336685983Z Flushing 50 to ingress. 81 still in queue.
2019-06-08T04:24:50.337487580Z Flushing 50 to ingress. 31 still in queue.
2019-06-08T04:25:00.338872027Z Flushing 31 to ingress. 0 still in queue.
2019-06-08T04:25:10.337834770Z Flushing 50 to ingress. 27 still in queue.
2019-06-08T04:25:20.336953884Z Flushing 27 to ingress. 0 still in queue.
2019-06-08T04:25:30.338274500Z Flushing 50 to ingress. 54 still in queue.

validator-01

signer_1 | 2019-06-08T04:24:00.420771353Z Gas Used: 0
signer_1 | 2019-06-08T04:24:00.420775121Z Peers: 13
signer_1 | 2019-06-08T04:24:00.775853779Z Real Time Telemetry Block data sent to Ingress Block # 668484
signer_1 | 2019-06-08T04:24:01.406991716Z Flushing 50 to ingress. 54 still in queue.
signer_1 | 2019-06-08T04:24:11.406948025Z Flushing 50 to ingress. 4 still in queue.

danzipie commented 5 years ago

That block differs from the others because it contains many more transactions than usual: https://explorer.energyweb.org/blocks/668485/transactions

elasticroentgen commented 5 years ago

Was Parity still alive and signing blocks?

mclrch commented 5 years ago

Yes.

elasticroentgen commented 5 years ago

Weird. The question is how we tackle this. For the prod chain the signer is obsolete, and I suppose you will switch Volta to the same setup in the future.

The log files show no exception or any other error.

danzipie commented 5 years ago

The signer is not obsolete for the Volta chain, and we need it to work even when blocks are not empty.

elasticroentgen commented 5 years ago

It was confirmed working even with non-empty blocks. I can see if it is reproducible with the limited amount of information. Are there any logs on the ingress that suggest anything?

danzipie commented 5 years ago

There was an unhandled nodecontrol exception approx. 2000 blocks before:

nodecontrol_1  | 
signer_1       | Tx Count: 0
nodecontrol_1  | Unhandled Exception: System.AggregateException: One or more errors occurred. (Connection refused) ---> System.Net.Http.HttpRequestException: Connection refused ---> System.Net.Sockets.SocketException: Connection refused
signer_1       | Gas Limit: 8000000
nodecontrol_1  |    at System.Net.Http.ConnectHelper.ConnectAsync(String host, Int32 port, CancellationToken cancellationToken)
signer_1       | Gas Used: 0
nodecontrol_1  |    --- End of inner exception stack trace ---
signer_1       | Peers: 34 
nodecontrol_1  |    at System.Net.Http.ConnectHelper.ConnectAsync(String host, Int32 port, CancellationToken cancellationToken)
signer_1       | Real Time Telemetry Block data sent to Ingress Block # 664726
nodecontrol_1  |    at System.Threading.Tasks.ValueTask`1.get_Result()
signer_1       | Not flushing: 6 Queued - 9.7702269 seconds since flush
nodecontrol_1  |    at System.Net.Http.HttpConnectionPool.CreateConnectionAsync(HttpRequestMessage request, CancellationToken cancellationToken)
signer_1       | New Block received
nodecontrol_1  |    at System.Threading.Tasks.ValueTask`1.get_Result()
signer_1       | block num: 664727
nodecontrol_1  |    at System.Net.Http.HttpConnectionPool.WaitForCreatedConnectionAsync(ValueTask`1 creationTask)
signer_1       | block hash: 0xb30bbf3a0d4f52b9e2e0a3ab4ff2d837cdf251835006570c6b79a2c7daaa3fa5
nodecontrol_1  |    at System.Threading.Tasks.ValueTask`1.get_Result()
signer_1       | block time stamp: 1559949055
nodecontrol_1  |    at System.Net.Http.HttpConnectionPool.SendWithRetryAsync(HttpRequestMessage request, Boolean doRequestAuth, CancellationToken cancellationToken)
signer_1       | Tx Count: 0
nodecontrol_1  |    at System.Net.Http.RedirectHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
signer_1       | Gas Limit: 8000000
signer_1       | Gas Used: 0
nodecontrol_1  |    at System.Net.Http.HttpClient.FinishSendAsyncBuffered(Task`1 sendTask, HttpRequestMessage request, CancellationTokenSource cts, Boolean disposeCts)
signer_1       | Peers: 34 
nodecontrol_1  |    --- End of inner exception stack trace ---
signer_1       | Real Time Telemetry Block data sent to Ingress Block # 664727
nodecontrol_1  |    at System.Threading.Tasks.Task`1.GetResultCore(Boolean waitCompletionNotification)
signer_1       | New Block received
nodecontrol_1  |    at src.Program.Main(String[] args) in /builds/ewf/project-equestria/nodecontrol/src/Program.cs:line 45
signer_1       | block num: 664728
nodecontrol_1  | EWF NodeControl
signer_1       | block hash: 0x15c2a444e9a94a309399103799742db4940e7312fb8c50ee2bdb083fd6c8ff65
nodecontrol_1  | 
signer_1       | block time stamp: 1559949060
nodecontrol_1  | Unhandled Exception: System.AggregateException: One or more errors occurred. (Connection refused) ---> System.Net.Http.HttpRequestException: Connection refused ---> System.Net.Sockets.SocketException: Connection refused
signer_1       | Tx Count: 0
nodecontrol_1  |    at System.Net.Http.ConnectHelper.ConnectAsync(String host, Int32 port, CancellationToken cancellationToken)
signer_1       | Gas Limit: 8000000
nodecontrol_1  |    --- End of inner exception stack trace ---
signer_1       | Gas Used: 0
nodecontrol_1  |    at System.Net.Http.ConnectHelper.ConnectAsync(String host, Int32 port, CancellationToken cancellationToken)
signer_1       | Peers: 34 
nodecontrol_1  |    at System.Threading.Tasks.ValueTask`1.get_Result()
signer_1       | Real Time Telemetry Block data sent to Ingress Block # 664728
nodecontrol_1  |    at System.Net.Http.HttpConnectionPool.CreateConnectionAsync(HttpRequestMessage request, CancellationToken cancellationToken)
signer_1       | Flushing 50 to ingress. 85 still in queue.
nodecontrol_1  |    at System.Threading.Tasks.ValueTask`1.get_Result()
signer_1       | New Block received
nodecontrol_1  |    at System.Net.Http.HttpConnectionPool.WaitForCreatedConnectionAsync(ValueTask`1 creationTask)
signer_1       | block num: 664729
nodecontrol_1  |    at System.Threading.Tasks.ValueTask`1.get_Result()
signer_1       | block hash: 0xade0a44d0aa00c60f985e644f4f6e1d24dd0dbf3a20b0f433a656e9478aaab48
nodecontrol_1  |    at System.Net.Http.HttpConnectionPool.SendWithRetryAsync(HttpRequestMessage request, Boolean doRequestAuth, CancellationToken cancellationToken)
signer_1       | block time stamp: 1559949065
nodecontrol_1  |    at System.Net.Http.RedirectHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
signer_1       | Tx Count: 0
nodecontrol_1  |    at System.Net.Http.HttpClient.FinishSendAsyncBuffered(Task`1 sendTask, HttpRequestMessage request, CancellationTokenSource cts, Boolean disposeCts)
signer_1       | Gas Limit: 7992189
nodecontrol_1  |    --- End of inner exception stack trace ---
signer_1       | Gas Used: 0
nodecontrol_1  |    at System.Threading.Tasks.Task`1.GetResultCore(Boolean waitCompletionNotification)
signer_1       | Peers: 34 
nodecontrol_1  |    at src.Program.Main(String[] args) in /builds/ewf/project-equestria/nodecontrol/src/Program.cs:line 45
signer_1       | Real Time Telemetry Block data sent to Ingress Block # 664729

elasticroentgen commented 5 years ago

So it seems nodecontrol can't connect to Parity anymore. I'll check the code for anything that doesn't free resources correctly.
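
The stack trace above shows the "Connection refused" error escaping `Program.Main` as an unhandled exception. One common way to make such a call tolerant of a temporarily unreachable endpoint is to retry with exponential backoff and only give up after a bounded number of attempts. The sketch below illustrates that pattern in Python for brevity; nodecontrol itself is a .NET program, and `fetch_with_retry`, its parameters, and the URL in the usage note are hypothetical, not taken from the nodecontrol code.

```python
import time
import urllib.request


def fetch_with_retry(url, attempts=5, base_delay=1.0,
                     opener=urllib.request.urlopen, sleep=time.sleep):
    """GET `url`, retrying on connection-level errors with backoff.

    `opener` and `sleep` are injectable so the retry logic can be
    exercised without a network. Delays are base_delay * 2**attempt.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            with opener(url, timeout=5) as resp:
                return resp.read()
        except OSError as exc:
            # URLError, ConnectionRefusedError and socket timeouts are
            # all OSError subclasses, so this covers "Connection refused".
            last_error = exc
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    raise RuntimeError(f"giving up after {attempts} attempts") from last_error
```

A caller would use it as `fetch_with_retry("http://localhost:8545")`, where the address is only a placeholder. Whether retrying is the right fix here depends on why the connection was refused in the first place, which this thread has not established.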