oliverw / miningcore

Miningcore is a high-performance Mining Pool Software for Linux and Windows.
https://store.miningcore.pro
MIT License
725 stars 666 forks source link

Many "Last subscriber disconnected from receiver stream" messages #81

Closed rterbush closed 6 years ago

rterbush commented 6 years ago

Messages from EWBF indicating "Lost connection with server" at diff levels below 1 when applying about 5k M/h. After diff adjusts, the messages are gone or less frequent.

oliverw commented 6 years ago

This sounds like the pool is dropping connections as share processing load increases and stops dropping connections as load normalizes due to VarDiff decreasing the rate of share submissions. Would you agree?

oliverw commented 6 years ago

@rterbush By the way just just saw your email digging periodically through my Spam folder. Sorry about missing it and welcome aboard. Made you a contributor.

rterbush commented 6 years ago

@oliverw Greetings sir I've opened this first to get some feedback in case you, or someone is seeing this behavior. It's entirely possible that this server is underpowered although I am not seeing any resource issues. Also possible that my firewalling might be introducing some issue. Appears you are caring a pretty good load on poolmining.org so may be my environment.

Thanks for the commit perms. Would like to have an interactive chat at some point to understand your goals before causing any disruption. :-)

rterbush commented 6 years ago

@oliverw I continue to see these disconnects running 10kH/s on a private ZEC pool. I've increased server resources so that issue is ruled out. I see a number of reports in your support forum of this same issue. Pretty certain I am not having an DDoS issues, so must be something else going on.

I'll put a sniffer on the output and see if I can see more. Logs just showing that the miner disconnected.

oliverw commented 6 years ago

0a26d7525da87ec51807b0545ac2217642194005 included a possible fix for that among other changes.

rterbush commented 6 years ago

Will give this a try after next failure.

Just an observation, it almost appears that the vardiff is going high and the miners get a difficult share which delays their response and perhaps the pool is assuming they are gone.

I do notice that diff is always changed in increments of "maxDelta". Never less than that value.

rterbush commented 6 years ago

Two workers running against the pool.

Here is what I see in log:

[2017-11-21 15:20:58.9294] [D] [zec1] [Bitcoin Job Manager] No new blocks for 55 seconds - updating transactions & rebroadcasting work 
[2017-11-21 15:21:09.2642] [I] [zec1] [Pool] [0HL9GBM447331] VarDiff update to 8.5 
[2017-11-21 15:21:10.3179] [I] [zec1] [Pool] [0HL9GBM447330] Share accepted: D=12.5 
[2017-11-21 15:21:38.0457] [I] [zec1] [Pool] [0HL9GBM447330] Share accepted: D=12.5 
[2017-11-21 15:21:41.6449] [D] [zec1] [Pool] Persisting pool stats 
[2017-11-21 15:21:47.1067] [I] [zec1] [Pool] [0HL9GBM447331] Share accepted: D=8.5 
[2017-11-21 15:21:53.9813] [D] [zec1] [Bitcoin Job Manager] No new blocks for 55 seconds - updating transactions & rebroadcasting work 
[2017-11-21 15:21:57.2100] [I] [zec1] [Pool] [0HL9GBM447331] Share accepted: D=8.5 
[2017-11-21 15:22:02.0913] [I] [zec1] [Bitcoin Job Manager] New block 223503 detected 
[2017-11-21 15:22:08.6216] [I] [zec1] [Pool] [0HL9GBM447330] Share accepted: D=12.5 
[2017-11-21 15:22:09.6868] [I] [zec1] [Pool] [0HL9GBM447331] Share accepted: D=8.5 
[2017-11-21 15:22:22.6158] [I] [zec1] [Pool] [0HL9GBM447330] VarDiff update to 11.5 
[2017-11-21 15:22:23.5208] [I] [zec1] [Pool] [0HL9GBM447330] Share accepted: D=11.5 
[2017-11-21 15:22:27.2517] [I] [zec1] [Pool] [0HL9GBM447331] Share accepted: D=8.5 
[2017-11-21 15:22:32.6491] [I] [zec1] [Pool] [0HL9GBM447330] Share accepted: D=11.5 
[2017-11-21 15:22:32.6892] [D] [StratumClient`1] [0HL9GBM447330] Last subscriber disconnected from receiver stream 
[2017-11-21 15:22:34.3159] [I] [zec1] [Pool] [0HL9GBM447331] Share accepted: D=8.5 
[2017-11-21 15:22:37.5021] [I] [zec1] [Pool] [0HL9GBM447332] = t1VaVsr8cGM5iP6fD6Qjr2mxcfZDTvjHTaM.tincup = 71.229.215.18 
[2017-11-21 15:22:37.5519] [D] [zec1] [Pool] [0HL9GBM447332] Unsupported RPC request: {"method":"mining.extranonce.subscribe","params":[],"id":3} 
oliverw commented 6 years ago

I'm pretty sure that you'd observe changes < maxDelta if you'd raise maxDelta.

rterbush commented 6 years ago

maxDelta currently set to 1. So no fractional changes?

Looking at the tcpdump for this interaction around the disconnect, there is nothing there that is not reflected in the log. The client suddenly reinitiates a new connection.

oliverw commented 6 years ago

The change I've committed today prevents the server from silently dropping the connection in case of an exception being thrown while processing a stratum request which would the client force to reconnect. Not sure if it's related to the problem you've mentioned.

oliverw commented 6 years ago

@rterbush By the way. Is that zcash pool a production pool?

rterbush commented 6 years ago

I've just installed with your latest changes. Will report and close this if it appears to be resolved.

No, this is not a production pool. Still working through the challenges of getting there... :-)

oliverw commented 6 years ago

Ah ok. Keep me posted. Glad to help 😄

rterbush commented 6 years ago

Had another disconnect not quite 1 hour in.

A bit more info here. Disconnected worker is 0HL9GV3INOLEN. Pool stats persist running just before...

[2017-11-21 16:48:49.8032] [D] [zec1] [Pool] Persisting pool stats
[2017-11-21 16:48:50.0610] [I] [zec1] [Pool] [0HL9GV3INOLEO] Share accepted: D=9.505 
[2017-11-21 16:49:01.9328] [I] [zec1] [Pool] [0HL9GV3INOLEN] Share accepted: D=10.5 
[2017-11-21 16:49:02.3917] [I] [zec1] [Pool] [0HL9GV3INOLEN] Share accepted: D=10.5 
[2017-11-21 16:49:02.4314] [D] [StratumClient`1] [0HL9GV3INOLEN] Last subscriber disconnected from receiver stream 
[2017-11-21 16:49:12.6522] [D] [zec1] [Bitcoin Job Manager] No new blocks for 55 seconds - updating transactions & rebroadcasting work 
rterbush commented 6 years ago

@oliverw just browsing through the code that is outputing the "last subscriber disconnected" message. One observation (and caveat that I am still getting my head around the code...)

This code is always going to be hit, therefor the else at line 276 will never happen because prevIndex is set to index + 1 at line 258.

https://github.com/coinfoundry/miningcore/blob/0fc54735a77a6a7c72d803d2cd3c4e496a6f94e2/src/MiningCore/Stratum/StratumClient.cs#L264

oliverw commented 6 years ago

@rterbush Not if the received data is a line fragment that does not contain a newline.

rterbush commented 6 years ago

@oliverw while this seems to have become less frequent, I have still had 32 disconnect events over past 24 hours.

I'll see about adding some more debug in that code to see if we can get a better idea where this is happening.

oliverw commented 6 years ago

Thats inacceptable. Need to get to the bottom of this.

rterbush commented 6 years ago

@oliverw regarding your comment on that bit of code above....

If there is a newline, we will never hit that }else{ at 276 because prevIndex will never be 0, so we never set keepLease=true.

Or am I missing something?

rterbush commented 6 years ago

I've changed the title of this issue since I have confirmed that this is not related to number of connection requests. I've also turned on trace level logging to see if I can get a look at the transaction content when this happens.

rterbush commented 6 years ago

Here is packet capture from disconnect event. @oliverw let me know if you would prefer that I move this log output to gist... I've not really analyzed this yet to see if there is a clue...

[2017-11-22 16:17:40.3577] [I] [zec1] [Pool] [0HL9HNN88SLQO] Share accepted: D=7.5
[2017-11-22 16:17:53.8896] [T] [StratumClient`1] [0HL9HNN88SLQO] recv: {"id":29830,"method":"mining.submit","params":["t1VaVsr8cGM5iP6fD6Qjr2mxcfZDTvjHTaM.victor","61","04a3155a","f06fe80700000000000000000000000000000000000000000000000004","fd400500327a2e58c67720612d31ed9442a8712d71f3b50e18f72dd7ca09709d822565ca67e8ffe31df1f97c3a07638d1b8303c02473a2c26bac221b71005f7904ec2a13a6e2116cbd0b76d7e8af1949948b17473c3cf60f88e91400654c199ae9060165ec99319492b6b3cb217dead0c4ecdb03ae159574d05991128488dbd5d3240cf5656bd3b4e30d0b72a5bdca1a213c297cab7c4f368e60d42d150b7381d86c3ffce816f897193ee100f78774b02a2191a172318e7c38acc2bb539b27641a86992537da56e2d82652017468d0edfde77065d613ea94997c96f793e8f37914e3d8720ef66cde380227e3d393ac3b5a1dedd792b7767b68b0c43a9945de11df2e0192e8b28bd97b91465e0be2896f44917a50216981aba40a13d76c8ba3b8eb308342f90bfde430182e5e961a9557b93933556cb8dd11a577c273fd28325533c14374f515f311d476ad70beefc2f5dff451010e9927a830d9b1ea6653c06c6d73ee0769f2c8b72a4c96ad45ef432791ec880db54523131fd2fbea7a1809dd6a39684997e05d3274a8ae13d914db76bfb72de9cfbbeed0ed8fbc5814261ccb09994bd4d14817079d4c43daedf6059fbe1090af2eddce66d41a843c4d64531c4ed9c0c53b9388766a795b930d849b76d836c5de63f64f190ea91335cddacdb401a38e321a5867d1ee55d72b71476c842d6e047af5a7a25ffd299a03e9db171795f43d27832102596ac9f65adbf30a4f05984a3deb15cc5eb346f332c9a3259da9e93e35893db3dd0b8d762aa7c8c37ab9246f49ff3b071ea3d35fbbf7cc68b01607b4ba66b9ba6a3b55c8d051ae5b0a84c07cdb446f034c3175ac5661ad023ac2b2e038258b463ad4e7323b5254fd015de9df9763853b478015429fde2c53
[2017-11-22 16:17:53.8954] [T] [StratumClient`1] [0HL9HNN88SLQO] recv: 6b637c3c77e8a55b319ec8745cfe9245877624d8951873f96da9636ae444369aae591fe401887c230451e23b2ecda0440caed4f88ab904c748260a851a2dd282fb05c692eea577bee53f64b17fe70483740cc1051312c41d01093c63e6f907f66aaf1209bda287fb2fdb87f88e56a611f9b1beb054d66ca90d1cf517c05c51ebdc71d1efec359a2b2106795cfc1fe50d4e8c4d11e1306dc658ffdfc481f7b332189f21a62e4b4c6df3ff7d57725614d702b1c79b2e54cb26ffce17905f676afd18d39b8378c9add13678aefc096451c21890f703e8a3d5a96454aab6a3291b404f364055ec3d22b09720b4a694f7bf8d65e1c4b6c88c19743d9230e2eba5891204456bcfbdade832f850192f898c3fb3f61587c4d4f3633df332c9dbfd37eeb6149bd2bc796464238861f1c05feb8fe7289e79df2c2425917b3d57f3212294337f1848bf79548cf04b9d180322ab4c5f7e1326f18811607f6bd6cbe4fe71ff19807a5e490f63b2b7701213a153a8d266adde148f0207e70410b12543f048d4c947eb7e2f5becdc2a162a25f19d0a4e28b5438ca390c72c5d72bb94d8b32c3d69eacb1bfa44e3e61ba75cabbdfd2eafc6f9ff8e6c78f3c506fc6d83f96f68a344f41bd6753fbe1ee90316e7c2358b29b980c057681cf3f802d0be3d266724bc0e96fe12529cf262b54117b966e57966f5862106d8d9bee6343ae5bf3842154ad1ce14cdbf19e1f628fe2eb5751907d539a70361bc584cfba523ded4240343e8b9d083c2d458260089ee313ace2e2db5389a0ef1f78272c3c1a03ed5e26f86ad7f21bd7c11ac3a090b6f5a58f5d17bd49c516278d29b1059d03cf1201d7723278c78e825d2df82bd763e4b36407156b4aa07a2a76e14eaac1f790279eea8e662aa9c167698cb10f4846ec40d33df20bfb5ce1677ce9d8b558dac2821352d11968a195518f0f3471b358705ba079ca95b4ed64f81c1690a19ed29773f94c9321e802854d397"]}
[2017-11-22 16:17:53.8957] [T] [zec1] [Pool] [0HL9HNN88SLQO] Received request data: {"id":29830,"method":"mining.submit","params":["t1VaVsr8cGM5iP6fD6Qjr2mxcfZDTvjHTaM.victor","61","04a3155a","f06fe80700000000000000000000000000000000000000000000000004","fd400500327a2e58c67720612d31ed9442a8712d71f3b50e18f72dd7ca09709d822565ca67e8ffe31df1f97c3a07638d1b8303c02473a2c26bac221b71005f7904ec2a13a6e2116cbd0b76d7e8af1949948b17473c3cf60f88e91400654c199ae9060165ec99319492b6b3cb217dead0c4ecdb03ae159574d05991128488dbd5d3240cf5656bd3b4e30d0b72a5bdca1a213c297cab7c4f368e60d42d150b7381d86c3ffce816f897193ee100f78774b02a2191a172318e7c38acc2bb539b27641a86992537da56e2d82652017468d0edfde77065d613ea94997c96f793e8f37914e3d8720ef66cde380227e3d393ac3b5a1dedd792b7767b68b0c43a9945de11df2e0192e8b28bd97b91465e0be2896f44917a50216981aba40a13d76c8ba3b8eb308342f90bfde430182e5e961a9557b93933556cb8dd11a577c273fd28325533c14374f515f311d476ad70beefc2f5dff451010e9927a830d9b1ea6653c06c6d73ee0769f2c8b72a4c96ad45ef432791ec880db54523131fd2fbea7a1809dd6a39684997e05d3274a8ae13d914db76bfb72de9cfbbeed0ed8fbc5814261ccb09994bd4d14817079d4c43daedf6059fbe1090af2eddce66d41a843c4d64531c4ed9c0c53b9388766a795b930d849b76d836c5de63f64f190ea91335cddacdb401a38e321a5867d1ee55d72b71476c842d6e047af5a7a25ffd299a03e9db171795f43d27832102596ac9f65adbf30a4f05984a3deb15cc5eb346f332c9a3259da9e93e35893db3dd0b8d762aa7c8c37ab9246f49ff3b071ea3d35fbbf7cc68b01607b4ba66b9ba6a3b55c8d051ae5b0a84c07cdb446f034c3175ac5661ad023ac2b2e038258b463ad4e7323b5254fd015de9df9763853b478015429fde2c536b637c3c77e8a55b319ec8745cfe9245877624d8951873f96da9636ae444369aae591fe401887c230451e23b2ecda0440caed4f88ab904c748260a851a2dd282fb05c692eea577bee53f64b17fe70483740cc1051312c41d01093c63e6f907f66aaf1209bda287fb2fdb87f88e56a611f9b1beb054d66ca90d1cf517c05c51ebdc71d1efec359a2b2106795cfc1fe50d4e8c4d11e1306dc658ffdfc481f7b332189f21a62e4b4c6df3ff7d57725614d702b1c79b2e54cb26ffce17905f676afd18d39b8378c9add13678aefc096451c21890f703e8a3d5a96454aab6a3291b404f364055ec3d22b09720b4a694f7bf8d65e1c4b6c88c19743d9230e2eba5891204456bcfbdade832f850192f898c3fb3f61587c4d4f3633df332c9dbfd37eeb6149bd2bc796464238861f1c05feb8fe7289e79df2c2425917b3d57f3212294337f1848bf79548cf04b9d180322ab4c5f7e1326f18811607f6bd6cbe4fe71ff19807a5e490f63b2b7701213a153a8d266adde148f0207e70410b12543f048d4c947eb7e2f5becdc2a162a25f19d0a4e28b5438ca390c72c5d72bb94d8b32c3d69eacb1bfa44e3e61ba75cabbdfd2eafc6f9ff8e6c78f3c506fc6d83f96f68a344f41bd6753fbe1ee90316e7c2358b29b980c057681cf3f802d0be3d266724bc0e96fe12529cf262b54117b966e57966f5862106d8d9bee6343ae5bf3842154ad1ce14cdbf19e1f628fe2eb5751907d539a70361bc584cfba523ded4240343e8b9d083c2d458260089ee313ace2e2db5389a0ef1f78272c3c1a03ed5e26f86ad7f21bd7c11ac3a090b6f5a58f5d17bd49c516278d29b1059d03cf1201d7723278c78e825d2df82bd763e4b36407156b4aa07a2a76e14eaac1f790279eea8e662aa9c167698cb10f4846ec40d33df20bfb5ce1677ce9d8b558dac2821352d11968a195518f0f3471b358705ba079ca95b4ed64f81c1690a19ed29773f94c9321e802854d397"]}
[2017-11-22 16:17:53.8957] [T] [zec1] [Pool] [0HL9HNN88SLQO] Dispatching request mining.submit [29830]
[2017-11-22 16:17:53.9279] [T] [StratumClient`1] [0HL9HNN88SLQO] Sending: {"result":true,"error":null,"id":29830}

[2017-11-22 16:17:53.9279] [I] [zec1] [Pool] [0HL9HNN88SLQO] Share accepted: D=7.5
[2017-11-22 16:17:53.9478] [T] [StratumClient`1] [0HL9HNN88SLQO] recv: 865fe68e6bea9df86b9765f7a2aff7d727a6d3deafa00017d6a343cb0869e1c87052ae74b668b875cb597d04f2c0c32c8c1a7cb1bc835f1df1602aa14c7513791d4b24a75271cc8da952361d57e9065690933bcc0339f2dd566ee74409c477b6e31e62736201847246a9026b0b898c6000a1816873336f68fd2abb76f9093a1a9f7f4c5d18c93d0d755558f33cc9f1d57add892e0a60fde3f70b642acb588638da4cb8cb3a07ba91e92049cf209949e7c4c8c944db8de1260d67b15f74070191ef7b14456d35d30b83425e6e77b1999a30bbec09c69bb4f192acd37bf5d16875263ea0fc0df73f42106fc7dec80dfff8af33b237b569733217e0b78d2238ff87b447e43ce938eee661706a4a31f48b0fc8ee04cf5c0ea15cada1db9f320c0f3cde51ce2ece7a2018003ca2bf88f477989bc21b353d1d995b3f348af015887ded63b199d9b31c11b2ac356c9967ecfbdccf2c3cee8afb51160f5091b5618cbcc789de7d9826df010ad512ffeaad95cb182530f6b9fe5a8a08b7640c26f6bc74be0ea83f55c346d23cd4fcce9f3f9cc65b0de8eaeb8922d36df0350bc1327e094ef13b1ec7162a9335fcf618aabd40f083d921fc8c6d6562943caa061ca1c3e309e098a980667be7bb139e7d6376da1a0b1b663144d594d8fdc783536a6730f5399d732a790b48d5b12a0dbdc9cc2540c65c660c2531d21666bd1fac86ba3c97dfa6cf4ca2cf6646f33b01697a417a0441dbb611dda41cf3b730fc567f592501989da23a0535690b02d62040e457a63e6c5915525191f4871511c5cc7214b5b3fdf258718b4aa9e5e4a28dbe5be21c7f33e08fdbea5f483b81dba390dc58e5599f2acc108481c401280fedf858e52a80d7205dfc3879053f384f5f7249950f337359d652adf77e6e6d31790e041538cba18aca9735752556e40ac3c51a0f7954db161b7ef3fb72c6fca66dded62558c2d01f0204dee699"]}

[2017-11-22 16:17:53.9478] [T] [zec1] [Pool] [0HL9HNN88SLQO] Received request data: {"id":29831,"method":"mining.submit","params":["t1VaVsr8865fe68e6bea9df86b9765f7a2aff7d727a6d3deafa00017d6a343cb0869e1c87052ae74b668b875cb597d04f2c0c32c8c1a7cb1bc835f1df1602aa14c7513791d4b24a75271cc8da952361d57e9065690933bcc0339f2dd566ee74409c477b6e31e62736201847246a9026b0b898c6000a1816873336f68fd2abb76f9093a1a9f7f4c5d18c93d0d755558f33cc9f1d57add892e0a60fde3f70b642acb588638da4cb8cb3a07ba91e92049cf209949e7c4c8c944db8de1260d67b15f74070191ef7b14456d35d30b83425e6e77b1999a30bbec09c69bb4f192acd37bf5d16875263ea0fc0df73f42106fc7dec80dfff8af33b237b569733217e0b78d2238ff87b447e43ce938eee661706a4a31f48b0fc8ee04cf5c0ea15cada1db9f320c0f3cde51ce2ece7a2018003ca2bf88f477989bc21b353d1d995b3f348af015887ded63b199d9b31c11b2ac356c9967ecfbdccf2c3cee8afb51160f5091b5618cbcc789de7d9826df010ad512ffeaad95cb182530f6b9fe5a8a08b7640c26f6bc74be0ea83f55c346d23cd4fcce9f3f9cc65b0de8eaeb8922d36df0350bc1327e094ef13b1ec7162a9335fcf618aabd40f083d921fc8c6d6562943caa061ca1c3e309e098a980667be7bb139e7d6376da1a0b1b663144d594d8fdc783536a6730f5399d732a790b48d5b12a0dbdc9cc2540c65c660c2531d21666bd1fac86ba3c97dfa6cf4ca2cf6646f33b01697a417a0441dbb611dda41cf3b730fc567f592501989da23a0535690b02d62040e457a63e6c5915525191f4871511c5cc7214b5b3fdf258718b4aa9e5e4a28dbe5be21c7f33e08fdbea5f483b81dba390dc58e5599f2acc108481c401280fedf858e52a80d7205dfc3879053f384f5f7249950f337359d652adf77e6e6d31790e041538cba18aca9735752556e40ac3c51a0f7954db161b7ef3fb72c6fca66dded62558c2d01f0204dee699"]}
[2017-11-22 16:17:53.9478] [T] [zec1] [Pool] [0HL9HNN88SLQO] Dispatching request mining.submit [29831]
[2017-11-22 16:17:53.9478] [D] [StratumClient`1] [0HL9HNN88SLQO] Last subscriber disconnected from receiver stream
[2017-11-22 16:17:59.3099] [T] [StratumClient`1] [0HL9HNN88SLQO] recv: {"id":29832,"method":"mining.submit","params":["t1VaVsr8cGM5iP6fD6Qjr2mxcfZDTvjHTaM.victor","61","04a3155a","5678e80700000000000000000000000000000000000000000000000004","fd4005006ca90cf6ddbabb25c050bace785b598b86f79bfd02c793bd225692b97f84d1104967c56598c2b8b15d16d3d546ac4a9eb2afdbf17f52c4fbece45410041653180ff99e308d8dd3ee88578344527ac11c1d4f4b00a31cdbc6468dcfe469645be2e8c6a12a581504870d01b23e690e7da58680077a846074c9e34d51b7630423d1400fb5551bf5da704cdb7ccfca08cdf73e911479b79fc98bf167746c44746e42200e455874e8e103236f4636fef8cdf92c43125b21034ea233b529661e9e0d3512d3ec4732bf04db99edc701b2e9fdf1fd2936e99cf5dd4b81fb0265b0aafa2ba2ad031a062a3af4f77d4098e64ed72db66583f9a395b226739f4203897945791283c56e45c0533b43b64c72ae8872c522d7c2e047d33508af1f38aa0d4652d64247b57624119536133745548c6acea53f4aaaf92606ebfc16bc73003431816ef6f99501087860f0bfa28c317df2700533c9dda74191ca2275b3c7236ecad5931ff0b2281af1797c381a1eefed60754151af454d567735aecc0626f44ecfc804ee8b103129e7ea74d4757751a71f2be0a43f084fa8d69c6c844b00d2a95969f5cee51e0a9dcc55d127feddb3f8e690fbf046aae3e4788737130c844a58cd955b4b22929615e6b981e9fadc0dfb10764b6d38a6902f790171affba88a2291ab38db1a33e7cd33bcd0b43f3e88881da7d2d722b9dabfe4610d5ff53c8a16854bf70202b1d1b49110f4bf8c03892894849b66dbb8772d6f3486ea398eee59dfde90b62716d143af29dc5d6cac068cacf85d3dbc8ff0e80b80a29713ca25be2f84946a623f7c8e573aab7e56e0222d87d4eaa1d7edbdde82a90e9645f9fe46f1c0072b0881ec9094ad15a3067494406c525d46f531f0a723b966f97c4d3f26aeb4b85a7b5c36125c56d9892832c805d703177165a7269373fac6129a397455b72f014b5116b1757db3ec36e6bfbb5042c5c70c1c1c931bc16d4a308beaf49ed278090cc5fb7e54b89f78ce0aa692e51f8f01fdcb35b31889536ba6022a52ed603960b5620ba0c42feadec4698c37263203b1113ee20a1976e6b2b95bedcaeaf64884d04045e6023c9aa6112c248dd0748345ca5ac409787ad62b4fc81acb6e0db3a5fef6476014b10a53e939c341b546af3d877ba6d56da38e2e82e997d16aae25e2cca785c13cf3ac0c1342854e457218c6338366f4b1c6c63c16f40a4a20ed4a2666cf80cf2f74575a9553c326c838fc7b2a2e4746acc3e4bdf7cdeec378c07c1309f7cdd39bce72dadc1d6fec1f2df3aa5db81ef9208b952dbceb950d36cae58aa6e4c94b681411686804de877cf65e671986ed30db5c2b1166a3c30eaf3c64de21d2d67c210f5391e17e447460cbde022725c181f962d8ae41e6137d2fe9456856c9ca2072920ce2ae2ec7f61e34f003b5b3539754bf534e24a631d5358213710ffcb9f10631acf78dc8a33bc5d082c600767175ca056e7e08246fb4b86b9900e54741a5dadd4306a346085f16273bc16a336717a158ea1c9473642a1ddd67612f75310f0066ca50512575f54349f35e767af30023d9e90923ed14b2fa557b33e07422d4c4a700fccbf4cb93bb109630eedccd72f542d8d420edea9f3f26373d9cd0264ee4ce78d1b624b9ed9642feec2195e59cfd11440750fccd19cdc2f2ffede6cf9437363fad103e89c214e0acf415cfa85bfef5b194086f437cc9e2b8716e0aa69aa45e83969c391db8992ed9b7cb81afdfb26d169fea75e5dda96fdbbc146c3e4bb9ef0bf058d5170b5c5e341ba2d813c0fab0ee105a2db8b0a2b4caad2b134bf07cd3541caafff3968270fd3dfbec11ecd30f57fdf2b1482f0bb981f1873cf412b4d08bc778b61dca4f3de18b253b18ab42349efd6ac66ff09ef105"]}

[2017-11-22 16:18:03.3586] [T] [zec1] [Pool] Accepting connection [0HL9HNN88SLQR] from xx.xx.xx.xx:44956
[2017-11-22 16:18:03.3633] [T] [StratumClient`1] [0HL9HNN88SLQR] recv: {"id":1,"method":"mining.subscribe","params":["EWBF 0.3.4b",null,"pool.redacted.com","3072"]}

[2017-11-22 16:18:03.3634] [T] [zec1] [Pool] [0HL9HNN88SLQR] Received request data: {"id":1,"method":"mining.subscribe","params":["EWBF 0.3.4b",null,"pool.redacted.com","3072"]}
[2017-11-22 16:18:03.3634] [T] [zec1] [Pool] [0HL9HNN88SLQR] Dispatching request mining.subscribe [1]
[2017-11-22 16:18:03.3634] [T] [StratumClient`1] [0HL9HNN88SLQR] Sending: {"result":["0HL9HNN88SLQR","000004"],"error":null,"id":1}
rterbush commented 6 years ago
[2017-11-22 16:17:53.8896] [T] [StratumClient`1] [0HL9HNN88SLQO] recv: {"id":29830,"method":"mining.submit","params":["t1VaVsr8cGM5iP6fD6Qjr2mxcfZDTvjHTaM.victor","61","04a3155a","f06fe80700000000000000000000000000000000000000000000000004","fd400500327a2e58c67720612d31ed9442a8712d71f3b50e18f72dd7ca09709d822565ca67e8ffe31df1f97c3a07638d1b8303c02473a2c26bac221b71005f7904ec2a13a6e2116cbd0b76d7e8af1949948b17473c3cf60f88e91400654c199ae9060165ec99319492b6b3cb217dead0c4ecdb03ae159574d05991128488dbd5d3240cf5656bd3b4e30d0b72a5bdca1a213c297cab7c4f368e60d42d150b7381d86c3ffce816f897193ee100f78774b02a2191a172318e7c38acc2bb539b27641a86992537da56e2d82652017468d0edfde77065d613ea94997c96f793e8f37914e3d8720ef66cde380227e3d393ac3b5a1dedd792b7767b68b0c43a9945de11df2e0192e8b28bd97b91465e0be2896f44917a50216981aba40a13d76c8ba3b8eb308342f90bfde430182e5e961a9557b93933556cb8dd11a577c273fd28325533c14374f515f311d476ad70beefc2f5dff451010e9927a830d9b1ea6653c06c6d73ee0769f2c8b72a4c96ad45ef432791ec880db54523131fd2fbea7a1809dd6a39684997e05d3274a8ae13d914db76bfb72de9cfbbeed0ed8fbc5814261ccb09994bd4d14817079d4c43daedf6059fbe1090af2eddce66d41a843c4d64531c4ed9c0c53b9388766a795b930d849b76d836c5de63f64f190ea91335cddacdb401a38e321a5867d1ee55d72b71476c842d6e047af5a7a25ffd299a03e9db171795f43d27832102596ac9f65adbf30a4f05984a3deb15cc5eb346f332c9a3259da9e93e35893db3dd0b8d762aa7c8c37ab9246f49ff3b071ea3d35fbbf7cc68b01607b4ba66b9ba6a3b55c8d051ae5b0a84c07cdb446f034c3175ac5661ad023ac2b2e038258b463ad4e7323b5254fd015de9df9763853b478015429fde2c53
[2017-11-22 16:17:53.8954] [T] [StratumClient`1] [0HL9HNN88SLQO] recv: 6b637c3c77e8a55b319ec8745cfe9245877624d8951873f96da9636ae444369aae591fe401887c230451e23b2ecda0440caed4f88ab904c748260a851a2dd282fb05c692eea577bee53f64b17fe70483740cc1051312c41d01093c63e6f907f66aaf1209bda287fb2fdb87f88e56a611f9b1beb054d66ca90d1cf517c05c51ebdc71d1efec359a2b2106795cfc1fe50d4e8c4d11e1306dc658ffdfc481f7b332189f21a62e4b4c6df3ff7d57725614d702b1c79b2e54cb26ffce17905f676afd18d39b8378c9add13678aefc096451c21890f703e8a3d5a96454aab6a3291b404f364055ec3d22b09720b4a694f7bf8d65e1c4b6c88c19743d9230e2eba5891204456bcfbdade832f850192f898c3fb3f61587c4d4f3633df332c9dbfd37eeb6149bd2bc796464238861f1c05feb8fe7289e79df2c2425917b3d57f3212294337f1848bf79548cf04b9d180322ab4c5f7e1326f18811607f6bd6cbe4fe71ff19807a5e490f63b2b7701213a153a8d266adde148f0207e70410b12543f048d4c947eb7e2f5becdc2a162a25f19d0a4e28b5438ca390c72c5d72bb94d8b32c3d69eacb1bfa44e3e61ba75cabbdfd2eafc6f9ff8e6c78f3c506fc6d83f96f68a344f41bd6753fbe1ee90316e7c2358b29b980c057681cf3f802d0be3d266724bc0e96fe12529cf262b54117b966e57966f5862106d8d9bee6343ae5bf3842154ad1ce14cdbf19e1f628fe2eb5751907d539a70361bc584cfba523ded4240343e8b9d083c2d458260089ee313ace2e2db5389a0ef1f78272c3c1a03ed5e26f86ad7f21bd7c11ac3a090b6f5a58f5d17bd49c516278d29b1059d03cf1201d7723278c78e825d2df82bd763e4b36407156b4aa07a2a76e14eaac1f790279eea8e662aa9c167698cb10f4846ec40d33df20bfb5ce1677ce9d8b558dac2821352d11968a195518f0f3471b358705ba079ca95b4ed64f81c1690a19ed29773f94c9321e802854d397"]}
[2017-11-22 16:17:53.9478] [T] [StratumClient`1] [0HL9HNN88SLQO] recv: 865fe68e6bea9df86b9765f7a2aff7d727a6d3deafa00017d6a343cb0869e1c87052ae74b668b875cb597d04f2c0c32c8c1a7cb1bc835f1df1602aa14c7513791d4b24a75271cc8da952361d57e9065690933bcc0339f2dd566ee74409c477b6e31e62736201847246a9026b0b898c6000a1816873336f68fd2abb76f9093a1a9f7f4c5d18c93d0d755558f33cc9f1d57add892e0a60fde3f70b642acb588638da4cb8cb3a07ba91e92049cf209949e7c4c8c944db8de1260d67b15f74070191ef7b14456d35d30b83425e6e77b1999a30bbec09c69bb4f192acd37bf5d16875263ea0fc0df73f42106fc7dec80dfff8af33b237b569733217e0b78d2238ff87b447e43ce938eee661706a4a31f48b0fc8ee04cf5c0ea15cada1db9f320c0f3cde51ce2ece7a2018003ca2bf88f477989bc21b353d1d995b3f348af015887ded63b199d9b31c11b2ac356c9967ecfbdccf2c3cee8afb51160f5091b5618cbcc789de7d9826df010ad512ffeaad95cb182530f6b9fe5a8a08b7640c26f6bc74be0ea83f55c346d23cd4fcce9f3f9cc65b0de8eaeb8922d36df0350bc1327e094ef13b1ec7162a9335fcf618aabd40f083d921fc8c6d6562943caa061ca1c3e309e098a980667be7bb139e7d6376da1a0b1b663144d594d8fdc783536a6730f5399d732a790b48d5b12a0dbdc9cc2540c65c660c2531d21666bd1fac86ba3c97dfa6cf4ca2cf6646f33b01697a417a0441dbb611dda41cf3b730fc567f592501989da23a0535690b02d62040e457a63e6c5915525191f4871511c5cc7214b5b3fdf258718b4aa9e5e4a28dbe5be21c7f33e08fdbea5f483b81dba390dc58e5599f2acc108481c401280fedf858e52a80d7205dfc3879053f384f5f7249950f337359d652adf77e6e6d31790e041538cba18aca9735752556e40ac3c51a0f7954db161b7ef3fb72c6fca66dded62558c2d01f0204dee699"]}

[2017-11-22 16:17:53.9478] [T] [zec1] [Pool] [0HL9HNN88SLQO] Received request data: {"id":29831,"method":"mining.submit","params":["t1VaVsr8865fe68e6bea9df86b9765f7a2aff7d727a6d3deafa00017d6a343cb0869e1c87052ae74b668b875cb597d04f2c0c32c8c1a7cb1bc835f1df1602aa14c7513791d4b24a75271cc8da952361d57e9065690933bcc0339f2dd566ee74409c477b6e31e62736201847246a9026b0b898c6000a1816873336f68fd2abb76f9093a1a9f7f4c5d18c93d0d755558f33cc9f1d57add892e0a60fde3f70b642acb588638da4cb8cb3a07ba91e92049cf209949e7c4c8c944db8de1260d67b15f74070191ef7b14456d35d30b83425e6e77b1999a30bbec09c69bb4f192acd37bf5d16875263ea0fc0df73f42106fc7dec80dfff8af33b237b569733217e0b78d2238ff87b447e43ce938eee661706a4a31f48b0fc8ee04cf5c0ea15cada1db9f320c0f3cde51ce2ece7a2018003ca2bf88f477989bc21b353d1d995b3f348af015887ded63b199d9b31c11b2ac356c9967ecfbdccf2c3cee8afb51160f5091b5618cbcc789de7d9826df010ad512ffeaad95cb182530f6b9fe5a8a08b7640c26f6bc74be0ea83f55c346d23cd4fcce9f3f9cc65b0de8eaeb8922d36df0350bc1327e094ef13b1ec7162a9335fcf618aabd40f083d921fc8c6d6562943caa061ca1c3e309e098a980667be7bb139e7d6376da1a0b1b663144d594d8fdc783536a6730f5399d732a790b48d5b12a0dbdc9cc2540c65c660c2531d21666bd1fac86ba3c97dfa6cf4ca2cf6646f33b01697a417a0441dbb611dda41cf3b730fc567f592501989da23a0535690b02d62040e457a63e6c5915525191f4871511c5cc7214b5b3fdf258718b4aa9e5e4a28dbe5be21c7f33e08fdbea5f483b81dba390dc58e5599f2acc108481c401280fedf858e52a80d7205dfc3879053f384f5f7249950f337359d652adf77e6e6d31790e041538cba18aca9735752556e40ac3c51a0f7954db161b7ef3fb72c6fca66dded62558c2d01f0204dee699"]}
oliverw commented 6 years ago

@rterbush The "recv" log is from a lower level where individual request fragments are logged, so this request must have arrived in multiple packets. I would be interested to see the recv log entries from the same connection just before the request.

rterbush commented 6 years ago

@oliverw I've edited the above posts to include a bit more of the preceding data.

rterbush commented 6 years ago

with all of this data here... I see the following:

[2017-11-22 16:17:53.8954] seems to send the end of a complete json message. [2017-11-22 16:17:53.9478] the next recv: buffer seems to be missing some container info.

oliverw commented 6 years ago

Debugging ...

oliverw commented 6 years ago

@rterbush I've fixed multiple problems with the receiver logic. Please re-open if you still observe disconnects with the latest code.

rterbush commented 6 years ago

Thanks for the hard work. The fix looks spot on. Will keep you posted.

rterbush commented 6 years ago

Going to have to reopen this:

While the changes do seem to quiet things for the previous settings I was running for VarDiff, the failures reappear and are significantly worse if I run on ZEC with the following VarDiff settings. (has always been reported on ZEC)

 "ports": {
        "3072": {
          "listenAddress": "0.0.0.0",
          "difficulty": 1.0,
          "varDiff": {
            "minDiff": 0.04,
            "maxDiff": null,
            "targetTime": 15,
            "retargetTime": 90,
            "variancePercent": 40,
            "maxDelta": 0.5
          }

I had been running with variancePercent of 30 and maxDelta at 1.

Changing to the settings above shows the disconnect error appearing every few minutes.

rterbush commented 6 years ago

Also rolling back c51b9ed0c212e0b1d1cbe31b8c52edb416b984d6 as that could also be further confusing this. Will report back...

rterbush commented 6 years ago

The issue is recreated on the rolled back commit. I'm switching to fixed diff for awhile to see if that quiets this.

oliverw commented 6 years ago

@rterbush I'm confused did you mean that c51b9ed reintroduced the problem?

rterbush commented 6 years ago

@oliverw no, but I included it in the deploy of the updated code initially. Rolled back to your previous changes alone to make sure that the problems still existed after applying only 437daa4785261b0ef2f384d05c9ebd87609b492d

The disconnect messages are significantly less frequent when not running on VarDiff, but have still occurred over last 30 min.

oliverw commented 6 years ago

@rterbush Could you update to the latest commit and run with debug logging. This time we should be able to extract much more information from the log.

rterbush commented 6 years ago

Ok, I have updated with your latest commit. Will report back.

rterbush commented 6 years ago

Captured error:

[2017-11-23 22:58:59.1125] [I] [zec1] [Pool] [0HL9IOKVU0SDO] Share accepted: D=1.5 
[2017-11-23 22:58:59.1290] [D] [zec1] [Bitcoin Job Manager] UpdateJob() 
[2017-11-23 22:58:59.1290] [D] [zec1] [Bitcoin Job Manager] GetBlockTemplateAsync() 
[2017-11-23 22:58:59.1290] [D] [DaemonClient] ExecuteCmdAnyAsync(getblocksubsidy) 
[2017-11-23 22:58:59.1301] [D] [DaemonClient] ExecuteCmdAnyAsync(getblocktemplate) 
[2017-11-23 22:58:59.1537] [E] [zec1] [Pool] [0HL9IOKVU0SDO] Connection json error state: Unexpected character encountered while parsing value: i. Path '', line 0, position 0., path:  
[2017-11-23 22:58:59.1537] [D] [StratumClient`1] [0HL9IOKVU0SDO] Last subscriber disconnected from receiver stream 

Another:

[2017-11-24 02:28:36.3288] [D] [zec1] [Pool] [0HL9IOKVU0SDQ] Dispatching request 'mining.submit' [1426]
[2017-11-24 02:28:36.3288] [D] [zec1] [Bitcoin Job Manager] SubmitShareAsync(0HL9IOKVU0SDQ)
[2017-11-24 02:28:36.3288] [D] [EquihashSolver] Verify()
[2017-11-24 02:28:36.3298] [I] [zec1] [Pool] [0HL9IOKVU0SDQ] Share accepted: D=7.5
[2017-11-24 02:28:36.3782] [E] [zec1] [Pool] [0HL9IOKVU0SDQ] Connection json error state: Unexpected character encountered while parsing number: ". Path '', line 1, position 1416., path:
[2017-11-24 02:28:36.3782] [D] [StratumClient`1] [0HL9IOKVU0SDQ] Last subscriber disconnected from receiver stream

Another:

[2017-11-23 23:26:42.5326] [D] [zec1] [Pool] [0HL9IOKVU0SDR] Dispatching request 'mining.submit' [126]
[2017-11-23 23:26:42.5326] [D] [zec1] [Bitcoin Job Manager] SubmitShareAsync(0HL9IOKVU0SDR)
[2017-11-23 23:26:42.5326] [D] [EquihashSolver] Verify()
[2017-11-23 23:26:42.5337] [I] [zec1] [Pool] [0HL9IOKVU0SDR] Share accepted: D=1
[2017-11-23 23:26:42.5337] [E] [zec1] [Pool] [0HL9IOKVU0SDR] Connection json error state: Unexpected character encountered while parsing number: ". Path '', line 1, position 184., path:
[2017-11-23 23:26:42.5337] [D] [StratumClient`1] [0HL9IOKVU0SDR] Last subscriber disconnected from receiver stream
oliverw commented 6 years ago

Ok we at least now that there are still JSON parsing errors happening. We just don't know why. I suspect another subtle bug in the StratumClient but was unable to provoke errors like those in isolated testing.

Too bad that I've told you to run with "debug" logging instead of "trace" because we are missing the actual data from your latest tests. 😢

rterbush commented 6 years ago

Here is a bit more data. I'll capture a few of these to see how they compare as I capture them.

In the past hour, I've had 4 of these disconnects and they appear to all be truncating the payload at the ubmit in the mining.submit method message. Will keep monitoring for variations.

[2017-11-24 14:10:15.4239] [D] [zec1] [Pool] [0HL9J8E1PPQH2] Dispatching request 'mining.submit' [6955]
[2017-11-24 14:10:15.4239] [D] [zec1] [Bitcoin Job Manager] SubmitShareAsync(0HL9J8E1PPQH2)
[2017-11-24 14:10:15.4239] [D] [EquihashSolver] Verify()
[2017-11-24 14:10:15.4250] [T] [StratumClient`1] [0HL9J8E1PPQH2] Sending: {"result":true,"error":null,"id":6955}
[2017-11-24 14:10:15.4285] [T] [StratumClient`1] [0HL9J8E1PPQH2] recv: ubmit","params":["t1VaVsr8cGM5iP6fD6Qjr2mxcfZDTvjHTaM.tincup","16","2e28185a","abfc300100000000000000000000000000000000000000000000000006","fd400500155cb8b0841361aca9624d581673126047d6cdfe0932760625cb36baf4845331b243666601ba983c73056e939edc426ca22cced0f7fe90ba2ee49c3755c44503bdd20c14e4253579d4ef0aeb210981d94f16e413bf209eedd6f563ddcc92ee70262f88e3ab2bde75188d04eebf63f1b3fb7e51f3a63d0432edbbde16e31d1267fb44ea2f7b5ad0f54a02516f39bf5c393ba92354ed61c7d6f0519b8972f8a6f472ae5bd27f61a4013cbae1d6e49a2927b3649ceb424159f43bbccbf007662daa1babe397cac6717c9c4e6d19c2c2f7e83e02db52299a83ecafb9b692c335ef7c7120befc8aa70feeee284c2cd46986c5824759fed5f4c2f8964b7503185f96dc9a6527ba3ff32ca0c6c8f96fed0e37f6051b1b4732660daf7521d18f78e029546ab67775e81091f1588923d8c1920a923c17b430f1a6265439783acb3be2fb2253b15b2d19ae9cff4edea10357465d0360bfa94e42f3b35515456de5cf3d09ab2fb1f7a90edf16c830e7359f57d0d34d5bc0f1a5b747b752c705011c2cd2d3a148e2dea64441d838467e4e7e120f0d25169e700af6e8929cb15b9d390acefeb6bb72370b729e451fe3308d21a1c7fe575e5cbe02401aff2422a7aac4a0da8b075cf1b39508d6ee2ceaddd73e590dbdbb2ee5106e97aca221b14a5169161d4e93804f26aa1ad81753535d0e18a4185f586beb821a7c70c404ee05e373d11b8f322ed238091230d8e4a37e3cf10ac683a37412501341dcd0b119ee0cb8e32a2fa78415f492dae067f41798cfb177391453997e55b7a764381915ab239d9265606af7965dcfc29636dc7ac1130bd7e7cce88f6fab3bde8278f55fab9715939904cc2646cdeb354c0d18abae983500e312823773be62c30f09b7724c04ee7ed3b3a59f5fcc7165bf5aced088
[2017-11-24 14:10:15.4719] [T] [StratumClient`1] [0HL9J8E1PPQH2] recv: 1476b69e8052e9b77a9374908162faa36ff2de207302f85fc795080cb677eba1d5883c747ce86394135804b8d5ddafd335c0ffac4c42c5741a736d83bf89c8078827620bd86077bb6062572a7477789c5495b23d0a9294c722c3c334743d43f1bf525421e222da9e1b0c2b155bfd16df3edeb605f9fc483e2f1b6e1dd2961b7d4b8e6b4ee77093f3769381fbd72360fb5b2e060df2be9f5edafcd98b21b1efa0774c92a5af5d2f4d3b2db36ad3dfaadd687bf8b3457e7d1273d37f4adc03a5becaf4c186f6db7700baf0f03ac9356c6adf6b2fb123a22caaa4fff25ac60839f510b6ea28da7e0a05764c08cd5513597e38c96c134b6f4ea34c1fc33409af60a11d87c891412fc7a24c61d885eaae8fd0b30bb419a0a05ad5339582a43539e9a9b9a946edd7010c887bddfba49669422e42fdbead2ec67e859f3d761c4bae283e0c04d2f8b3628044482c198a5f6e0c6425d5571dd5f62521e52f84f53b5a5352deb75db4e5036fe1f53695fca9c506979014fd7696c4a4172d5d3dd0d56e18a44213e6dd898344fe6d0a9478fe3133059c25f03a2d6047ab18289560d7ced6a39716e21a1e8c334ba48aad975babe68201e38ebac1e6bd1f0d03aab223b8b7e83ffada45ed11c9b1a5876e1d51023dfc5eed48d360cacda804341d48a3160e52312c3d2498b329acec9d2d961482c5e64493a9aecb6f47855236ad7cbc97fbdae9a0f54ea9649eba143b333ba7040d3acb92dac4a1df6cc9168c73c506715d3bc1a613f426001347578f759cd5383c4466a20b185c7f6c0bd598abb5c4d88b68f070db68d49c5188f0f141390f00234a6033c0fbf446b1954a0f0016fb239eef10046f8db30343d764219fe30b385c1c6d8ddc777acd4d8455888ff56397b501654e004fc1c2ea4f3ae04a04efc4f97c978949444c80e6f8da2ff5fdd0b83c21502eb4394b68cc0563a665bfb7af4221cda5913143"]}
[2017-11-24 14:10:15.4719] [T] [zec1] [Pool] [0HL9J8E1PPQH2] Received request data: ubmit","params":["t1VaVsr8cGM5iP6fD6Qjr2mxcfZDTvjHTaM.tincup","16","2e28185a","abfc300100000000000000000000000000000000000000000000000006","fd400500155cb8b0841361aca9624d581673126047d6cdfe0932760625cb36baf4845331b243666601ba983c73056e939edc426ca22cced0f7fe90ba2ee49c3755c44503bdd20c14e4253579d4ef0aeb210981d94f16e413bf209eedd6f563ddcc92ee70262f88e3ab2bde75188d04eebf63f1b3fb7e51f3a63d0432edbbde16e31d1267fb44ea2f7b5ad0f54a02516f39bf5c393ba92354ed61c7d6f0519b8972f8a6f472ae5bd27f61a4013cbae1d6e49a2927b3649ceb424159f43bbccbf007662daa1babe397cac6717c9c4e6d19c2c2f7e83e02db52299a83ecafb9b692c335ef7c7120befc8aa70feeee284c2cd46986c5824759fed5f4c2f8964b7503185f96dc9a6527ba3ff32ca0c6c8f96fed0e37f6051b1b4732660daf7521d18f78e029546ab67775e81091f1588923d8c1920a923c17b430f1a6265439783acb3be2fb2253b15b2d19ae9cff4edea10357465d0360bfa94e42f3b35515456de5cf3d09ab2fb1f7a90edf16c830e7359f57d0d34d5bc0f1a5b747b752c705011c2cd2d3a148e2dea64441d838467e4e7e120f0d25169e700af6e8929cb15b9d390acefeb6bb72370b729e451fe3308d21a1c7fe575e5cbe02401aff2422a7aac4a0da8b075cf1b39508d6ee2ceaddd73e590dbdbb2ee5106e97aca221b14a5169161d4e93804f26aa1ad81753535d0e18a4185f586beb821a7c70c404ee05e373d11b8f322ed238091230d8e4a37e3cf10ac683a37412501341dcd0b119ee0cb8e32a2fa78415f492dae067f41798cfb177391453997e55b7a764381915ab239d9265606af7965dcfc29636dc7ac1130bd7e7cce88f6fab3bde8278f55fab9715939904cc2646cdeb354c0d18abae983500e312823773be62c30f09b7724c04ee7ed3b3a59f5fcc7165bf5aced0881476b69e8052e9b77a9374908162faa36ff2de207302f85fc795080cb677eba1d5883c747ce86394135804b8d5ddafd335c0ffac4c42c5741a736d83bf89c8078827620bd86077bb6062572a7477789c5495b23d0a9294c722c3c334743d43f1bf525421e222da9e1b0c2b155bfd16df3edeb605f9fc483e2f1b6e1dd2961b7d4b8e6b4ee77093f3769381fbd72360fb5b2e060df2be9f5edafcd98b21b1efa0774c92a5af5d2f4d3b2db36ad3dfaadd687bf8b3457e7d1273d37f4adc03a5becaf4c186f6db7700baf0f03ac9356c6adf6b2fb123a22caaa4fff25ac60839f510b6ea28da7e0a05764c08cd5513597e38c96c134b6f4ea34c1fc33409af60a11d87c891412fc7a24c61d885eaae8fd0b30bb419a0a05ad5339582a43539e9a9b9a946edd7010c887bddfba49669422e42fdbead2ec67e859f3d761c4bae283e0c04d2f8b3628044482c198a5f6e0c6425d5571dd5f62521e52f84f53b5a5352deb75db4e5036fe1f53695fca9c506979014fd7696c4a4172d5d3dd0d56e18a44213e6dd898344fe6d0a9478fe3133059c25f03a2d6047ab18289560d7ced6a39716e21a1e8c334ba48aad975babe68201e38ebac1e6bd1f0d03aab223b8b7e83ffada45ed11c9b1a5876e1d51023dfc5eed48d360cacda804341d48a3160e52312c3d2498b329acec9d2d961482c5e64493a9aecb6f47855236ad7cbc97fbdae9a0f54ea9649eba143b333ba7040d3acb92dac4a1df6cc9168c73c506715d3bc1a613f426001347578f759cd5383c4466a20b185c7f6c0bd598abb5c4d88b68f070db68d49c5188f0f141390f00234a6033c0fbf446b1954a0f0016fb239eef10046f8db30343d764219fe30b385c1c6d8ddc777acd4d8455888ff56397b501654e004fc1c2ea4f3ae04a04efc4f97c978949444c80e6f8da2ff5fdd0b83c21502eb4394b68cc0563a665bfb7af4221cda5913143"]}
[2017-11-24 14:10:15.4810] [E] [zec1] [Pool] [0HL9J8E1PPQH2] Connection json error state: Error parsing undefined value. Path '', line 1, position 1., path:
[2017-11-24 14:10:15.4852] [D] [StratumClient`1] [0HL9J8E1PPQH2] Last subscriber disconnected from receiver stream
rterbush commented 6 years ago

Adding a bit more info... this issue is made to be worse when there is more connection activity. At low diff levels, it is more likely to happen. If the connections make it past the lower diff levels and manage to get the difficulty up to higher levels, the connections become more stable as there is less interaction with the pool.

oliverw commented 6 years ago

@rterbush VarDiff could only indirectly influence the occurrence of this problem. There must be still something wrong in the StratumClient's line assembly logic under certain circumstances. /scratches head/

rterbush commented 6 years ago

Agreed. Not suggesting that VarDiff is the problem, but forcing a low diff might make it easier to recreate.

rterbush commented 6 years ago

Another flurry of disconnects:

[2017-11-24 17:21:51.8008] [T] [zec1] [Pool] [0HL9J8E1PPQHA] Received request data: {"id":8261,"method":"mining.submit","params":["t1VaVsr8cGM5iP6fD6Qjr2mxcfZDTvjHTaM.tincup","267","2155185a","a8fb720100000000000000000000000000000000000000000000000005","fd40050075ad9060c6922ad79cf4e11e6867e23eb172510a32a5cd217aedc69dcb0c540e8b4eda6722c59c0d221134bfd33216a8e9e07d77657a4a5a9ae76a9e66372369b98dc4cd6e871d02327964395eed91d6edce58009c6185b103375a9e4c4205602b7600c01368006c2cf3bbb53e63ede3b1df88542fd965a61c995c1bae11e7d55c70cf97708da922b1a2a8b2c8e152f6238b2f9f979b0b1c9c60ffc9a85d12550e1684987e1414013c19580e9f2a7794cf4109d22255786c604979e0026d4e6c7f3d6ef7ecf354c68ed642d5e3d59ab7fc02989b41a4ca7db1dace016469545a3ead1d9d32cd088993c42f065d3add5c62e20c524b79510fd5ce0b06ebdfdce85ea3930230e442e8ac0b86c86a3b5dc126b7850774cc3899173fc740b8ccf86e8b3a1d2e8a07d5ae27260c9e8d94cda62ce9c68865b08f1a5d732b91b46666659abb9d62867813dfe0b2549178d5a10495f058330e5b5327a9e3d6bc22813d15338f6898296dc7b9ce32d34bacf95540b0aba375d987568b3909dceb98d10f9b896e45a3238ddfb7b60c38f684b919fd95e83fca4c0296e1a1e8714a9478b806b07d1e06d8525e09226783d19201825e391a3f01981c56b10db9dabf911454b52d3561230748afaa552fbc01092ab457e355a6944d8e23d35afee9bb1168f37e33882c1be2245090120b70693ac880785f6fc2d81f82ac0702086e2bea0cf9f9d35233badb63a8a826fa4331129c7a463b4c2427a95552aef9e55ac0db0f1fbf770a03b25994e3ca1fd666519c87cc593d46123ca7c30c59554d9c8cd5919a08e13d58dcad55e874bc94471b5c74abeb9c7af7ba4f22106eacac897436ed609f40dd771939ea5d4359e0944a2fee0819a7fef378f53779ee11a4a6e5dff021e590c1e51439d75093f0a044291a323566486576ca7547b2be52c23b18396cc30125317005cfc3c7600063e6d4fba67d7c4bd99d891c313582b0d88200cae643b15175b02df99b760ce5386c241af053a7156d4a63ef4f2c4200ff00095d8947ac3c8d8cd74417fe1735d984d900599dfe7783dc016bab734956272d65fe63631da0c9058cf3b3e1e00aa373df4f89a893d592433f4e3211cec2b0377263079f194f13d2fcdfd34f5526176df68568f72c8b8d09a2cdbcd9913c5f1e4a9422dfbc93d5ab0ef21ccd0240538d738ab2c9bb5be103cf2169cc904f8fb9160542c64c2b83ac5f3901a357c52223b5b8876e330c04a8a4c812bd88d1f25cf096f52c1b19d0192fe5590d6a157ad38530bab84b016ea9579a088bbcf7fe0e09a43fe9c2d7aa395e82f12b30d892d882758be8e411b68b98f5a3340ff468a4ad006e5521af5a6e5d400e2971aa9103a26642964ae8ef7572b75f70db3cc55b16dbe6cb5ebd5118dce698f77d675e83daf8300603bb2c8e6ca00145795246c352c5c6b2e888fbf69219f29f63824d92a69370d3afca5198bf9ad19eb3b714ac6964c89d96d1ced606321ff00e01e744d9be845461f62772da4bfd602b3861fc5ab59e1916fc9a1504026b25aa9aee04efaa954c3564f7d305ab59937204ed5b83c2aee7eba9f4c21af9cdc6bd4ca393de5f12a43f8cb0a07e5578db71d7c571f22a9ac8dc4de01c4d165ba3f45791c9cdb68de37b426e598cd52ed605e6aea6ff0fd3032495432ff2fe054ebce85f4cc907d298a26fd0498554bca8789a70f63eab3f788ecd2ef62b1be3a4791733e2e3305cc722a5cc95d6d8a74757ed50387765c1f177e6479956146d9d6aad93f20e668ab4e022aab57ce054146d2cd73e5468fb39d54c6d2ea07aa549cf6ae409c4ef66503b1b715e462a100022c8e1cf6c818ae59118f6bdce554b58f342cd1c93115948ebf9b9dbd1e7b52cdb08163cb2ba7b07"]}
[2017-11-24 17:21:51.8008] [D] [zec1] [Pool] [0HL9J8E1PPQHA] Dispatching request 'mining.submit' [8261]
[2017-11-24 17:21:51.8008] [D] [zec1] [Bitcoin Job Manager] SubmitShareAsync(0HL9J8E1PPQHA)
[2017-11-24 17:21:51.8016] [D] [EquihashSolver] Verify()
[2017-11-24 17:21:51.8016] [T] [StratumClient`1] [0HL9J8E1PPQHA] Sending: {"result":true,"error":null,"id":8261}
[2017-11-24 17:21:51.8016] [I] [zec1] [Pool] [0HL9J8E1PPQHA] Share accepted: D=1.5
[2017-11-24 17:21:51.8472] [T] [StratumClient`1] [0HL9J8E1PPQHA] recv: 6cd759ed9976eb67bf9ef42038bc693eaf65fd90e710934ab5e4418e76881dc29c7b30d701b5f6f242f7bc51cb4e17960fda6cdb2e47bf63600dcf1783c656c5e7a9c9220e14a6a568e5a06df86f16e16aa63a4fe8f1fa00725d6ad329528e84fad8452aa611d8d4700cd5f3489490bcd1f6b60485fdadff0370d4fad20802cbbc9861374cc9f37e10f49dbca526a855f6a6e121ad276e74248be4e79e97467bc3ca05b4350624b04f3dfd00c42c906f9732493d17cc1a2873763efb5583430df1a6296a66f4e6422ad30a07045673f83b8c1f22b8ae0068d7499b6025550451f7180624eccc5134c2daad726be5e368f96894b7fe2024d5fdc00654267995bf45dbc86841eaa0169742a0511457b3b6faee2bf3eed53d627ea7dd92d39d8e550da8bddc1f92ea58dbc86b0468f147b6d82cf938504ebef6f95c6b8c0ba4356552fd5712024a0a76449139e1aa5d32503e71523ac46b32bbcc2a440ff9fc8242113b16c4515229665f84e0d8bea9ae00fe92321206f6aa3a86283caff1ebc78e4cee18b342af5f7956071e862d1cfc0805e0a540becc96915198a21133490845f2c90e528ec8a6cfc0dba5d123fec0213ed97e40388b66ecd920adff83b6c745cf38a5e8935fd3b90f5b339edd97e11ede7b31f6a73dff8605773c86e51d3ee927a08da70eb965822f2f69c9c0a4c871a81d1c1701172f6ceecdbd9388614cde31326e305a04c7234c23b4e916f8df42c3b3b01667113567603f6f8d080f22f98a0d59976335c0ccce417df956d498ba383f87a4aa2ee22919c8566677cac56659d223cf19e7121f9e3d96d7442b3a46a903cfcca9ee5cdb7ba1e443e66a630033d3a9c8ff64c3f64663e62eff7f2a33198acda9f4ae0f95ecf76385537d751610c6d179bd3fa85349b8520e43f64c6789b26b7ef2e8763833591ba2c44a190fabbee75d63e88a36ac21487f3cd115344df3bb24f19efb90b9764663b90d6dbbd2ee9b35"]}
[2017-11-24 17:21:51.8472] [T] [zec1] [Pool] [0HL9J8E1PPQHA] Received request data: b16694ceef22eb8280338274b6b6598ab6d14b1d0f2d989f6ec5983544ad613e684be4c061edb7a96926c3ea98db584ad34a4c3353522a64be3d7457ddd90d216312439d8f2545ad30eaf4508b65e7da7a95914cdcffca1da1ee39170106c85e529dbeffe05d60c710f0e94868e76cd759ed9976eb67bf9ef42038bc693eaf65fd90e710934ab5e4418e76881dc29c7b30d701b5f6f242f7bc51cb4e17960fda6cdb2e47bf63600dcf1783c656c5e7a9c9220e14a6a568e5a06df86f16e16aa63a4fe8f1fa00725d6ad329528e84fad8452aa611d8d4700cd5f3489490bcd1f6b60485fdadff0370d4fad20802cbbc9861374cc9f37e10f49dbca526a855f6a6e121ad276e74248be4e79e97467bc3ca05b4350624b04f3dfd00c42c906f9732493d17cc1a2873763efb5583430df1a6296a66f4e6422ad30a07045673f83b8c1f22b8ae0068d7499b6025550451f7180624eccc5134c2daad726be5e368f96894b7fe2024d5fdc00654267995bf45dbc86841eaa0169742a0511457b3b6faee2bf3eed53d627ea7dd92d39d8e550da8bddc1f92ea58dbc86b0468f147b6d82cf938504ebef6f95c6b8c0ba4356552fd5712024a0a76449139e1aa5d32503e71523ac46b32bbcc2a440ff9fc8242113b16c4515229665f84e0d8bea9ae00fe92321206f6aa3a86283caff1ebc78e4cee18b342af5f7956071e862d1cfc0805e0a540becc96915198a21133490845f2c90e528ec8a6cfc0dba5d123fec0213ed97e40388b66ecd920adff83b6c745cf38a5e8935fd3b90f5b339edd97e11ede7b31f6a73dff8605773c86e51d3ee927a08da70eb965822f2f69c9c0a4c871a81d1c1701172f6ceecdbd9388614cde31326e305a04c7234c23b4e916f8df42c3b3b01667113567603f6f8d080f22f98a0d59976335c0ccce417df956d498ba383f87a4aa2ee22919c8566677cac56659d223cf19e7121f9e3d96d7442b3a46a903cfcca9ee5cdb7ba1e443e66a630033d3a9c8ff64c3f64663e62eff7f2a33198acda9f4ae0f95ecf76385537d751610c6d179bd3fa85349b8520e43f64c6789b26b7ef2e8763833591ba2c44a190fabbee75d63e88a36ac21487f3cd115344df3bb24f19efb90b9764663b90d6dbbd2ee9b35"]}
[2017-11-24 17:21:51.8472] [E] [zec1] [Pool] [0HL9J8E1PPQHA] Connection json error state: Unexpected character encountered while parsing value: b. Path '', line 0, position 0., path:
[2017-11-24 17:21:51.8472] [D] [StratumClient`1] [0HL9J8E1PPQHA] Last subscriber disconnected from receiver stream
rterbush commented 6 years ago

Here is connection that did not get very far before sending EOF.

[2017-11-24 17:20:50.2487] [D] [zec1] [Pool] Accepting connection [0HL9J8E1PPQH9] from xx.xx.xx.xx:47607
[2017-11-24 17:20:50.3474] [D] [zec1] [Bitcoin Job Manager] UpdateJob()
[2017-11-24 17:20:50.3474] [D] [zec1] [Bitcoin Job Manager] GetBlockTemplateAsync()
[2017-11-24 17:20:50.3474] [D] [DaemonClient] ExecuteCmdAnyAsync(getblocksubsidy)
[2017-11-24 17:20:50.3487] [D] [DaemonClient] ExecuteCmdAnyAsync(getblocktemplate)
[2017-11-24 17:20:51.3472] [D] [zec1] [Bitcoin Job Manager] UpdateJob()
[2017-11-24 17:20:51.3472] [D] [zec1] [Bitcoin Job Manager] GetBlockTemplateAsync()
[2017-11-24 17:20:51.3472] [D] [DaemonClient] ExecuteCmdAnyAsync(getblocksubsidy)
[2017-11-24 17:20:51.3472] [D] [DaemonClient] ExecuteCmdAnyAsync(getblocktemplate)
[2017-11-24 17:20:52.3470] [D] [zec1] [Bitcoin Job Manager] UpdateJob()
[2017-11-24 17:20:52.3470] [D] [zec1] [Bitcoin Job Manager] GetBlockTemplateAsync()
[2017-11-24 17:20:52.3470] [D] [DaemonClient] ExecuteCmdAnyAsync(getblocksubsidy)
[2017-11-24 17:20:52.3482] [D] [DaemonClient] ExecuteCmdAnyAsync(getblocktemplate)
[2017-11-24 17:20:53.3467] [D] [zec1] [Bitcoin Job Manager] UpdateJob()
[2017-11-24 17:20:53.3467] [D] [zec1] [Bitcoin Job Manager] GetBlockTemplateAsync()
[2017-11-24 17:20:53.3467] [D] [DaemonClient] ExecuteCmdAnyAsync(getblocksubsidy)
[2017-11-24 17:20:53.3479] [D] [DaemonClient] ExecuteCmdAnyAsync(getblocktemplate)
[2017-11-24 17:20:53.7384] [D] [zec1] [Pool] [0HL9J8E1PPQH9] Received EOF
[2017-11-24 17:20:53.7394] [D] [StratumClient`1] [0HL9J8E1PPQH9] Last subscriber disconnected from receiver stream
oliverw commented 6 years ago

The last one is not an error. The client closed the connection. Otherwise the last two lines would have been in reversed order.

oliverw commented 6 years ago

@rterbush Ok. I think I've nailed it now. My last commit regarding this issue actually made things much much worse, but not because the logic was flawed but because of an unforeseen side-effect.

This innocently looking statement causes buffer.Count to become zero. If you inspect the remaining code for references to buffer.Count you'll realize how much chaos this would cause. My test scenario for the buffer logic did not suffer from this side-effect because it used a different implementation of buffer to simplify testing. Bummer.

rterbush commented 6 years ago

I will give it a try and report back. I know how difficult this part of a tcp server is to get right. Thanks for persisting and sorry I could not provide more than the debugging.

rterbush commented 6 years ago

@oliverw Initial results are very positive. Server has been able to walk up through diff level starting at 1 to 7 in delta of .5 without any disconnects. It was not able to do that in the past. I think you have nailed it. I will close and hope not to reopen. :-)

Thanks again

oliverw commented 6 years ago

https://www.youtube.com/watch?v=_J6-3l3hCm0 😄

oliverw commented 6 years ago

@rterbush Ironically, there was still some bug lurking in there. I've just fixed it. 89a542d4cb22daa7fe343ddb5a5f02dcbe07bd72

rterbush commented 6 years ago

Thanks for the heads up. Will apply your latest change. Has been rock solid with no disconnects since previous change.