alephium / gpu-miner

GNU Lesser General Public License v3.0

Raises unsync error every time packet loss occurs #35

Open BlankerL opened 2 years ago

BlankerL commented 2 years ago

Hi,

My node is hosted on a cloud service and syncs with 40+ nodes at the same time, with external-address set up. However, the Windows and Ubuntu miners periodically raise unsync errors, and the mining process then terminates automatically.

I tested the connection and found around 0.1% packet loss between the miner and the node, which leads to the errors mentioned above.

Is it possible to add try/except handling to recover from such errors, so that we do not need to restart the miners every time?

polarker commented 2 years ago

Thanks for the investigation!

Right now we use this script to auto-restart the miner: https://github.com/alephium/gpu-miner/blob/master/run-miner.sh#L30-L34

I will add backoff retry to the miner when I have some time!
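
For readers unfamiliar with the idea, a backoff retry could look roughly like the sketch below. It is illustrative Python showing the general technique (exponential backoff with jitter), not the miner's actual code or language. The names `backoff_delays` and `retry_with_backoff`, the `connect` callable, and all delay values are assumptions made for this example.

```python
import random
import time


def backoff_delays(base=1.0, factor=2.0, cap=60.0):
    """Yield an endless sequence of delays: base, base*factor, ... capped at cap."""
    delay = base
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


def retry_with_backoff(connect, max_attempts=8, base=1.0, cap=60.0,
                       sleep=time.sleep, jitter=random.random):
    """Call connect() until it succeeds or max_attempts is reached.

    connect is a hypothetical callable that raises OSError (for example
    ConnectionResetError) when the link to the node fails.
    Returns whatever connect() returns; re-raises the last error on give-up.
    """
    delays = backoff_delays(base=base, cap=cap)
    for attempt in range(1, max_attempts + 1):
        try:
            return connect()
        except OSError:
            if attempt == max_attempts:
                raise
            # Jitter: sleep a random fraction of the current delay so that
            # many miners do not all reconnect at the same instant.
            sleep(next(delays) * jitter())
```

The `sleep` and `jitter` parameters are injectable only so the behaviour can be checked without real waiting; a caller would normally leave them at their defaults.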

polarker commented 2 years ago

@BlankerL But TCP retransmits data when packet loss is detected, so this should not be a problem as far as I can see.

BlankerL commented 2 years ago

I think so too, but one of my miners found a block and shortly afterwards reported that the node was unsynced. I could not find the block on the chain and did not receive any reward.

A recent example from the Windows miner:
Mined Block Hash: 0000000000847aa99f1ae9fb467a6483640790eeaf77e4286077af86a0c3ecf6
[Screenshot]

Another Ubuntu rig, mining against a local node, has not had this problem so far.

BlankerL commented 2 years ago

> Right now we use this script to auto-restart miner: https://github.com/alephium/gpu-miner/blob/master/run-miner.sh#L30-L34

Thanks for the information. I am using this script on the Ubuntu miner, but not on the Windows one. I would like to contribute a Windows auto-restart script soon.

However, this script won't solve the missing-block problem mentioned above.
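
As a rough idea of what a platform-independent auto-restart wrapper could look like, here is a minimal Python sketch that runs on both Windows and Ubuntu. It is not the contents of run-miner.sh and not a contributed script. The function name `supervise`, the restart delay, and the command passed to it are assumptions for illustration, and like any restart loop it does nothing for the missing-block problem.

```python
import subprocess
import time


def supervise(cmd, restart_delay=5.0, max_restarts=None,
              run=subprocess.call, sleep=time.sleep):
    """Run cmd and start it again each time it exits.

    cmd is the miner command line as a list of strings (hypothetical here).
    max_restarts=None restarts forever; otherwise the loop stops after that
    many restarts and returns the list of exit codes it observed.
    """
    exit_codes = []
    restarts = 0
    while True:
        # subprocess.call blocks until the child process exits.
        exit_codes.append(run(cmd))
        if max_restarts is not None and restarts >= max_restarts:
            return exit_codes
        restarts += 1
        # Short pause so a miner that crashes at startup is not respawned
        # in a tight loop.
        sleep(restart_delay)
```

A call such as `supervise(["path/to/miner-binary"])` would keep the process alive indefinitely; the path is a placeholder, and any miner arguments would be appended to the list.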

polarker commented 2 years ago

I have seen people run into this problem when the system time is not up to date. Could you please check whether that's the case?
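
One way to check the clock without relying on the OS settings dialog is to compare local time against an NTP server. The Python sketch below sends a basic SNTP client request and estimates the offset. The function names and the choice of `pool.ntp.org` are assumptions for this example. The result is only a coarse estimate, and nothing here says how much drift the node actually tolerates.

```python
import socket
import struct
import time

NTP_EPOCH_OFFSET = 2208988800  # seconds between 1900-01-01 and 1970-01-01


def ntp_transmit_time(packet):
    """Extract the server transmit timestamp (Unix seconds) from an NTP reply."""
    if len(packet) < 48:
        raise ValueError("NTP reply too short")
    # Bytes 40..47 hold the transmit timestamp: 32-bit seconds since 1900
    # followed by a 32-bit binary fraction of a second.
    seconds, fraction = struct.unpack("!II", packet[40:48])
    return seconds - NTP_EPOCH_OFFSET + fraction / 2**32


def clock_offset(server="pool.ntp.org", timeout=5.0):
    """Rough estimate of (server time - local time) in seconds."""
    request = b"\x1b" + 47 * b"\0"  # LI=0, version 3, mode 3 (client)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sent = time.time()
        sock.sendto(request, (server, 123))
        reply, _ = sock.recvfrom(512)
        received = time.time()
    # Compare the server timestamp with the midpoint of the round trip.
    return ntp_transmit_time(reply) - (sent + received) / 2
```

Running `clock_offset()` on both the rig and the node and comparing the two values gives a quick indication of whether either clock has drifted.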

BlankerL commented 2 years ago

Yes, I have also seen the system time issue on Discord, but I checked that the clocks on both the rig and the node are synced, and this "not synced" issue still happens periodically.

In my case, it sometimes ran without any issues for a whole day, and sometimes ran for only a few hours before becoming unsynced (especially right after a block was mined, which happened at least twice). I have now set up another node on a much closer VPS, and the rig has been running without any issues so far. If the issue does not happen again, the Internet connection was probably the cause.

I will run the node and the rig for some time and keep this issue updated.