nanocurrency / nano-node

Nano is digital currency. Its ticker is: XNO and its currency symbol is: Ӿ
https://nano.org
BSD 3-Clause "New" or "Revised" License

RPC v2 #2467

Open zhyatt opened 4 years ago

zhyatt commented 4 years ago

Background

Original plans for a new RPC refactor built on the IPC 2.0 setup were delayed by higher-priority items, and an overall expansion of scope further prevented progress. To improve the RPC setup in the short term, the scope for v2 of the RPC has been restricted to the following details, aiming to fix some key problems currently being faced. A v3 will be completed at a later date to fully utilize the new IPC setup.

Approved

Needs evaluation

Potential existing RPC issues of note

These are issues that may need investigation if pulling over code directly from RPC 1.0 implementation.

Newly proposed RPCs or features (lower priority)

Approved

None

Kaspre commented 3 years ago

Request: an RPC command to show voting status: whether voting is enabled (yes/no), when the node last participated in a vote (elapsed time and/or timestamp), how many votes it cast in the last X minutes (integer), and maybe an optional parameter to show vote details (a list of the last X blocks that were voted on).

zhyatt commented 3 years ago

> Request: an RPC command to show voting status: whether voting is enabled (yes/no), when the node last participated in a vote (elapsed time and/or timestamp), how many votes it cast in the last X minutes (integer), and maybe an optional parameter to show vote details (a list of the last X blocks that were voted on).

Can you please elaborate on the use case you have for this command?

Kaspre commented 3 years ago

> Can you please elaborate on the use case you have for this command?

Sure! I think it would primarily be useful for troubleshooting when there appears to be an issue with the node. Two recent examples:

  1. To troubleshoot a node that appears to have stopped voting

Recently I noticed that sites like Nanocrawler and NanoNinja were reporting my node as "offline" or "unstable" and showing that it had not voted in many hours. Initial checks of my node all looked good; it was at 100% synch, was cementing blocks, and seemed to be running well. I wasn't sure if it was really not voting, or if a synchronization issue (this was during the spam attack, prior to v21.3) was causing a reporting issue. It was down for almost a full day before I finally identified the cause (my config file somehow got reset and no longer included the required 'enable_voting = true' string).

If I had the requested RPC command, it probably would have been the first thing I checked. I would have immediately seen that there really was no recent vote history, so it wasn't a synchronization/reporting issue, and that voting was not enabled, to point me toward the root cause. So, it might have prevented 20+ hours of downtime for my PR node during a time when we were struggling to maintain a quorum.
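Until such a command exists, the misconfiguration described above can be caught with a quick scan of the config file. The following is a minimal sketch, assuming the setting appears as an uncommented `enable_voting = true` line as quoted above; the helper name `voting_enabled`, the sample texts, and the `[node]` section header in them are illustrative assumptions, and this is a plain text scan, not a full TOML parser.

```python
import re
from typing import Optional

# Matches an uncommented `enable_voting = true|false` line.
# A line starting with `#` does not match, so commented-out defaults are ignored.
_ENABLE_VOTING = re.compile(r"^\s*enable_voting\s*=\s*(true|false)\b")


def voting_enabled(config_text: str) -> Optional[bool]:
    """Return True/False if the setting is present, or None if it is missing."""
    result = None
    for line in config_text.splitlines():
        match = _ENABLE_VOTING.match(line)
        if match:
            # Keep the last occurrence found in the text.
            result = match.group(1) == "true"
    return result


# Hypothetical config contents, for illustration only.
SAMPLE_OK = "[node]\nenable_voting = true\n"
SAMPLE_RESET = "[node]\n# enable_voting = true\n"

if __name__ == "__main__":
    print(voting_enabled(SAMPLE_OK))
    print(voting_enabled(SAMPLE_RESET))
```

A `None` result corresponds to the situation described here: the setting is absent from the file, so the line has to be restored before the node can vote.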

  2. Independently verify voting status without relying on third-party sites

A few days ago, I again noticed that Nanocrawler and NanoNinja were reporting my node as unstable/offline with no recent votes received. I spent about an hour checking all of the configuration and health monitoring, and couldn't find any issues, but neither could I confirm that my node was voting. Eventually someone on Discord noticed that all nodes were being reported as unstable/offline; it wasn't just my node. It turned out to be an issue with NanoNinja that was soon resolved, and there never was a problem with my node. But I had already wasted a lot of time and energy chasing the nonexistent problem with my node.

If I had the requested RPC command, it might have shown me very quickly that voting was still enabled and that there were X recent votes. Perhaps, if run with a vote_details option, it would also have let me verify that my node had voted in particular recent block elections.
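To make the request concrete, here is a rough sketch of how a client might use such a command. Nothing in it exists in the node today: the `voting_status` action, the `window_minutes` and `vote_details` parameters, and the `voting_enabled` and `vote_count` response fields are all hypothetical names chosen for illustration. The only thing taken from the existing RPC is its general shape, a JSON body with an `action` field and string values.

```python
import json
from typing import Any, Dict


def build_request(window_minutes: int, vote_details: bool = False) -> str:
    """Build the JSON body for the hypothetical `voting_status` action."""
    payload = {
        "action": "voting_status",              # hypothetical action name
        "window_minutes": str(window_minutes),  # hypothetical parameter
    }
    if vote_details:
        payload["vote_details"] = "true"        # hypothetical parameter
    return json.dumps(payload)


def summarize(response: Dict[str, Any]) -> str:
    """Turn a hypothetical response into a one-line diagnosis."""
    if response.get("voting_enabled") != "true":
        return "voting disabled: check the node config"
    count = int(response.get("vote_count", "0"))
    if count == 0:
        return "voting enabled but no recent votes: investigate the node"
    return f"voting OK: {count} votes in window"


if __name__ == "__main__":
    print(build_request(5, vote_details=True))
    # Hypothetical response, shaped like the proposal in this thread.
    print(summarize({"voting_enabled": "true", "vote_count": "42"}))
```

The three outcomes of `summarize` map onto the two cases above: voting disabled points at the config, voting enabled with no votes points at the node itself, and a non-zero count suggests the third-party site is the one misreporting.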

Maybe these are edge cases and not worth spending a lot of time on, but personally I think something like this would be very useful to node operators to verify that everything is working as intended, or to troubleshoot root cause when it isn't. Hope this helps. Thanks for your time and attention.