KomodoPlatform / dPoW

Komodo delayed Proof of Work software
36 stars 73 forks source link

Update 0.7.15 #561

Closed smk762 closed 1 month ago

smk762 commented 1 month ago
smk762 commented 1 month ago

cc: @asherda @miketout Apologies guys, but we did not have time to review and implement the latest updates, so I've disabled VRSC from dPoW for now.

cipig commented 1 month ago

doesn't start notarizing VOTE2024, the files like coins/vote2024_7776 are missing

DeckerSU commented 1 month ago

May be we also should remove KIPxxxx chains in this PR?

miketout commented 1 month ago

cc: @Asherda @miketout Apologies guys, but we did not have time to review and implement the latest updates, so I've disabled VRSC from dPoW for now.

Thanks for the apology. This is the first notary communication we've seen since over a week ago and comes the 2nd day after announcing the fork and notifying more than one KMD notary directly during the event. Perhaps there was some communication we missed?

We are over a week after the submission, after the upgrade date announced as the reason for the submission, and after running KMD notaries forked the chain, even though we had a protocol that prevented more widespread damage. Is there a reason you were unable to let a chain being notarized know that you were unable to review or upgrade until after a protocol upgrade date? We could have prevented any event, even postponed the upgrade date had we been notified you would be unable to upgrade with a weeks notice.

Is there a reason we (or others) should believe this would not happen in the future? I'd imagine less robust protocols would end up in worse shape than Verus, which was able to allow most of the network to dynamically fork away from KMD notaries when it recognized that there was a conflict.

smk762 commented 1 month ago

Part of the dPoW update process requires security code review for third party chains. Unfortunately there was insufficient time to complete this review before the update took effect, so we asked notary ops to stop their VRSC daemons and removed it from dPoW to avoid any negative effects until resolved.

I'm aware (as I'm sure you are) that this situation has been ongoing, and frankly, our QA/Sec review process schedule is often less rapid than your release schedule, and rarely do we have a "quiet moment" or gap in ongoing or time sensitive internal workloads which would grant the liberty to drop everything and prioritise the review over other works. This is especially true at this time of year while the notary elections and KIP votes are in progress, which is an especially busy time of year for many of our team.

Generally when we schedule our hardforks, we factor in 4-6 weeks to contact 3rd parties such as CEX, mining pools etc. and follow up to confirm they have updated prior to the date announced across all our social media, blog, newsletters and podcasts. This may seem an overabundance of caution, but it ensures we have enough time to make sure 3rd parties are aware and prepared prior to the hardfork date, and also allows some time to assist 3rd parties with any technical queries etc if issues arise.

We've had some close calls in the past, and more than once its been a mad rush at the last minute, I'm sure we've asked for more lead time / advance warning on more than one occasion, though we do appreciate that for the more recent releases that @asherda began to add issues to this repo to grant us awareness of upcoming changes, even if the timeline of the update is tighter than would be optimal from our side. Unfortunately the recent release lacked the usual bold mandatory tag which may have indicated the urgency (though likely with 1 week warning could still be less time than required for our reviews)

Is there a reason we (or others) should believe this would not happen in the future?

Honestly, no. Not unless more advance warning was provided, and communications from both sides was improved to ensure awareness and implementation with time to spare.

If I understand correctly from prior comms, VRSC is capable of securing itself without dPoW, and has no desire to slow down to a more comfortable schedule. As we come to the end of notary node season 7, it seems to be a convenient time to remove VRSC from dPoW, until such time that it is stable enough to not anticipate such a frequent update cycle.

We respect the work VRSC does, even if our dev/review philosophies may be incompatible. The highest priority here is the security of both networks, and though you may have outgrown dPoW, ideally this would be an amicable, mutual decision which does not preclude future cooperative efforts towards communal prosperity.

I'd imagine less robust protocols would end up in worse shape than Verus

It is not my place to speak on the theoretical "robustness" of other protocols, though we've not had any problems with other 3rd party chains, a majority of which generally only update once a year, and with sufficient advance notice for review and implementation to be scheduled comfortably. VRSC is a unique case, and it is not our place to tell you to slow down more than you are willing to, only to make clear the constraints applicable to dPoW update timelines.

DeckerSU commented 1 month ago

Wording in English is not my strong suit, but I'll try to add my two cents as a “third-party observer.” I’m not sure if there are any private channels for communication between the Komodo Team and the Verus Foundation, but based on what I’ve seen in publicly available channels, @ca333 asked @Asherda about the future of VRSC’s dPoW on April 10th, and this question still hasn’t been answered. So, the claim about "first communication a week ago" doesn't seem accurate.

Regarding "notifying more than one KMD notary directly," fortunately, it doesn't work that way. Updates for coins or any other software for notary nodes follow a certain protocol, and notaries must strictly adhere to the update instructions in the dPoW repo. I know this because, besides everything else, I am a notary node operator. If someone contacts me with a request like “you should update %daemon_name%,” the best thing I can do is ignore it. Every update must pass several key points, one of which is a security review, before being published in the dPoW repo as update instructions. Until all these points are passed and approved by Komodo Team members, notary node operators are not allowed to act on their own. They must follow the protocol.

Moreover, recently, VRSC has had several updates (like -1, -2, -3, etc.). Some of these were mentioned by issues:

opened by Asherda in the repo, but none of these issues provided information about the priority, whether the update contained hard-forking changes, activation heights, etc. So, basically, nobody could know in advance if the next update would contain something important for the Verus network. All the issues contain are Commit and Release notes.

How are people supposed to guess the severity of these releases, activation heights, or protocol upgrade dates?

Of course, such an approach, where one side doesn’t provide key information needed for the proper functioning of the protocol, combined with overall miscommunication, will lead to bad consequences. I am not surprised at all.

miketout commented 1 month ago

Part of the dPoW update process requires security code review for third party chains. Unfortunately there was insufficient time to complete this review before the update took effect, so we asked notary ops to stop their VRSC daemons and removed it from dPoW to avoid any negative effects until resolved.

If that was done with less than a week delay or if someone would have communicated to us at all saying that you needed any amount of more time, there would have been no event. Are you saying that notaries were notified earlier than the week of notice we gave and asked to stop running their nodes? Notaries were running old nodes when the upgrade activated and that is what caused the event.

Honestly, no.

Then it seems that it is more of a liability and security risk than any benefit to use dPoW, so I think removing Verus from this system is best for all involved.

Regarding @DeckerSU 's question... As one of the people I pinged during the event who seems to have answered only days later with this response, I'm not sure the role is "3rd party observer".

How are people supposed to guess the severity of these releases, activation heights, or protocol upgrade dates?

No one needs to guess, each post has a link to all information, including reason for upgrade and importance. If people do a whole security review each time to make sure no one is attacking notary nodes with malicious code, which hasn't ever happened w/Verus in ~6 years, I guess it would be easy to click the link provided each time that explains the importance of the upgrade. Asher has also made every effort not to ask the notaries to upgrade unless it was important that they do so as a response to the difficulties we had previously. Asher did not respond because the question about "the future of VRSC’s dPoW" seems as much a question for notaries as anyone. Verus is not run by Asher or me, it is decentralized, and while we believed dPoW was an extra, redundant layer of security, we believed that would be up to notaries to decide if they would continue, not Asher. Deciding and letting us know would have been much better than waiting until an upgrade, not communicating, and only asking people to stop notarizing after the network survived the event.

DeckerSU commented 1 month ago

No one needs to guess, each post has a link to all information, including reason for upgrade and importance.

image

This post was written on May 10th. The release link states:

UPDATE MAINNET NODES BY:

VERUS MAINNET BEFORE BLOCK 3050000
vARRR MAINNET BEFORE vARRR BLOCK 67000, SAME AS PRIOR RELEASE
NEW PBaaS CHAINS SHOULD LAUNCH USING v1.2.2-5 OR LATER
CRITICAL UPDATE ASAP FOR ALL VALIDATORS

Does it say that the update is critical for notary nodes? No. Maybe "validators" also means notary nodes, but:

The VRSC chain produces around 1,400 blocks per day. The last block on May 10th was #3042274. Let's count the blocks until the date X: 3050000 - 3042274 = 7726 blocks. 7726 / 1400 = about 5 days. So, literally all the steps to analyze, approve the changes, and distribute the update would need to be done in 5 days without any pre-approvals or other preparations? I believe such announcements should be made in advance, not just 5 days beforehand.

Let's take this other update request. It was made on July 10th and states that the deadline is July 12th, 17:00 UTC. A critical update with a 2-day notice? Do you need other examples?

I'm not here to fight or argue. I just want to mention that there probably was some miscommunication and miscoordination. In my opinion, 2-5 days is not enough time to roll out an update on notaries. This is just my personal opinion, nothing more.

ca333 commented 1 month ago

cc: @Asherda @miketout Apologies guys, but we did not have time to review and implement the latest updates, so I've disabled VRSC from dPoW for now.

Thanks for the apology. This is the first notary communication we've seen since over a week ago and comes the 2nd day after announcing the fork and notifying more than one KMD notary directly during the event. Perhaps there was some communication we missed?

We are over a week after the submission, after the upgrade date announced as the reason for the submission, and after running KMD notaries forked the chain, even though we had a protocol that prevented more widespread damage. Is there a reason you were unable to let a chain being notarized know that you were unable to review or upgrade until after a protocol upgrade date? We could have prevented any event, even postponed the upgrade date had we been notified you would be unable to upgrade with a weeks notice.

Is there a reason we (or others) should believe this would not happen in the future? I'd imagine less robust protocols would end up in worse shape than Verus, which was able to allow most of the network to dynamically fork away from KMD notaries when it recognized that there was a conflict.

Part of the dPoW update process requires security code review for third party chains. Unfortunately there was insufficient time to complete this review before the update took effect, so we asked notary ops to stop their VRSC daemons and removed it from dPoW to avoid any negative effects until resolved.

If that was done with less than a week delay or if someone would have communicated to us at all saying that you needed any amount of more time, there would have been no event. Are you saying that notaries were notified earlier than the week of notice we gave and asked to stop running their nodes? Notaries were running old nodes when the upgrade activated and that is what caused the event.

Honestly, no.

Then it seems that it is more of a liability and security risk than any benefit to use dPoW, so I think removing Verus from this system is best for all involved.

Regarding @DeckerSU 's question... As one of the people I pinged during the event who seems to have answered only days later with this response, I'm not sure the role is "3rd party observer".

How are people supposed to guess the severity of these releases, activation heights, or protocol upgrade dates?

No one needs to guess, each post has a link to all information, including reason for upgrade and importance. If people do a whole security review each time to make sure no one is attacking notary nodes with malicious code, which hasn't ever happened w/Verus in ~6 years, I guess it would be easy to click the link provided each time that explains the importance of the upgrade. Asher has also made every effort not to ask the notaries to upgrade unless it was important that they do so as a response to the difficulties we had previously. Asher did not respond because the question about "the future of VRSC’s dPoW" seems as much a question for notaries as anyone. Verus is not run by Asher or me, it is decentralized, and while we believed dPoW was an extra, redundant layer of security, we believed that would be up to notaries to decide if they would continue, not Asher. Deciding and letting us know would have been much better than waiting until an upgrade, not communicating, and only asking people to stop notarizing after the network survived the event.

Thank you for your response. We appreciate your acknowledgment of the situation and your feedback. Our primary responsibility is to ensure the security and stability of the dPoW ecosystem. Given the circumstances, (temporarily) disabling VRSC from dPoW was a necessary step to protect all involved parties.

Part of the dPoW update process requires a thorough security assessment and precise secure code review for third-party chains. Unfortunately, there was insufficient time to complete this review before the update took effect and by the time we completed the review we noticed that another one (or a few) has been rolled out. To prevent any potential negative effects, we instructed notary operations to stop their VRSC daemons and removed it from dPoW until the situation could be analysed.

Our dPoW update policy, as clearly outlined, requires a minimum of 4 weeks notice for mandatory updates, and at least 8 weeks for significant code changes. This policy ensures thorough reviews and smooth transitions, safeguarding all projects. The updates to the Verus chain, however, have been frequent and often rolled out with minimal notice - not following our policy nor a robust rollout-procedure. For instance, Verus had over 21 dPoW update requests last year (!) - an unsustainable frequency that posed significant challenges for our review process and resource allocation.

In contrast, most robust dPoW protected protocols typically require only one update per year, which often includes the dPoW season code update.

Despite our best efforts to accommodate these VRSC updates, including conducting multiple security assessments free of charge, the rapid and continuous changes have proven to be a liability and security risk. With all due respect to your efforts, the nature of these frequent updates, often rolled out without proper pre-production testing / quality assurance procedure, underscores the instability of the Verus protocol.

It’s important to note that our decision to disable VRSC from dPoW wasn't taken lightly and was primarily driven by the need to protect the integrity of the dPoW network and its users. Verus has also clarified that it does not require dPoW protection, which we wouldn't otherwise consider removing if it was in need of such security measures. However, a protocol that rolls out updates in a "code in production" style significantly increases the risk to our network. While (we consider) Verus is currently in an experimental/PoC stage, our priority is to maintain a secure and reliable/stable environment.

Moving forward, we believe that removing VRSC from dPoW is in the best interest of both our networks. We remain open to reconsidering VRSC for potential re-activation once it moves beyond the experimental stage and demonstrates the necessary stability and security.

We also acknowledge the need for better communication in case of such update frequencies and you following our update guidelines / policy (specifically timelines) to avoid misunderstandings or disruptions in the future. I also want to highlight that we value your feedback and will take steps to improve the processes.

Thank you for your understanding and cooperation.

Best regards, Kadan Stadelmann (CTO)

sources / references:

https://github.com/KomodoPlatform/dPoW/issues?q=is%3Aissue+is%3Aclosed https://github.com/VerusCoin/VerusCoin/releases https://github.com/VerusCoin/VerusCoin/pulls?q=is%3Apr+is%3Aclosed https://github.com/KomodoPlatform/dPoW/tree/master?tab=readme-ov-file#dpow-assets-update-requirements https://github.com/KomodoPlatform/dPoW/issues/560

miketout commented 1 month ago

The only security event that happened over the past 6 years that relate to notaries or Verus impacting the notary system was forking of the Verus network due to communication failures. We had changed our communication at your request to this form, which you said would allow you to always at least respond within 24 hours. It is completely reasonable that if we follow the protocol you have told us to follow, which we did, to expect a response of some form within 6 or 7 days, or even preventing KMD notary nodes from forking the chain by stopping notary nodes before the upgrade that we communicated as requested.

You actually did not stop the notary nodes at all, because after you did not do so and instead forked the chain, the community issued an oracle notification on the notary fork and stopped all notary nodes that way, so they would not cause more harm. That we are now filling a wall of text to obscure that simple fact, instead of just a simple extension of an apology for forking the chain is a clear indication that there are issues on your end that would need to be fixed before it would be safe to use KMD notary services again.

As far as the notice time and @DeckerSU 's math, the public announcement was made coincident with block 3039880, the upgrade was 3050000. That is, in fact slightly more than 7 days, regardless of your time zone. That Asher posted on this Github the next day was still more than 6 days. By claiming the time should start at the end of the day in your time zone, perhaps you can get to 5, but it was 6, and to then make the argument that "2-5" is too short, is out of touch with the reality of the situation.

It seems that rather than a simple acknowledgement of an error, at a minimum in communication from notaries, that resulted in forking of the chain, you would prefer to blame, misrepresent the notice given, belabor history that we have chosen not to do, and claim best efforts. If not responding at all in 6 or 7 days, not stopping notaries until they were stopped by the community after a chain fork, and not taking any responsibility is what you see as "best efforts", then the path forward is clear.