pokt-network / pocket

Official implementation of the Pocket Network Protocol v1
https://pokt.network
MIT License
63 stars 33 forks source link

[Documentation] V1 Pocket Specification Visualizations #117

Closed Olshansk closed 1 year ago

Olshansk commented 2 years ago

Objective

Visualize the Pocket Network V1 Specifications to *demisitify its secrets* so it can be understood by everyone.

Origin Document

A summary of the V1 specs is accessible here. Though the specifications are still undergoing revisions over time, it is vital to make sure they are easy to understand.

This can be done through iterations of the text, additions of various diagrams, as well as dynamic interactive visuals.

Examples

A great example of this for the Raft consensus protocol is thesecretlivesofdata.com by @benbjohnson. Here is a small gif of the beginning:

raft2

As a point of reference, we started using remotion.dev in pocket-videos, but it is still just an idea and requires a dedicated react engineer who is interested to learn about the Pocket Protocol:

raintree

Deliverables

Goals

Non-goals


Creator: @Olshansk Co-Owners: ???

Olshansk commented 1 year ago

@jeremybeal11 Wanted to confirm if you're open to taking this task on?

jeremybeal11 commented 1 year ago

@Olshansk I am. I am working on the RainTree video now. I have some open questions in the Pocket Discussion that I'm trying to clarify.

jeremybeal11 commented 1 year ago

Finished the first iteration of Raintree using Figma. The presentation can be found here

Reason: The goal for choosing this tool is to allow anyone(dev or non-dev) to create and edit the visual without a high barrier to entry half the time.

Editing and forking: To fork and make edits, you can go directly to the community project here. Anyone can look at the README and the project and edit the file.

@Olshansk, I will apply your edits later tonight or tomorrow.

This doesn't show the grey area of what happens when the left and right branch calculations are <.05 when there are one or two nodes left. The current formula or spec doesn't show how it resolves the last node(s).

CC: @benvangithub

jeremybeal11 commented 1 year ago

@Olshansk the comments in the feedback you left in notion you said:

We’ve omitted the actual message we’re trying to broadcast

Q: are we getting rid of the gossip, adjust, and resend messages? if so, what are the new messages or it will all be the same msg?

Olshansk commented 1 year ago

Reason: The goal for choosing this tool is to allow anyone(dev or non-dev) to create and edit the visual without a high barrier to entry half the time.

Editing and forking: To fork and make edits, you can go directly to the community project here. Anyone can look at the README and the project and edit the file.

The primary maintainers of the library & algorithm are going to be developers, and non-dev are primarily going to be consumers of this. The reason it's important today is because the algorithm, terminology, etc are going to be undertaking a lot of changes, so we need a foundation that it's easy to maintain, update, review and adapt.

Similar to how we can have Infrastructure As Code, or Diagrams As Code, I was hoping we can "Visualization As Code", which is why I suggested using http://remotion.dev.

For example, see the documentation for the state hash: https://github.com/pokt-network/pocket/blob/main/shared/docs/PROTOCOL_STATE_HASH.md. The diagrams live directly in the documentation, and we can review them (with comments, discussions, git history, etc..) directly in place, rather than leaving comments externally. The fact that there’s no easy way to “suggest edit”, but rather just leave a very long list of comments (see below) is a signal that this might not be maintanble or automatable.

I realize that this is a lot more work, but I'd prefer we go with this approach for future work because I felt like there was a lot of friction to try and edit and provide feedback on this.

The primary goal isn’t just a visualization we can just show, but a visualization we can maintain.


Part 1 - High Level Overview

  1. Page 1
    1. Why is the spacing different between This is how the RT looks like… and We name branches from the OG… in the first two slides?
    2. Can replace the acronym RT with RainTree. IMO, there is enough screen space for RainTree, but if you do go with the acronym, we should define it. For example, The RainTree (RT) is a …
  2. Page 2
    1. We are using & referencing the acronym OG without defining that it stands for Originator Node. We should at least define it before using the acronym
  3. Page 3
    1. Broadcast Message: The actual data message being gossiped throughout the network.
    2. ACK: An ACKnowlegement that the message was received. Similar to a TCP ACK, but at the application layer.
    3. Remove GOSSIP from GOSSIP Adjust/Resend
    4. Replace +1 & -1 with +1 & -1 in the address list
  4. Page 4
    1. The peer list is not given by other nodes. The peer list is a dynamic list composed of the world state of staked protocols along with those continously discovered and churning during the P2P lifecycle.
    2. Please specify that the address book is also sorted by the node’s address.

Part 2 - Joining the Network

  1. NIT: All the other sections use a - between the Part X and the title, but this uses a :. Please fix.
  2. Page with subtitle Now that it has the list…
    1. I believe the There are no ACK requests part is a bit confusing, we can remove this altogether.
    2. We can replace the subtitle with: Once the new node is synched, it uses peers from the world state (i.e. staked actors) and those from its peers (i.e. full unstaked nodes) to build it's own RainTree where it itself is in position 0.
  3. Page starting with Next, the node needs to organize…
    1. NIT: Lowercase letter after the period
    2. General: I think we need to rework these slides a bit. Do you understand the idea from the code of how the list wraps around itself to put self in the beginning? I’m asking to understand this needs an additional explanation as it also impacts the following slide.
  4. Page starting with Next, the node needs to find
    1. The OG is always the top layer regardless of where it is in the overall list.
    2. The log has to have a base of 3 - this is important.

Part 3 - Determining the Left & Right


  1. The text says “10 node example” and the address book has 10 nodes, but the visualization only has 7. I think we should make all of these the same to avoid confusion.
  2. After calling Round, the result should be an integer -> this related to all computations
    1. Ideally, you should show what it looks like rounded and not rounded
  3. Layers always round to integers
  4. When you have an arrow from the right branch, it says left branch in the box and vice versa
  5. Use fractions for 0.666 and 0.333
    1. 2/3 is for the right branch
    2. 1/3 is for the left branch
  6. The left & right targets are incorrect.
    1. Once you fix everything to use integers, it should fix the numbers (currently 2 and 4).
    2. Can you confirm / show what the python simulator? When I run it using numRainTreeNodes=10 make p2p_test_generator I get the following:

Part 4 - Making it Rain

  1. Replace now it… with Now the OG node…
  2. Replace acknowledgement ping with ACK message
  3. Typo in position dfrom -> change to position from
  4. Typo in the animation. It says 2: Yes, I’m here from both nodes 2 and 4. Fix node 4.

Part 5 - Demotion

  1. Try to avoid using it and refers to it either by self, the OG or 0 so it is semantic and avoids confusion to the readers
  2. Didn’t do a deep review of this part, let’s:
    1. Fix the math similar to part 3
    2. Compare what this should be in relation to the output of numRainTreeNodes=10 make p2p_test_generator
Olshansk commented 1 year ago

When I run numRainTreeNodes=10 make p2p_test_generator using the simulator, the targets are 4 & 7:

                                                                                                     val_1                                                                                                                                                                                                          
                                  ┌────────────────────────────────────────────────────────────────────┴───────────────────────────────────┬───────────────────────────────────────────────────────────────────────────────────────────────────────┐                                                                
                                val_4                                                                                                    val_1                                                                                                   val_7                                                              
           ┌──────────────────────┴────────────┬────────────────────────────────┐                                   ┌──────────────────────┴────────────┬────────────────────────────────┐                                  ┌──────────────────────┴────────────┬────────────────────────────────┐                  
         val_6                               val_4                            val_8                               val_3                               val_1                            val_5                              val_9                               val_7                            val_1                
   ┌───────┴────┬─────────┐            ┌───────┴────┬─────────┐         ┌───────┴────┬─────────┐            ┌───────┴────┬─────────┐            ┌───────┴────┬─────────┐         ┌───────┴────┬─────────┐          ┌────────┴────┬─────────┐            ┌───────┴────┬─────────┐         ┌───────┴────┬─────────┐   
 val_7        val_6     val_8        val_5        val_4     val_6     val_9        val_8     val_4        val_4        val_3     val_5        val_2        val_1     val_3     val_6        val_5     val_1      val_10        val_9     val_1        val_8        val_7     val_9     val_2        val_1     val_7 

In the presentation, the targets are 2 & 4:

Screenshot 2022-12-04 at 6 23 38 PM

@jeremybeal11 I believe I might simply be misunderstanding what the presentation is showing relative to the actual simulation so would appreciate your pointers

jeremybeal11 commented 1 year ago

@jeremybeal11 I believe I might simply be misunderstanding what the presentation is showing relative to the actual simulation, so would appreciate your pointers

Yea, I understand, and that's where the discrepancy lies when using the formula in real-time. This is why we showed the formula in action, which makes what is shown in the visual where when you write it out, there is no mistake that what is shown is correct, and there "could" be an issue with the actual simulator itself on how it interprets its values. I mentioned this offline during the offsite we may need to have a sit-down and go over it since there are some gaps that I've already addressed in the spec doc.

cc @Olshansk and @benvangithub

Olshansk commented 1 year ago

@jeremybeal11 In preparation for our call, could you explain how you get targets 2 & 4 for the 10 node case so I can thinking about it ahead of time?

Per the simulator's output (same command as above that generates the tree), here is the list of messages sent:

Msg: [ (val_1), val_2, val_3, **val_4**, val_5, val_6, val_7, val_8, val_9, val_10 ]
Msg: [ (val_1), val_2, val_3, val_4, val_5, val_6, **val_7**, val_8, val_9, val_10 ]
Msg: [ (val_4), val_5, **val_6**, val_7, val_8, val_9 ]
Msg: [ (val_4), val_5, val_6, val_7, **val_8**, val_9 ]
Msg: [ (val_7), val_8, **val_9**, val_10, val_1, val_2 ]
Msg: [ (val_7), val_8, val_9, val_10, **val_1**, val_2 ]
Msg: [ (val_1), val_2, **val_3**, val_4, val_5, val_6 ]
Msg: [ (val_1), val_2, val_3, val_4, **val_5**, val_6 ]
Msg: [ (val_6), **val_7**, val_8, val_9 ]
Msg: [ (val_6), val_7, **val_8**, val_9 ]
Msg: [ (val_8), **val_9**, val_4, val_5 ]
Msg: [ (val_8), val_9, **val_4**, val_5 ]
Msg: [ (val_4), **val_5**, val_6, val_7 ]
Msg: [ (val_4), val_5, **val_6**, val_7 ]
Msg: [ (val_9), **val_10**, val_1, val_2 ]
Msg: [ (val_9), val_10, **val_1**, val_2 ]
Msg: [ (val_1), **val_2**, val_7, val_8 ]
Msg: [ (val_1), val_2, **val_7**, val_8 ]
Msg: [ (val_7), **val_8**, val_9, val_10 ]
Msg: [ (val_7), val_8, **val_9**, val_10 ]
Msg: [ (val_3), **val_4**, val_5, val_6 ]
Msg: [ (val_3), val_4, **val_5**, val_6 ]
Msg: [ (val_5), **val_6**, val_1, val_2 ]
Msg: [ (val_5), val_6, **val_1**, val_2 ]
Msg: [ (val_1), **val_2**, val_3, val_4 ]
Msg: [ (val_1), val_2, **val_3**, val_4 ]
jeremybeal11 commented 1 year ago

@Olshansk ofc.

Using the formula:

1) Toplayer = Round(Log3(fullListSize))
2) current layer = (top layer - 1)
3) targetListSize(TLS) = (topLayer - currentLayer) x 0.666 x Size of full list
4A) right branch = targetListSize/3
4B) left branch = targetlistSize/1.5

I ger the following results:

1) Top layer:
    1a: round(log3(10) = 2.09
2) Current layer = 1.09
3) TLS = 6.6
4) Branch calc
    4a: right branch: 6.6 / 3 = 2.2
    4b: left branch: 6.6 / 1.5 = 4.4

Assuming the Toplater, current layer, TLS is indeed correct. When you use the values outlined in PT 3 of your feedback:

Use fractions for 0.666 and 0.333
2/3 is for the right branch
1/3 is for the left branch

The values are still the same after you round up.

Olshansk commented 1 year ago

@jeremybeal11 The implementation in the source and that in the simulator results in the same thing, so there's a chance that the documentation may simply be wrong.

This is good feedback that the original specs are definitely outdated and the source (how its implemented and works) is this oruce of truth. For the future, the source code should usually act as the absolute source of truth

As I was looking through the visualizations, I just couldn't gather how we'd ever make the message being gossiped propagate through the whole network.


From the source code: https://github.com/pokt-network/pocket/blob/f0400e86846eb73f95df80f7c9d47500576b66ad/p2p/raintree/addrbook_utils.go#L22

...

func (n *rainTreeNetwork) getAddrBookLength(level uint32, _height uint64) int {
    peersManagerStateView := n.peersManager.getNetworkView()
    shrinkageCoefficient := math.Pow(shrinkagePercentage, float64(peersManagerStateView.maxNumLevels-level))
    return int(float64(len(peersManagerStateView.addrList)) * (shrinkageCoefficient))
}

...

From the simulator: https://github.com/pokt-network/rain-tree-sim/blob/main/python/simulator.py

   ...

    t1 = (i + int(n * t1_per)) % n
    t2 = (i + int(n * t2_per)) % n
    s = (i + int(n * s_per)) % n

    t1_addr = addr_book[t1]
    t2_addr = addr_book[t2]

    if t1_addr == t2_addr:
        t2_addr = None
    if t1_addr == addr:
        t1_addr = None

    def send(t: int, t_addr: str) -> None:
        counters.msgs_sent += 1
        t_s = (t + int(n * s_per)) % n
        t_book_s = shrink_list(addr_book.copy(), t, t_s)

   ...

Given the above, we'd change the formulas to this

shrinkagePercentage = 2 / 3
topLayer = ceil(Log3(fullListSize))
currentLayer = topLayer // decrements by 1 each time
targetListSize = round((shrinkagePercentage)^(topLayer - currentLayer) * fullListSize)
leftBranch = ceil(targetListSize * 1 / 3)
rightBranch = ceil(targetListSize * 2 / 3)

Then, for the example of 10 nodes at the top layer:

fullListSize = 10 
shrinkagePercentage = 2 / 3
topLayer = ceil(Log3(fullListSize))
         = ceil(Log3(10))
         = ceil(2.09)
         = 3
currentLayer = topLayer = 3
targetListSize = round((shrinkagePercentage)^(topLayer - currentLayer) * fullListSize)
               = round((2/3)^(3-3)*10)
               = 10
leftBranch = ceil(targetListSize * 1 / 3)
           = ceil(10 * 1 / 3)
           = ceil(3.33)
           = 4
rightBranch = ceil(targetListSize * 2 / 3)
            = ceil(10 * 2 / 3)
            = ceil(6.66)
            = 7

Result: val_1 sends a message to val_4 and val_7 at layer 3:

Screenshot 2022-12-08 at 2 12 43 PM

For the next layer down,

fullListSize = 10 
shrinkagePercentage = 2 / 3
topLayer = ceil(Log3(fullListSize))
         = ceil(Log3(10))
         = ceil(2.09)
         = 3
currentLayer = topLayer - 1 = 2
targetListSize = round((shrinkagePercentage)^(topLayer - currentLayer) * fullListSize)
               = round((2/3)^(3-2)*10)
               = round(6.66)
               = 7
leftBranch = ceil(targetListSize * 1 / 3)
           = ceil(7 * 1 / 3)
           = ceil(2.33)
           = 3
rightBranch = ceil(targetListSize * 2 / 3)
            = ceil(7 * 2 / 3)
            = ceil(4.66)
            = 5

Result: val_1 sends a message to val_3 and val_5 at layer 2:

Screenshot 2022-12-08 at 2 13 10 PM
Olshansk commented 1 year ago

@jeremybeal11 One of the things I was hoping for with the animation is to create the primitives so that as changes arise, or as new ideas come up, it's easy to reuse the building blocks to update the visualization (like refactoring code).

How much effort do you think it is to update these animations? Is it something our team could do?

jeremybeal11 commented 1 year ago

Hey, @Olshansk just reviewed and checked the math and the code you wrote, and it aligns with the results you sent. As you stayed, the docs are outdated and need a heavy facelift to reflect what has already been coded.

I need to take some time to dive into the code to understand how some concepts work in terms of how the address book modifies itself(removes peers from its list after it determines the L & R branches) based on the shrinkagePercentage since that part still isn't clear to me ATM now that I'm relearning some concepts.

How much effort do you think it is to update these animations? Is it something our team could do?

Do you mean to say "our" team or "your"?

In terms of re-building the visual using Remotion it can be done, but it may take some time to do.

Olshansk commented 1 year ago

As you stayed, the docs are outdated and need a heavy facelift to reflect what has already been coded.

Agreed. I'm sorry about this. It's a priority, but unfortunately, a lower priority than some other things.

I need to take some time to dive into the code to understand how some concepts

Sounds good. Treat the source code as the source of truth. Cross reference the golang implementation and the python implementation.

Like I said, I will update these docs at some point, but I can't promise when.

Do you mean to say "our" team or "your"?

I'm okay with either one. Ideally, it should be like "refactoring code" once the primitives are in place amount to 1-2 days of work.

In terms of re-building the visual using Remotion it can be done, but it may take some time to do.

One of the key goals / deliverables in the original issue is to:

Screenshot 2022-12-10 at 11 27 07 AM

Before you rebuild it, let's discuss in our next call next week if it is the right next step.

Olshansk commented 1 year ago

After several attempts, this is no longer a requirement right now.