urbit / docs.urbit.org

https://docs-urbit-org.vercel.app
1 stars 8 forks source link

Nock specification is incorrect for instruction 6 #105

Closed peterkelly closed 2 months ago

peterkelly commented 2 months ago

The Nock specification at both the old docs site and the new one gives an incorrect definition for instruction 6:

*[a 6 b c d]        *[a *[[c d] 0 *[[2 3] 0 *[a 4 4 b]]]]       :: incorrect

Implemented as-is, this will not work; it skips two evaluation steps that are necessary. I discovered this while attempting to write a Nock interpreter based on the spec. A correct definition (derived below) is:

*[a 6 b c d]        *[a *[[c d] [0 *[[2 3] [0 *[a 4 4 b]]]]]]   :: correct

It appears that between Nock 5k and Nock 4k (?) there was an attempt made to simplify this expression which resulted in the error. The specification given in the original whitepaper is as follows:

*[a 6 b c d]        *[a 2 [0 1] 2 [1 c d] [1 0] 2 [1 2 3] [1 0] 4 4 b]

The interpreters listed here all seem to use the whitepaper version.

I arrived at the corrected version above by taking this definition and going through the following steps to simplify it:

Step 0: Original expression from whitepaper, formatted for readability

*[
    a
    2
    [0 1]
    [2 [1 c d] [1 0] 2 [1 2 3] [1 0] 4 4 b]
]

Step 1: Reduce using rule 2:

*[
    *[a 0 1]
    *[a 2 [1 c d] [[1 0] 2 [1 2 3] [1 0] 4 4 b]]
]

Step 2: Reduce first component of cell using instruction 0 (tree index 1, returning subject a), and reduce second component of cell using instruction 2:

*[
    a
    *[
        *[a 1 [c d]]
        *[a [1 0] [2 [1 2 3] [1 0] 4 4 b]]
    ]
]

Step 3: Reduce first component of innermost expression using instruction 1 (constant [c d]), and second component using distribution rule:

*[
    a
    *[
        [c d]
        [
            *[a 1 0]
            *[a 2 [1 2 3] [[1 0] 4 4 b]]
        ]
    ]
]

Step 4: Reduce first component of innermost expression using instruction 1 (constant 0), and second component using instruction 2:

*[
    a
    *[
        [c d]
        [
            0
            *[
                *[a 1 [2 3]]
                *[a [1 0] [4 4 b]]
            ]
        ]
    ]
]

Step 5: Reduce first component of innermost expression using instruction 1 (constant [2 3]), and second component using distribution rule:

*[
    a
    *[
        [c d]
        [
            0
            *[
                [2 3]
                [
                    *[a 1 0]
                    *[a 4 4 b]
                ]
            ]
        ]
    ]
]

Step 6: Reduce first component of innermost expression using instruction 1 (constant 0):

*[
    a
    *[
        [c d]
        [
            0
            *[
                [2 3]
                [
                    0
                    *[a 4 4 b]
                ]
            ]
        ]
    ]
]

And that's it. Condensed to a single line, this is:

*[a *[[c d] [0 *[[2 3] [0 *[a 4 4 b]]]]]]

Comparing the two pretty-printed versions side-by-side, we can see where the two missing evaluation steps are:

Correct version                          Incorrect version
(derived above)                          (given in spec on website)

*[                                       *[
    a                                        a
    *[                                       *[
        [c d]                                    [c d]
        [                                
            0                                        0
            *[                                       *[
                [2 3]                                    [2 3]
                [                        
                    0                                        0
                    *[a 4 4 b]                               *[a 4 4 b]
                ]                        
            ]                                        ]
        ]                                
    ]                                        ]
]                                        ]
peterkelly commented 2 months ago

Actually it's just occurred to me that this could be a misreading of the expression on my part, due to it being in a pretty-printed form that excludes extra square brackets in the tail of a cell (which a parser would pick up). Is this the case? If so, it might be worthwhile including all the square brackets to avoid ambiguity.

I've also realised that what I described as "evaluation steps" above are actually cons operations (these are what I initially missed in my implementation based on my reading of the spec; the number of * instances is the same).

peterkelly commented 2 months ago

It looks like I was right in my previous comment; the original expression does appear to be correct, and having re-implemented the operation now after understanding what was going on it works correctly. Closing this issue as it doesn't seem to be a problem after all.