mmosko / icn-flic-manifest

FLIC Manifest RFC draft
0 stars 0 forks source link

Comments from Ken Calvert #1

Open daveoran opened 3 years ago

daveoran commented 3 years ago

I have reviewed draft-irtf-icnrg-flic-02. Overall it is a very good and useful document.

Context: I came to the document with general knowledge of CCN/NDN and understanding of the manifest concept, but without knowledge of the details of any proposed implementation. I approached it as if I were going to implement or use the protocol from scratch. There may be relevant drafts I haven't read, which would affect my understanding.

Two parts to this: (I) high-level suggestions on presentation, and (II) corrections and nits.

I. High-level/global suggestions


  1. Define a consistent terminology early, giving the meanings in terms of BOTH CCNx and NDN standard terminology, and then use that throughout. Currently some terms are defined only in CCNx terms, while others need to be more precisely defined in standard terms. I guess what I'm saying is the "Terminology" section 3.1 did not help me much. Example: the relationship among Links, Locators, Namespaces, Names, and Prefixes all need to be made clear. Some of this may be due to this draft not yet reflecting the RFC terminology. (I'd volunteer to help with that, as it would help me understand the latter better.)

  2. The inode analogy is useful in the document's abstract and introduction for giving some intuition, but it may not be helpful beyond that. The term "pointer", for example, is a basic computer science concept, and could be defined without citing inodes. In some places (Section 4 and Appendix A) the inode model does seem most appropriate, while in others (Section 3), trees seem like the more useful paradigm, but because I was expecting inode-like structure, it was confusing.

  3. Section 2 is more like a list of features than a set of "Design Goals". Stating design goals is fine, but they should be formulated as such, rather than as selling points of the protocol. (This would also help reviewers of the specification.)

  4. Move the use cases (Sections 4.1, 4.3-4.5) earlier in the document. The notation used is self-explanatory, and IMO illustrates the utility and flexibility of the design better than what comes before. For example, the earlier sections give the impression that it is necessary (or at least usual) to have BOTH Root and Top manifests (i.e., two levels of indirection); the use cases show that this isn't required, but also how it can be useful.

  5. Putting the section containing the grammar earlier in Section 3 might make the rest of that section easier to understand. But I'm not sure - see also #0 above.

  6. I think "preferred" should be avoided in a specification. Better to state what an approach or technique is good for (and why!), and let the reader decide whether it applies to any particular case (see also #3 above). I also suggest avoiding "typical" unless the scope is clear.

    II. Corrections/nits


  7. Introduction

    7th paragraph - the sentence "A FLIC manifest encodes locators the same for both ICN protocols, though they are encoded differently in the underlying protocol." is self-contradictory. Suspect the first "encodes" should be something else. Is this sentence necessary?

    3.1 Terminology

    Internal Manifest - last sentence, s/direct manifests/direct pointers/ .

    3.3 Namespaces

    "If using namespaces, typically there are two defined: one for the manifest namespace and one for the application data namespace. If the two are the same, they can share a namespace." This seems to be a tautology, although perhaps it's just semantic ambiguity about what "the two" refers to.

    3.4 Manifest Metadata

    2nd paragraph seems out of place - it's completely unclear what it has to do with the first paragraph, or with the section header.

    3.7.1 Traversal

    "The algorithms in Figure 2 show the in-order forward traversal code ..." Actually the code shows preorder [sic] traversal; "in-order" is a different thing.

    Next-to-last line of reverse_preorder should be indented; the last line should not be indented.

    3.8 - I did not go over this carefully.

    3.9.1 and 3.9.2 - both contain, in places, sentences like: "The Root Manifest content object has a name used to fetch the manifest....It has a set of Locators used to fetch the remainder of the manifest." This makes it seem like double-indirection is required. Should there be an "if any" or similar in there somewhere?

    3.10 Figure 9 - Including this figure at this point is not very useful - the reader understands indirection by this point, and it just raises the question why you would want to require three-deep manifest retrievals and 55% overhead to get to the actual data.

    4.2 Seeking

    "...how to compute the byte offset of the data pointed at by pointer P_i, offset_i. In this formula, let P_i represent the Size value of the i-th pointer."

    I think you mean to say "...by pointer P_i, call it offset_i" in the first sentence, then "let P_i.size represent..." in the second sentence.

    In the formula "offseti = \sum{i=1}^{i-1} Pi.size", RHS should be \sum{k=1}^{i-1} P_k.size.

    The code after the second paragraph is also wrong, the condition should be

    if (P < offset + P_i.size) ...

    Also the code indentation is a little off.

    Paragraph below Figure 13 is garbled, the P_i in the last sentence should be P_N, and the formula should be:

    D - ((N-1) * L)

    Then the sentence after that basically says the same thing again. The algorithm seems right though.

    4.5 Re-publishing a FLIC under a new name.

    Second bullet - "would like a local result to appear." - not clear what this means. Apparently here prefixes (/beta, /gamma) are being used as names of "sites". One gets the idea that there is some common understanding that some parts of the namespace are tied to topology. If that's the case, can some documentation be cited?

    Same paragraph, "maniest" -> manifest.

    Appendix A - code for building trees - I think interior_direct() is not indented properly, should be at the same level as interior_indirect().

    Also, the statement in interior_indirect(): "reserve_count = min(m, segment.tail - segment.head)" - I suspect that should be

    reserve_count = min(k, segment.tail - segment.head)

    When I worked an example with 100 data blocks, k=20 and m = 5, I got a top manifest with 5 direct pointers and 4 indirect pointers, the latter pointing to 2nd-level manifests with 20, 25, 25, and 25 direct pointers, respectively. The 5 direct pointers (when there is room for 20) seems to contradict the comment that "we want the top of the tree packed". With the suggested change, I end up with 20 direct pointers and 4 indirect in the top manifest, with the 2nd-level manifests containing 5, 25, 25 and 25 direct pointers.

    Ken


    icnrg mailing list

    icnrg@irtf.org

    https://www.irtf.org/mailman/listinfo/icnrg

daveoran commented 3 years ago

Responses to Ken Calvert comments with changes for -03 version

On 3 May 2020, at 18:29, Ken Calvert wrote:

I have reviewed draft-irtf-icnrg-flic-02. Overall it is a very good and useful document.

Context: I came to the document with general knowledge of CCN/NDN and understanding of the manifest concept, but without knowledge of the details of any proposed implementation. I approached it as if I were going to implement or use the protocol from scratch. There may be relevant drafts I haven't read, which would affect my understanding.

Two parts to this: (I) high-level suggestions on presentation, and (II) corrections and nits.

I. High-level/global suggestions

  1. Define a consistent terminology early, giving the meanings in terms of BOTH CCNx and NDN standard terminology, and then use that throughout. Currently some terms are defined only in CCNx terms, while others need to be more precisely defined in standard terms. I guess what I'm saying is the "Terminology" section 3.1 did not help me much. Example: the relationship among Links, Locators, Namespaces, Names, and Prefixes all need to be made clear. Some of this may be due to this draft not yet reflecting the RFC terminology. (I'd volunteer to help with that, as it would help me understand the latter better.)

I believe this is substantially cleaned up for -03, and refers to the RFC version of the terminology document. We also moved the information about terminology usage earlier in the introduction, with a forward reference to the FLIC-specific terminology defined in the design section.

  1. The inode analogy is useful in the document's abstract and introduction for giving some intuition, but it may not be helpful beyond that. The term "pointer", for example, is a basic computer science concept, and could be defined without citing inodes. In some places (Section 4 and Appendix A) the inode model does seem most appropriate, while in others (Section 3), trees seem like the more useful paradigm, but because I was expecting inode-like structure, it was confusing.

The inode analogy has ben toned down a lot in -03. We still use “pointer” as the general term to describe the things that point to data objects and sub-manifests.

  1. Section 2 is more like a list of features than a set of "Design Goals". Stating design goals is fine, but they should be formulated as such, rather than as selling points of the protocol.
    (This would also help reviewers of the specification.)

Right. We changed the title to “Design Overview” and did some rewording to match.

  1. Move the use cases (Sections 4.1, 4.3-4.5) earlier in the document. The notation used is self-explanatory, and IMO illustrates the utility and flexibility of the design better than what comes before. For example, the earlier sections give the impression that it is necessary (or at least usual) to have BOTH Root and Top manifests (i.e., two levels of indirection); the use cases show that this isn't required, but also how it can be useful.

I decided not to move these for now, as they contain lots of details that are explained in the earlier sections. Putting them first would wind up with a lot of forward references, and possibly confusion. One possibility to where some feature is explained in the specification, if it relates directly to a use can example, put a forward reference to that use case so people know to look there (although we haven’t done that yet).

  1. Putting the section containing the grammar earlier in Section 3 might make the rest of that section easier to understand. But I'm not sure - see also #0 above.

The hope is that people can read the descriptions with references to the grammar without having to actually look at the grammar. We didn’t move it.

  1. I think "preferred" should be avoided in a specification. Better to state what an approach or technique is good for (and why!), and let the reader decide whether it applies to any particular case (see also #3 above). I also suggest avoiding "typical" unless the scope is clear. “Preferred” is gone. No use of the word “typical anywhere either.

II. Corrections/nits

  1. Introduction 7th paragraph - the sentence "A FLIC manifest encodes locators the same for both ICN protocols, though they are encoded differently in the underlying protocol." is self-contradictory. Suspect the first "encodes" should be something else. Is this sentence necessary?

Fixed. It now says “A FLIC manifest represents locators in the same way for both ICN protocols, though they are encoded differently in the underlying protocol.“

3.1 Terminology Internal Manifest - last sentence, s/direct manifests/direct pointers/ .

Fixed.

3.3 Namespaces "If using namespaces, typically there are two defined: one for the manifest namespace and one for the application data namespace. If the two are the same, they can share a namespace." This seems to be a tautology, although perhaps it's just semantic ambiguity about what "the two" refers to.

This has been completely rewritten, as we now talk about “Name Constructors” rather than namespaces. Hopefully this is much improved in both content and presentation.

3.4 Manifest Metadata 2nd paragraph seems out of place - it's completely unclear what it has to do with the first paragraph, or with the section header.

Did a bit of rewriting to better connect the text with the overall topic of manifest metadata

3.7.1 Traversal "The algorithms in Figure 2 show the in-order forward traversal code ..." Actually the code shows preorder [sic] traversal; "in-order" is a different thing.

fixed.

Next-to-last line of reverse_preorder should be indented; the last line should not be indented.

fixed.

3.8 - I did not go over this carefully.

3.9.1 and 3.9.2 - both contain, in places, sentences like: "The Root Manifest content object has a name used to fetch the manifest....It has a set of Locators used to fetch the remainder of the manifest." This makes it seem like double-indirection is required. Should there be an "if any" or similar in there somewhere?

well, if the root and top manifests are separate content objects, then there is at least that level of indirection.

3.10 Figure 9 - Including this figure at this point is not very useful - the reader understands indirection by this point, and it just raises the question why you would want to require three-deep manifest retrievals and 55% overhead to get to the actual data.

Good question…I think some discussion making clear everything can be collapsed into a single content object for the case where the underlying set of pointer fit, and indirection through multiple names does not add any value. ***TODO: discussion about collapsing manifests.

4.2 Seeking

"...how to compute the byte offset of the data pointed at by pointer P_i, offset_i. In this formula, let P_i represent the Size value of the i-th pointer." I think you mean to say "...by pointer P_i, call it offset_i" in the first sentence, then "let P_i.size represent..." in the second sentence.

Got it - yes - also set off the variables in fixed-width font.

In the formula "offseti = \sum{i=1}^{i-1} Pi.size", RHS should be \sum{k=1}^{i-1} P_k.size.

Yup.

The code after the second paragraph is also wrong, the condition should be if (P < offset + P_i.size) ... Also the code indentation is a little off.

Fixed.

Paragraph below Figure 13 is garbled, the P_i in the last sentence should be P_N, and the formula should be:

D - ((N-1) * L)

Got it.

Then the sentence after that basically says the same thing again. The algorithm seems right though.

4.5 Re-publishing a FLIC under a new name.

Second bullet - "would like a local result to appear." - not clear what this means. Apparently here prefixes (/beta, /gamma) are being used as names of "sites". One gets the idea that there is some common understanding that some parts of the namespace are tied to topology. If that's the case, can some documentation be cited?

Did some rewording to make this clearer.

Same paragraph, "maniest" -> manifest.

Fixed.

Appendix A - code for building trees - I think interior_direct() is not indented properly, should be at the same level as interior_indirect().

Hmmm, looks properly indented in the current text.

Also, the statement in interior_indirect(): "reserve_count = min(m, segment.tail - segment.head)" - I suspect that should be

reserve_count = min(k, segment.tail - segment.head)

Yup

When I worked an example with 100 data blocks, k=20 and m = 5, I got a top manifest with 5 direct pointers and 4 indirect pointers, the latter pointing to 2nd-level manifests with 20, 25, 25, and 25 direct pointers, respectively. The 5 direct pointers (when there is room for 20) seems to contradict the comment that "we want the top of the tree packed". With the suggested change, I end up with 20 direct pointers and 4 indirect in the top manifest, with the 2nd-level manifests containing 5, 25, 25 and 25 direct pointers.

TODO: Need to double-check this.

Ken