fletcher / MultiMarkdown-6

Lightweight markup processor to produce HTML, LaTeX, and more.
https://fletcher.github.io/MultiMarkdown-6/

token_first_child_in_range, token_last_child_in_range, token_child_for_offset limited at the end of the document #256

Open DivineDominion opened 4 months ago

DivineDominion commented 4 months ago

This is another use of the AST as a data source:

Performing tree traversal, I noticed that an empty document will produce a DOC_START_TOKEN + BLOCK_EMPTY, both with a range of start = 0 and length = 0.

This defeats tree traversal implementations in token_first_child_in_range, token_last_child_in_range, token_child_for_offset that can't descend into empty tokens.

I've been working with this more intensively over the past few months and, to my surprise, found that an offset-based check and a sub-range-based check work well, like so:

  1. Does a character range _R_ include offset _O_? A check for inclusion entails that the range is not empty and that the offset falls inside it: R.start <= O && R.start + R.length > O
  2. Does a character range _Ra_ contain character range _Rb_? A check for subrange overlap will wrongly (in my view) reject two identical ranges if both are empty: (0..<0) and (0..<0) are not considered overlapping at the moment.
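
Check 1 could be sketched in C roughly as follows. The `char_range` struct and the function name are hypothetical illustrations, not MultiMarkdown's token type or API. The range is assumed to be half-open, i.e. `[start, start + length)`.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical half-open character range [start, start + length).
 * Not MultiMarkdown's token struct; used only for illustration. */
struct char_range {
    size_t start;
    size_t length;
};

/* True if `offset` falls inside `r`.
 * An empty range (length == 0) never includes any offset,
 * because start <= offset and offset < start cannot both hold. */
static bool range_includes_offset(struct char_range r, size_t offset) {
    return r.start <= offset && offset < r.start + r.length;
}
```

With this definition, a range of start = 0 and length = 0 does not include offset 0, which matches the "range is not empty" requirement stated in the check.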

For the subrange check I found that this works fine (again, to my surprise), including equal ranges:

    (Ra.start <= Rb.start
        && Ra.start + Ra.length >= Rb.start + Rb.length)
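
As a self-contained sketch, the expression above can be wrapped in a small C function to see how it treats empty and equal ranges. The `char_range` struct and `range_contains_range` name are assumptions made for this example and do not come from the MultiMarkdown sources.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical half-open character range [start, start + length). */
struct char_range {
    size_t start;
    size_t length;
};

/* True if `b` lies entirely within `a`.
 * Equal ranges contain each other, including two empty ranges
 * at the same position such as (0..<0) and (0..<0). */
static bool range_contains_range(struct char_range a, struct char_range b) {
    return a.start <= b.start
        && a.start + a.length >= b.start + b.length;
}
```

Under this check, two ranges with start = 0 and length = 0 are accepted as containing one another, while a range that extends past the end of the other is still rejected.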

Consequences:

fletcher commented 4 months ago

I'm not sure I understand what the problem is here. What are you trying to accomplish?