`drracket:grouping-position` vs. `{forward backward}-match` methods of `color:text<%>`

racket / drracket

DrRacket, IDE for Racket

http://www.racket-lang.org/

Other

445 stars 93 forks source link

`drracket:grouping-position` vs. `{forward backward}-match` methods of `color:text<%>` #527

Closed greghendershott closed 2 years ago

greghendershott commented 2 years ago

In expeditor @mflatt defined a like-text% object that implements some color:text<%> methods including {forward backward}-match, IIUC to supply to drracket:indentation and/or drracket:range-indentation functions.

I'm trying to understand the relation of these to the 'forward and 'backward flavors of the new drracket:grouping-position.

Are they equivalent, or is there some subtle difference?
In some scenarios would a lang implement one using the other?
For a tool providing expression navigation commands, if a lang supplies drracket:grouping-position, should a tool prefer that (and {forward backward}-match are just a get-info default)?
Is grouping-position for forward/backward merely a preference/convenience --- or, could it be meaningfully different for some lang and a tool should definitely avoid using {forward backward}-match?

Maybe the answers are in the DrRacket and/or expeditor source code and eventually I'll go look, but I wanted to ask the question here since it goes to intent. (Also this might become a FR or PR to add a little explanation/discussion to the docs.)

More context for why I ask:

Emacs has the concept of a forward-sexp-function -- a variable the value of which is a function. The forward-sexp function uses that if non-nil. So ~= a parameter.

(Note: forward-sexp is actually forward/backward in one function; there's a count arg that may be negative. So this is about both directions.)

One wrinkle is that if scanning fails, e.g. you can't move forward anymore, forward-sexp signals an exception with that failed position. And various Emacs commands built using forward-sexp, rely on that -- they might catch that signal and use the position. So this is behavior that a forward-sexp-function needs to emulate.

So I can do the failed-position behavior with something implemented much like {forward backward}-match, scanning tokens --- but could that possibly be wrong if a lang supplies drracket:grouping-position?

mflatt commented 2 years ago

Forward and back expression navigation are meant to be potentially different than just navigating matches. Shrubbery support does something different, for example. So, yes, using that when supplied and falling back to match-based navigation is the intent.

Since drracket:grouping-position is new, we can change the protocol if that helps resolve a mismatch with Emacs.

greghendershott commented 2 years ago

Thanks for the quick reply.

EDIT: See below instead.

~I'll need to work on it more to be sure, but:~

~Maybe an optional fail argument (-> natural? any) that defaults to (lambda _ #f) would preserve the current behavior of just returning #f, while allowing the failing position to be reported in some way the caller wants.~

~However that would change the return contract from (or/c #t #f natural?) to effectively any, which isn't ideal.~

~So I could also imagine redesigning this to return something like:~

(or/c
 (cons/c 'none natural?) ;the old #f enhanced to be a failure position
 'use-s-expression       ;the old #t renamed to be more self-evident
 natural?)               ;a success position

However I need to research the under-documented meaning of "the failure position" wrt Emacs forward-sexp. I'm still reloading my brain with many details from a year ago. I'll report back.

greghendershott commented 2 years ago

Forward and back expression navigation are meant to be potentially different than just navigating matches. Shrubbery support does something different, for example.

If a lang lexer has produced tokens that open or close expressions/blocks, then it sure seems like forward/backward matching using those tokens would give the positions of the expressions/blocks.

So when you have time, could you please say more about what group-positions could say differently than matching, and how/why?

Maybe there's a simple example that would make me (and potentially other doc readers) say, "Oh, of course."

Is it something like, the lang uses off-side rule indenting, but doesn't produce open/close tokens?

I think some off-side rule langs tokenize indent as open/close curly braces (maybe Haskell?), which they also allow. But maybe some langs don't want to, or can't, and grouping-positions is for them?

greghendershott commented 2 years ago

I think I figured out (at least part of) the rationale. I was confused because a year ago I had been thinking of a design where a lang lexer could produce open and close tokens for multi-character keyword lexemes like "begin" and "end", as well as single-char delimiters like parens. However that's not the design here; drracket:paren-matches and color-lexer's pairs argument are all (listof (list/c symbol? symbol?)).

So I guess the story for such langs (using say "begin" and "end") is that forward/backward-match definitely can't/won't do anything interesting; the idea is to use drracket:grouping-position. And I guess same story for an off-side rule lang.

greghendershott commented 2 years ago

To follow up on what grouping-position should do when it can't move forward/backward:

Above I suggested having it return, not just #f, but also the failure position. However:

Emacs forward-sexp (or a forward-sexp-function) is supposed to signal the position that blocked advancing, and the position after it:

If unable to move over a sexp, signal `scan-error' with three arguments: a message, the start of the obstacle (usually a parenthesis or list marker of some kind), and end of the obstacle.

What this seems to be saying (setting aside the message, which is N/A), "please give the bounds of the token where it stopped". So it's two positions.

[[[This is then used by Emacs functions like up-list that call forward-sexp. That's probably N/A as an end user "move up list" command b/c I can use grouping-position up for that. But it might help third party Emacs packages continue to work well.]]]

It seems that shrubbery-grouping-position forward/backward doesn't actually return #f when it can't move --- it returns the same position passed in. That's not what's documented, but, I think it's fine -- maybe even better?

Therefore: Combining these two, my wrapper of drracket:grouping-position can check for it not moving, give the position to get-token-range to get both positions, and return those to the Emacs front end.

So I think the only open action for this issue, is to clarify the "can't move" return value --- is it supposed to be #f (as documented) or the position (as shrubbery-grouping-position seems to do)?

I think I'd prefer the latter. Since it's a brand new spec maybe it would be good to pick/enforce just one.

Otherwise I guess I could defensively code for (or #f (= old new)).

greghendershott commented 2 years ago

I left a comment then deleted it quickly; you may have gotten an email. I confused myself for a second, about a "count" argument that's not actually present in drracket:grouping-position. Please ignore.

Here is what I'm doing to support a forward-sexp-function in Emacs. It behaves well with shrubbery-grouping-position. It handles both the documented #f and the actual "unchanged position" values.

    ;;; Something for Emacs forward-sexp-function etc.

    (define/public (grouping gen pos dir limit count)
      (cond
        [(<= count 0) pos]
        [else
         (block-until-updated-thru gen
                                   (case dir
                                     [(up backward) min-position]
                                     [(down forward) max-position]))
         (let loop ([pos (sub1 pos)] ;1.. -> 0..
                    [count count])
           (define (failure-result) ;can't/didn't move
             (call-with-values (λ () (get-token-range pos)) list))
           (match (grouping-position this pos limit dir)
             [#f (failure-result)]
             [#t 'use-default-s-expression]
             [(? number? new-pos)
              (cond [(< 1 count) (loop new-pos (sub1 count))]
                    [(= new-pos pos) (failure-result)]
                    [else (add1 new-pos)])]))])) ;0.. -> 1..

So if you're fine with users like me doing the defensive thing (I'm fine with that) you could just close this issue (maybe update a phrase in the docs).

mflatt commented 2 years ago

So far, it makes sense to me that shrubbery-grouping-position should change to return #f when it doesn't move. Could it make sense in some situation to claim success, even when the position doesn't change? Or is the idea of using the same position as "no move possible" better because it avoids that possible distinction and so usefully narrows the space of behavior?

greghendershott commented 2 years ago

TL;DR: I think the latter, although it's not a strong opinion.

Although I don't want to over-think this: With APIs sometimes I try to think of mistakes people might make, and/or unnecessary "ceremony".

From that point of view:

I think it's better to return the original position -- especially if the text% "primitives" sort of lead you to do that (?) -- instead of asking grouping-position authors to notice and carefully return #f instead.
The indentation and range-indentation functions return #f to mean "use default s-expression behavior". So it is a little "weird" that grouping-position instead uses #t for this. The "ternary" result is easier to mess up.

So I lean toward having it return the position (changed or not), or #f to mean "s-expression", and that's it.

Having said all that, on the other hand:

If the number of people writing such things can be counted on one hand, and it's easy to get help, maybe it doesn't matter.
Bugs with navigation in tools are fairly blatant (and in my experience highly likely to be reported :smile:) so it's unlikely mistakes here would subtly fester for long.

On the tool/consuming end, I will probably do something defensive, regardless.

mflatt commented 2 years ago

@greghendershott Thanks for the explanation!

I was convinced... but then looked at how the newly extracted Racket-navigation functions will end up providing the implementation of methods like find-up-sexp and get-backward-sexp. I think some of those implementations can return the input position as opposed to #f, at least in some cases. I'm not sure, but if that does happen, then it seems like there's also chance that something relies on the current specific behavior. So, I'm inclined to leave this alone, after all.

mflatt commented 2 years ago

I've pushed changes to syntax-color and gui to help sort this out.

The syntax-color package now provides a color-textoid<%> interface that specifies the subset of text:color<%> methods that indentation and navigation functions can use in an environment without framework or racket/gui. The syntax-color package also provides the default Racket S-expression indentation and navigation functions that were formerly built into text:racket-mixin. The text:racket-mixin implementation calls the ones in syntax-color; for example, the find-up-sexp method is implemented using the racket-up-sexp function that uses only methods in color-textoid<%>.

The default indentation implementation uses a table of rules that, in general, can be configured as a preference in DrRacket. The syntax-color layer exports a default configuration and a conversion from a serializable configuration to a function that can be passed to the S-expression indenter from syntax-color. So, a configuration could be loaded and parsed without using framework.

Finally, expeditor now uses the Racket indenter from syntax-color by default.

I think all those changes might close this issue, but I'm not sure.

greghendershott commented 2 years ago

Thank you!!

I caught up with these changes this morning, as well as applying some more of your recent commits in expeditor/object.rkt to my similar class. So I have commit 4854c32 and tests against snapshot pass.

I'm still planning to move that to syntax-color-lib as we discussed. But for the next N days I'd like to dog-food and exercise this more, while it's still in the same repo as Racket Mode's .el and .rkt code that uses it.

Although I'd like to follow up on https://github.com/racket/expeditor/issues/10 I think we can close this one!! :tada: