racket / drracket

DrRacket, IDE for Racket
http://www.racket-lang.org/
Other
454 stars 93 forks source link

Outline for syncheck:add-arrow/name-dup/pxpy/renamable #417

Open sorawee opened 4 years ago

sorawee commented 4 years ago

@rfindler here's an implementation of what I discussed earlier in #415.

There are two problems at hand: (1) how can language + macro indicate when it's safe (or unsafe) to rename an identifier? (2) how can Check Syntax provide these information to clients (e.g., DrRacket, Racket Mode) in a backward-compatible manner?

For (2), Racket Mode currently uses syncheck:add-arrow/name-dup/pxpy to decide whether or not a variable is rename-able. For example, Racket Mode deliberately restricts renaming to local variables (require-arrow in syncheck:add-arrow/name-dup/pxpy is #f). This suggests that the fix is to extend syncheck:add-arrow/name-dup/pxpy to syncheck:add-arrow/name-dup/pxpy/renamable. The additional argument renamable? would indicate if renaming either end should affect the other end. When require-arrow is not #f, renamable? will be #f.

Racket Mode would implement both syncheck:add-arrow/name-dup/pxpy and syncheck:add-arrow/name-dup/pxpy/renamable to support both old and new versions of DrRacket. At runtime, it can first check if syncheck-annotations<%> has a method syncheck:add-arrow/name-dup/pxpy/renamable, and if it does, syncheck:add-arrow/name-dup/pxpy would be disabled. When syncheck:add-arrow/name-dup/pxpy/renamable is not in the interface, syncheck:add-arrow/name-dup/pxpy/renamable would simply be unused.

For (1), since currently there's no restriction, it seems to me that to make it backward compatible, it should be an opt-out mechanism. That is, all local connections would make renamable? #t by default, but a syntax property inhibit-rename? setting to #t of identifiers in either disappeared-binding or disappeared-use would make renamable? #f.

The PR is only for discussion. At the very least it still needs tests and documentation. Not ready to be merged.

rfindler commented 4 years ago

I think this is reasonable, but I also think that there are few (very few) languages where you can rename identifiers that aren't spelled the same way. I have mixed feelings about backwards incompatibility here, simply because a lot of the current renaming behavior just looks just like a plain bug.

We could generalize the #t/#f to a property that names a predicate for determining if the renaming is okay. The predicate could be "always okay" or "only if same characters" or "only if the characters are the same in a case-insensitive way" and the default would be the case-insensitive comparison?

The other thing I'm unsure of is where to put this property. The way case-sensitivity works is at the reader, not the expander, so it seems a bit strange for the property to be coming from the macros instead of the reader (somehow). For example, someone might make a meta-language that changes reader parameters and the macros won't know what property to put in.

greghendershott commented 4 years ago

Bouncing among the various issues and PRs, I think I'm getting confused what specific bugs we're trying to fix, and/or features to add.

Having said that:

When I did the rename feature for racket-xp-mode, I had it give a user error "Can only rename local definitions, not imports" when require-arrow is not #f. That seemed "obvious" to me, then. Thinking more, now, I can imagine things that could happen if require-arrow were #t:

And so on...?

And if require-arrow were 'module-lang-require, then something like all of the above but adding an explicit require in the first place to shadow the #lang imports??

Those would be new features, not bug fixes (IMHO). How desirable they would be, relative to the cost of doing it, I don't know.


greghendershott commented 4 years ago

p.s. I guess the new feature imaginings are all hopelessly #lang racket-centric, or at least presume a certain set of surface require forms and sub-forms. They'd only work reliably by working with raw #%require forms, which many user would probably not like to see suddenly appearing in their source. So probably that whole comment is a big "never mind", sorry.

rfindler commented 4 years ago

The prefix-in renaming maybe already works (as the traversals code is smart about that and puts normal arrows in). I do agree that those are fairly racket-specific ideas but they do seem like good ones and it might be nice to think about ways to make them work in a language-agnostic way.

That said, the problem here stems from the observation that we don't seem to have the right information in the fully expanded code to know if a binding arrow implies a renaming possibility or not. @sorawee suggested we could look at the buffer and just see if the head and tail of the arrows are ranges in the editor with the same sequences of letters in them. I think this is a great idea but we are currently thwarted by languages that are case-insensitive and the fear of the unknown (specifically that there are languages out there relying on the non-existence of this check that want to let you rename two different names to become the same name ... for some unknown reason).

So we are discussing and trying to figure out a way forward with the goal of somehow having the traversals.rkt code provide more information to racket-mode and to DrRacket's GUI about which arrows correspond to a renaming possibility and which ones don't.

I hope this makes some sense!

sorawee commented 4 years ago

@rfindler

I have mixed feelings about backwards incompatibility here

What backwards incompatibility? I believe my proposal is backwards compatible.

We could generalize the #t/#f to a property that names a predicate for determining if the renaming is okay. The predicate could be "always okay" or "only if same characters" or "only if the characters are the same in a case-insensitive way" and the default would be the case-insensitive comparison?

The syntax property 'inhibit-renaming? (which currently is either #t or #f) is attached to identifiers. How do you imagine the "predicate" that you mentioned to be supplied? As a part of get-info? Or as a syntax property?

the default would be the case-insensitive comparison

You convinced me in the other discussion to stay away from textual code, so I don't think it's a good idea make case-insensitive comparison the default.

To make it backward compatible, the default of renamable? should be #t (or the predicate (lambda _ #t)).

Note that renaming only applies when an arrow is drawn. There's no arrow between a and A in:

#lang racket
(define a 1)
(define A 1)
A

But there's an arrow between a and A in:

#lang r5rs
(define a 1)
A

In both cases, the default renamable? as #t works perfectly.

The other thing I'm unsure of is where to put this property. The way case-sensitivity works is at the reader, not the expander, so it seems a bit strange for the property to be coming from the macros instead of the reader (somehow). For example, someone might make a meta-language that changes reader parameters and the macros won't know what property to put in.

One use case of 'inhibit-renaming? in a macro is https://github.com/racket/racket/pull/3391. Consider:

#lang racket 
(provide (all-defined-out))
(define a 1)

With the PR, there's an arrow from a to all-defined-out. But renaming a should not rename all-defined-out. When (provide (all-defined-out)) expands to (#%provide a), it should attach the following syntax property for an arrow to be drawn, while inhibiting renaming:

(syntax-property 
 <syntax a>
 'disappeared-use
 (syntax-property
  (adjust-srcloc-to-all-defined-out <syntax a>)
  'inhibit-renaming?
  #t))
rfindler commented 4 years ago

re backwards incompatibility: yes your plan is backwards compatible but maybe some backward incompatibility is acceptable here because we can see this as fixing bugs. That is, I see the basic issue here as a balance between having to support (as yet unknown) languages that want to rename things whose names are different (so far, case-insensitivity is the only real example we know of) and having to go change a lot of macros to insert this new property into them. I hope this helps clarify my earlier comments.

sorawee commented 4 years ago

@greghendershott

Bouncing among the various issues and PRs, I think I'm getting confused what specific bugs we're trying to fix, and/or features to add.

My very original goal is to draw arrows from definitions to all-defined-out, and that's implemented at https://github.com/racket/racket/pull/3391. However, implementing it leads me to discover that Check Syntax's renaming is broken. For example:

#lang racket
(require racket/list)
(provide (all-defined-out))

Renaming require to my-require would result in:

#lang my-require 
(my-require  racket/list)
(my-require  (my-require))

Indeed, Racket Mode doesn't have this problem because as you said, it doesn't allow renaming when require-arrow? is not #f. However, the arrow to all-defined-out will start to screw up Racket Mode, since require-arrow? is #f in that case.

So I'm trying to come up with a general mechanism to specify when it's safe to rename.

Thinking more, now, I can imagine things that could happen if require-arrow were #t:

I think Check Syntax could be improved in this aspect. Check Syntax currently ad-hoc recognizes #%require specially and attempts to reverse engineer the fully-expanded form. There are various problems with this approach (e.g., #407).

A more principled solution, I believe, is by attaching syntax property disappeared-binding when expanding require, in the same way I'm attaching 'disappeared-use when expanding provide. It's less likely to be buggy, and as a plus, we can have a more fancy arrow drawing. For instance, currently an arrow is drawn from first to racket/list in:

#lang racket/base
(require (only-in racket/list first))
first

But by adjusting the expansion of only-in, we can make it draw an arrow from first to first easily, and I believe this would be the infrastructure for the "new features" that @greghendershott mentioned.

sorawee commented 4 years ago

Currently there's an arrow from ' to quote in:

#lang racket/base

(define (quote x) 1)
(define abc 3)
'abc

This would be another place where we should inhibit renaming, probably by adding the syntax property via the reader. Technically it's OK to rename, but because there's no space between ' and abc, the resulting code is likely going to be incorrect.

EDIT: also, abc won't be wrapped in the new symbol.