Open greghendershott opened 3 years ago
Looks like the answers to the questions are both "no" currently:
#lang racket
(require drracket/check-syntax)

;; Expand the program in str and print the definition-target and
;; jump-to-definition annotations that check-syntax reports for it.
(define (run str)
  (define annotations
    (new (class (annotations-mixin object%)
           (define/override (syncheck:add-jump-to-definition . args)
             (printf "syncheck:add-jump-to-definition ~s\n" args))
           (define/override (syncheck:add-definition-target . args)
             (printf "syncheck:add-definition-target ~s\n" args))
           ;; Only syntax read from 'the-source is treated as ours.
           (define/override (syncheck:find-source-object stx)
             (if (eq? 'the-source (syntax-source stx))
                 'yep
                 #f))
           (super-new))))
  (define-values (add-syntax done)
    (make-traversal (make-base-namespace) #f))
  (parameterize ([current-annotations annotations]
                 [current-namespace (make-base-namespace)])
    (add-syntax (expand
                 (parameterize ([read-accept-reader #t])
                   (read-syntax 'the-source (open-input-string str)))))
    (done)))

(run
 (string-append
  "#lang racket\n"
  "(module m racket\n"
  "  (provide (struct-out s))\n"
  "  (struct s (a)))\n"
  "\n"
  "(require 'm)\n"
  "s-a"))
produces
syncheck:add-definition-target (the-source 67 68 struct:s (m))
syncheck:add-definition-target (the-source 67 68 s? (m))
syncheck:add-definition-target (the-source 67 68 s-a (m))
syncheck:add-definition-target (the-source 67 68 s (m))
syncheck:add-jump-to-definition (yep 77 84 require #<path:/Users/robby/git/exp/plt/racket/collects/racket/private/reqprov.rkt> ())
syncheck:add-jump-to-definition (yep 42 52 struct-out #<path:/Users/robby/git/exp/plt/racket/collects/racket/private/reqprov.rkt> ())
syncheck:add-jump-to-definition (yep 33 40 provide #<path:/Users/robby/git/exp/plt/racket/collects/racket/private/reqprov.rkt> ())
syncheck:add-jump-to-definition (yep 60 66 struct #<path:/Users/robby/git/exp/plt/racket/collects/racket/private/struct.rkt> ())
syncheck:add-jump-to-definition (yep 60 66 struct #<path:/Users/robby/git/exp/plt/racket/collects/racket/private/struct.rkt> ())
I like having sub-range binders for required identifiers too. However, a more immediate problem is that sub-range information does not exist for required identifiers. That binding information lives in syntax properties attached by the macros struct (and define-logger, etc.) in the defining module, just like 'disappeared-use and 'disappeared-binding.
I think we could develop a protocol to make this work (perhaps by having the defined identifiers actually be bound to macros that leave information at their use site, depending on whether or not the use is in the same module? Maybe?). If we had a good sense of what we wanted DrRacket to actually show us, it would help guide the design, IMO.
Although there might be a cool UX for drawing arrows between two files, I'm not immediately sure what that would be.
I think one interesting point is, what are the arrows? They are things to draw on the screen in a GUI, or handle some other way in a TUI. But also, they can be treated as arrows (a.k.a. arcs) in a directed graph. So, even if there is no obvious UI, they can still be useful.
And so how I came across this issue/question was from exploring how to do reliable multi-file rename. (One thing I've realized is that there are at least two useful graphs. One graph is about definitions, whether local or relating to identifier-binding from-xxx values. Another graph is about "name introductions", including renaming exports and imports, relating to identifier-binding nominal-from-xxx values.)
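To make the two graphs a bit more concrete, here is a minimal sketch that splits an identifier-binding result into its "definition" half and its "name introduction" half. The helper name binding-parts and the hash keys are hypothetical, chosen only for illustration. The only library call is identifier-binding itself.

```racket
#lang racket/base

;; binding-parts is a hypothetical helper, not part of any library.
;; For a module-level binding, identifier-binding returns a 7-element
;; list. In the terminology used above, the first two elements are the
;; from-xxx values (where the binding is defined) and the next two are
;; the nominal-from-xxx values (where the name was introduced to us).
(define (binding-parts id)
  (define b (identifier-binding id))
  (cond
    [(list? b)
     (hash 'definition   (list (list-ref b 0) (list-ref b 1))
           'introduction (list (list-ref b 2) (list-ref b 3)))]
    ;; #f (no module binding) or 'lexical: no edge in either graph.
    [else b]))

;; map is imported from racket/base but defined in a private module,
;; so the two halves are expected to differ.
(binding-parts #'map)
```

A multi-file rename would presumably follow the 'introduction edges, since a renaming import or export changes the name without changing the definition.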
Anyway that's my main motivation, even though it's not (mostly) about DrRacket UI. I don't know if that helps guide the design at all?
p.s. A simpler point here: It would be nice if jump-to-definition took you to the struct id or the field id, depending where you clicked within the sub-range binders. And that it worked like that consistently, whether the struct definition is local or imported.
Ah, yes! Multi-file rename as the motivation is very clarifying, thank you. More than happy to add stuff to support that one!
Thinking out loud:
The general issue is that, when analyzing require.rkt, desirable information is available only from define.rkt.
Available how/when? Some specific tactics I can imagine:
[…] define.rkt. (This is how jump to definition works. From identifier-binding we know that the defining modpath is define.rkt. But to find the specific definition position within, we must analyze define.rkt now, or have previously analyzed and cached this.)
[…] configure-runtime, but this is not for run-time, it's for analysis-time (probably don't abbreviate that). But I'm not sure exactly how this would/could work.
In the context of implementing a multi-file rename command, in a system that already uses a separate database? The first option is actually not so bad. You have to chase down the entire import graph to find its introduction site (a definition, or a renaming import or export), then fan back out through the graph to find all uses of that name. So you need to analyze e.g. define.rkt anyway. At such time the sub-range binders info could be captured in the database, and existing arrows can be updated (single arrows can be replaced with two or more arrows). "Now that I've analyzed define.rkt, I see that anything pointing to foo-bar should actually be updated to become separate foo and bar arrows." Something like that.
Again, just thinking out loud.
My feeling, in retrospect, is that probably we want to choose "punt" where we actually get moving on https://github.com/rfindler/fully-expanded-store . That said, option 3 I haven't really thought deeply about.
I agree. Punt was my lazy way of saying lazy.
With the lazy approach:
We analyze require.rkt, and see that there is a single require arrow for s-a. That's all we can know, for now. syncheck:add-arrow and syncheck:add-jump-to-definition work unchanged.
Later, on demand, we analyze define.rkt, and learn that s-a has sub-range binders. I wonder, how should check-syntax report this for define.rkt? Should it still do a single syncheck:add-definition-target for s-a, for backward compatibility? And also do some new method that communicates the whole story about s-a, s and a -- so that a client can, say, update the single arrow/jump for s-a to be two instead, for s and a?
I'm not entirely sure, but it sounds like we'll want two different pieces. One piece that just stores the results from files so we can access them quickly without reanalyzing things, and then some changes to the existing setup so that we don't report different information for references that are exported than for references that are local.
I'm still mulling this over. Lately, I'm leaning towards the simplest possible idea: Add a new add-sub-range-binders annotations method. Essentially it would be called whenever traverse encounters a define-values with the sub-range-binders syntax property, supplying that property value.
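As a rough illustration of the raw material such a method would hand to a consumer, the sketch below expands a small module and gathers every 'sub-range-binders property value it can find. The helper names collect-sub-range-binders and expand-string are hypothetical, and the walk looks at every syntax object rather than only at define-values forms, which is an assumption made to keep the sketch short.

```racket
#lang racket/base

;; collect-sub-range-binders is a hypothetical helper. It visits every
;; syntax object inside stx and gathers the value of the
;; 'sub-range-binders property wherever one is attached.
(define (collect-sub-range-binders stx)
  (let loop ([v stx] [acc '()])
    (cond
      [(syntax? v)
       (define prop (syntax-property v 'sub-range-binders))
       (loop (syntax-e v) (if prop (cons prop acc) acc))]
      [(pair? v) (loop (cdr v) (loop (car v) acc))]
      [else acc])))

;; expand-string is also hypothetical: read one form and fully expand it
;; in a fresh racket/base namespace.
(define (expand-string str)
  (parameterize ([current-namespace (make-base-namespace)])
    (expand (read-syntax 'example (open-input-string str)))))

;; Whether this list is non-empty depends on what the struct macro
;; attaches; the discussion above suggests that it does attach the property.
(define props
  (collect-sub-range-binders
   (expand-string "(module m racket/base (struct s (a)))")))
```

A consumer could store these values as-is and decide later whether to split existing arrows or resolve the pieces on demand.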
The upside is that it allows the consumer to handle this however it wants -- including not at all, i.e. the status quo. Some consumers might want to update existing jump-to-definitions ("split" them into two or more). Others might want to keep the existing single jumps, and handle the split on-demand (look at the position where the user invoked). Or some other strategy.
The downside being the same: It forces a consumer to decide how to handle this complexity.
However, if there's no clean, non-leaky way to hide this (?), maybe that's the least-worst approach?
Not yet convinced. Just touching base.
I think it would be best if we make traverse actually save and offer the information without expecting the client to do too too much work to take advantage of it.
It does seem wrong that the same identifier (s-a) gets one arrow when it is coming from a require and two when it is coming from the actual struct. So maybe the way we're conceptualizing the binding information that goes "through" the require is wrong. I think we should not be afraid to change the macro that defines define-struct to start using a new protocol somehow (and we can keep backwards compatibility measures in place almost certainly). Maybe this means that struct-out has to get in on the game. Maybe this means that the expander has to change. Maybe this means that sub-range-binders was done at the wrong level and the expander really should take a more active role in identifiers that are built out of pieces of a macro's input so that check syntax can be just reflecting the information that the expander has instead of managing it itself.
I wasn't presuming we could take out such a clean sheet of paper. If we can, that also works!
There is definitely some tension between the perspective that s-a is one name, with one srcloc, for one function... versus the perspective that s and a are two pieces, each with its own srcloc, of a name for one function.
If there were some way to accommodate the latter perspective, even down at the level of identifier-binding, then I think the rest of it would involve simpler relations.
Wishful thinking: A new identifier-bindings (plural) could report that s-a is made of two pieces, making it clear up-front that there should be two arrows. [Super wishful, it could report the precise srcloc of each piece. Less wishful, it could at least report some "key" for each (like {s-a a} and {s-a s}?) by which the srcloc could be found if/when analyzing the defining file.]
I've been thinking about, and working on, this more.
I have a tentative new perspective: It would be fine for check-syntax to keep the existing add-definition-target method exactly as-is, and add a new add-sub-range-bindings method. It's really not that difficult for the consumer to relate the sub-range-binders both to arrows and to definition targets. For example when using a database, a simple join causes the single/full definition target to "fan out" to become multiple arrows, one for each of the sub-range binders.
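The "fan out" join can be sketched with plain in-memory records standing in for database rows. Everything here is hypothetical: the use, piece, and arrow structs are invented record shapes, not check-syntax data structures, and the positions are made up for illustration.

```racket
#lang racket/base
(require racket/list)

;; Hypothetical record shapes for a consumer's own database.
;; A use of def-id (defined in def-file) at [beg, end) in file.
(struct use (file beg end def-file def-id) #:transparent)
;; One piece of def-id: it occupies [offset, offset+span) within the
;; name, and was written at [def-beg, def-end) in def-file.
(struct piece (def-file def-id offset span def-beg def-end) #:transparent)
;; An arrow from a use-site range to a definition-site range.
(struct arrow (file beg end def-file def-beg def-end) #:transparent)

;; Join uses with pieces on (def-file, def-id). A use with known pieces
;; becomes one arrow per piece; a use with none is returned unchanged.
(define (fan-out uses pieces)
  (append*
   (for/list ([u (in-list uses)])
     (define matching
       (filter (lambda (p)
                 (and (equal? (piece-def-file p) (use-def-file u))
                      (eq? (piece-def-id p) (use-def-id u))))
               pieces))
     (if (null? matching)
         (list u)
         (for/list ([p (in-list matching)])
           (define beg (+ (use-beg u) (piece-offset p)))
           (arrow (use-file u) beg (+ beg (piece-span p))
                  (piece-def-file p) (piece-def-beg p) (piece-def-end p)))))))

;; Illustrative data: s-a used in require.rkt, with pieces s and a
;; recorded once define.rkt has been analyzed.
(define uses (list (use "require.rkt" 89 92 "define.rkt" 's-a)))
(define pieces
  (list (piece "define.rkt" 's-a 0 1 67 68)
        (piece "define.rkt" 's-a 2 1 70 71)))
(fan-out uses pieces)
```

Before define.rkt has been analyzed, pieces is empty and fan-out returns the single use unchanged, which matches the "approximate answer until you analyze the other file" behavior.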
Admittedly, the answer for require.rkt will differ depending on whether define.rkt has already been analyzed (do we know about the sub-range-binders, yet).
However that is already the case for add-definition-target and jumping to "normal" definitions. The add-jump-to-definition method simply says, "Here is the file to look inside", and you can't do so precisely until you have an add-definition-target for the defining file. Sub-range binders don't introduce this issue, they're just another wrinkle on it. You get an approximate answer until/unless you analyze the other file.
It seems to me the crux of this is the choice: Do you want to force expanding and (at least somewhat) analyzing potentially all transitive dependencies of a file, before you can answer any questions about it?
To-date the answer has been, no. For speed, it is good to delay and be "lazy". Maybe the UX in DrRacket could be improved, i.e. it should automatically analyze the defining file when you want to jump to a definition within --- but it could be improved without changing the basic decision to use a "lazy" approach.
The answer, yes, is also interesting. Maybe the expander and/or a store of expanded syntax could eagerly obtain some of the information. For example if the syntax is already expanded, it is probably pretty cheap to walk it looking for define-values/define-syntax, and do the add-definition-target and add-sub-range-binders business. e.g. Do only that, eagerly, without doing the whole other check-syntax analysis.
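A cheap "definitions only" walk over already-expanded syntax might look roughly like the sketch below. The helper names definition-names and expand-string are hypothetical. The walk recognizes core forms by symbol name, which is a simplification: a robust version would compare bindings, for example with kernel-syntax-case from syntax/kerncase.

```racket
#lang racket/base
(require racket/list)

;; definition-names is a hypothetical helper. It returns a list of
;; (name position span) for each identifier bound by a module-level
;; define-values or define-syntaxes form, and skips expressions.
(define (definition-names stx)
  (define forms (syntax->list stx))            ; #f if stx is not a list
  (define head (and (pair? forms) (car forms)))
  (case (and head (identifier? head) (syntax-e head))
    [(define-values define-syntaxes)
     (for/list ([id (in-list (syntax->list (cadr forms)))])
       (list (syntax-e id) (syntax-position id) (syntax-span id)))]
    [(module module* #%module-begin #%plain-module-begin
      begin begin-for-syntax)
     (append-map definition-names (cdr forms))]
    [else '()]))

;; expand-string is also hypothetical: read one form and fully expand it
;; in a fresh racket/base namespace.
(define (expand-string str)
  (parameterize ([current-namespace (make-base-namespace)])
    (expand (read-syntax 'example (open-input-string str)))))

(define names
  (definition-names
    (expand-string "(module m racket/base (define x 1) (struct s (a)))")))
```

Because the walk never descends into right-hand-side expressions, its cost should be roughly proportional to the number of module-level forms rather than to the size of the whole program.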
In short, I think either approach can work fine. Either incrementally enhance the status quo. Or the "cleaner sheet of paper" idea.
I think we should imagine that we will soon (soon relative to the lifespan of drracket, anyway) be in a world where we routinely cache the results of check syntax analyzing already-compiled files. (Does this help with this particular decision?) I can also imagine that the simple join you mention could be hiding behind an API call too, when the information is available, making it even simpler for clients. I think this means that continuing with the lazy approach will work well.
This is for discussion, not yet a specific proposal.
Status quo
s-a is treated as two "sub-range binders", s and a. There are two lexical arrows (require-arrow = #f) between s and a. This is nice.
s-a is treated as a single identifier. There's one require arrow (require-arrow = #t) between s-a and "the-above.rkt". This seems less nice.
Instead?
Should there instead be two require arrows in require.rkt?
Should add-definition-target be called for s-a with the srcloc for the a within the struct form, for define.rkt? (It might be already, I haven't yet checked.)
Should add-jump-to-definition be called for the a in s-a in require.rkt? (It might be already, I haven't yet checked.)
In other words, the sub-range binders idea would apply to imported as well as local definitions.