jackfirth / resyntax

A Racket refactoring engine
Apache License 2.0
58 stars 10 forks source link

Refactoring rules that produce broken code shouldn't be applied #10

Open jackfirth opened 3 years ago

jackfirth commented 3 years ago

Currently if a replacement generated by a rule produces code that fails to compile, it's applied anyway. The tool doesn't even check for compilation success afterwards, so this silently produces broken code. Instead, any replacements that don't compile should be ignored.

AlexKnauth commented 2 years ago

Another way that refactoring rules can produce broken code is if the output includes function that are re-defined in the scope around it. For example:

#lang racket/base

(define (and x y) (list x y))

(define (member? e l)
  (if (member e l)
      #t
      #f))

Would be incorrectly refactored into:

#lang racket/base

(define (and x y) (list x y))

(define (member? e l)
  (and (member e l)
       #t))

Which does "compile" but is incorrect due to the output using identifiers that are shadowed in that scope.

jackfirth commented 2 years ago

Yeah, good catch. Can you open a new issue for just that specific case?

I think it should be fixable by looking at the identifiers and scopes in the replacement syntax object. I might also need to change resyntax to add a scope to the input syntax object before handing it to a rule and then remove that scope from the output of the rule, so we can figure out which identifiers the rule introduced.

AlexKnauth commented 2 years ago

On the issue, let-to-define tricky case, https://github.com/jackfirth/resyntax/issues/148#issuecomment-893946390 shows a g function that gets incorrectly refactored. However, a much simpler version of g that also fails in the same way is this:

test: "let binding with conflicting define inside"
------------------------------
(define (g)
  (let ([x 'outer])
    (define x 'inner)
    x))
------------------------------

which incorrectly gets refactored to

(define (g)
  (define x 'outer)
  (define x 'inner)
  x)

There is some no-binding-overlap? code in default-recommendations/private/let-binding.rkt that seems to be aiming to prevent this error, but it doesn't even catch this simplified version.

AlexKnauth commented 2 years ago

I have an idea to fix this let-to-define tricky case which I hope will fix both the simple define and obfuscated macro-generated define versions of g.

My first thought involves exposing visits-by-location to the let-binding rules. They would then look up the let-body in that table by location to get the body scope. Using 2 sets of scopes: outside the let and inside the let, it can construct 2 versions of each of the identifiers that would be bound by the let, and compare those 2 versions with free-identifier=?. If they are free-identifier=?, then it's safe to flatten the scope, but if any of them are not free-identifier=?, that means the inner version has a different binding and the scopes cannot be flattened.

AlexKnauth commented 2 years ago

Roadblock for my idea to fix the let-to-define tricky case: The 2 sets of scopes I would need are not "outside the let" and "inside the let". Instead I need "just barely inside the let but outside the body's internal-definition context" and "fully inside the body's internal-definition context". And I'm not sure how to get the one just barely inside the let.

For example when I run the macro stepper on this definition of g:

#lang racket/base
(define (g)
  (let ([x 'outside])
    (define x 'inside)
    x))

The set of scopes outside the let looks something like:

#(65 module)
#(68 module try-resyntax-2)
#(80 local)
#(81 intdef)

The set of scopes just barely inside the let looks like:

#(65 module)
#(68 module try-resyntax-2)
#(80 local)
#(81 intdef)
#(83 local)

And the set of scopes fully inside the internal definition context in the body of the let looks like:

#(65 module)
#(68 module try-resyntax-2)
#(80 local)
#(81 intdef)
#(83 local)
#(84 intdef)

If I use free-identifier=? on an x constructed with the just-barely-inside scopes and an x constructed with the fully-inside scopes, I'll get #true when there is no binding conflict, and #false when there is a binding conflict.

But comparing an x with the outside scopes with an x with the fully-inside scopes isn't useful... that will just always be #false whether there's a conflict or not.

Using visits-by-location allows me to get the fully-inside scopes, but I don't know how to get the just-barely-inside scopes.

Edit: looking for 'letX-renames in current-expand-observe might help... looking into it

AlexKnauth commented 2 years ago

I added my solution using letX-renames to my PR here: https://github.com/jackfirth/resyntax/pull/166 with these new tests both passing:

test: "let binding with conflicting define inside"
------------------------------
(define (g)
  (let ([x 'outer])
    (define x 'inner)
    x))
------------------------------

test: "let binding with obfuscated conflicting define inside"
------------------------------
(define (g)
  (let ([x 'outer])
    (define-syntax-rule (m a)
      (begin
        (define a 'inner)
        x))
    (m x)))
------------------------------