jackfirth / resyntax

A Racket refactoring engine
Apache License 2.0

Comment parser failures #149

Open Metaxal opened 3 years ago

Metaxal commented 3 years ago
(read-comment-locations
   (open-input-string
    "(list #;abc z #| c d |# \"#;efg\" ; ijk \n l m) "))
#<range-set:
 #<range: #<inclusive-bound: 7> #<exclusive-bound: 39> #<comparator:natural<=>>>>

But this should instead return three comment ranges (0-indexed): [6, 11), [14, 23), [34, 40)

Metaxal commented 3 years ago

Racket's read-syntax does parse 'special-comments' that appear during parsing, with their associated value, but unfortunately they are discarded when constructing the syntax objects :( Ideally, read-syntax should (optionally) parse comments (#|...|#, #;... and ;...\n) as syntax objects with source locations.

One issue is that this messes with arity (if comments are stand-alone syntax objects) or with the tree structure (if comments are consed onto the next syntax object).

jackfirth commented 3 years ago

There's a bug in how #| and |# are handled that's easily fixable, but I don't see a way to support #; without either changes to Racket's reader to support reading comments or a full reimplementation of Racket's lexer. I could, however, extend Resyntax to just avoid files containing #; entirely.

Metaxal commented 3 years ago

Is it possible to extend the read table to override the parsing of #;<s-exp> maybe?
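For reference, a minimal sketch of what such a read table override might look like. It assumes plain `read-syntax` on a string port with no `#lang` handling, and the names `read-with-sexp-comment-ranges` and `skip-datum` are hypothetical, not part of Resyntax. It uses `make-readtable` with a `'dispatch-macro` entry for `;`, records where each `#;<s-exp>` starts and ends, and hands the reader a special comment so the datum is still skipped.

```racket
#lang racket/base
;; Sketch only: the helper names here are made up for illustration.

;; Reads one datum from `source`. Returns the syntax object and a list of
;; (cons start end) pairs, 0-indexed with exclusive end, one per `#;<s-exp>`.
;; Nested `#;` comments are recorded too, inner ones first.
(define (read-with-sexp-comment-ranges source)
  (define ranges '())
  ;; Called by the reader right after it has consumed the two characters `#;`.
  (define (skip-datum ch in src line col pos)
    ;; Port positions are 1-based, so the `#` sits 3 below this in 0-indexed terms.
    (define-values (l1 c1 after-hash-semi) (port-next-location in))
    ;; Read and discard the commented-out datum. A recursive read can itself
    ;; return a special comment (e.g. for `#| ... |#`), so keep going until
    ;; something else comes back.
    (let loop ()
      (when (special-comment? (read-syntax/recursive src in))
        (loop)))
    (define-values (l2 c2 after-datum) (port-next-location in))
    (set! ranges
          (cons (cons (- after-hash-semi 3) (- after-datum 1)) ranges))
    ;; Tell the reader to treat the whole thing as a comment.
    (make-special-comment #f))
  (define rt
    (make-readtable (current-readtable) #\; 'dispatch-macro skip-datum))
  (define in (open-input-string source))
  (port-count-lines! in) ; count positions in characters
  (define stx
    (parameterize ([current-readtable rt])
      (read-syntax 'sketch in)))
  (values stx (reverse ranges)))
```

Calling `(read-with-sexp-comment-ranges "(list #;abc z)")` is one way to try it. Note the positions are offsets into the string's actual characters, so for a literal containing escapes like `\"` or `\n` they can differ from offsets counted against the escaped source text.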

jackfirth commented 3 years ago

Maybe, but I'm not sure, and it seems like it might be tricky. I'd rather just avoid the problem entirely, especially since any solution would be specific to #lang racket/base. It's pretty rare for checked-in files to actually contain #; comments anyway, they're more commonly used for local debugging.

Metaxal commented 3 years ago

> Maybe, but I'm not sure, and it seems like it might be tricky. I'd rather just avoid the problem entirely, especially since any solution would be specific to #lang racket/base.

(which is probably 90% of the files racketeers produce, though.)

> It's pretty rare for checked-in files to actually contain #; comments anyway, they're more commonly used for local debugging.

someone should have told me :grimacing:

But I'm OK with just filtering out files that contain #; (with a message), because it's simple, clean, and effective; it's just too restrictive.
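A rough sketch of what that filter could look like, assuming the existing `racket-lexer` from `syntax-color/racket-lexer` is acceptable for detection (it reports `#;` as a token of type `'sexp-comment`). The helper name `contains-sexp-comment?` is hypothetical and this is not how Resyntax does it. Going through the lexer rather than a substring search avoids false positives for `#;` inside strings or other comments.

```racket
#lang racket/base
;; Sketch only: `contains-sexp-comment?` is a made-up helper name.
(require syntax-color/racket-lexer)

;; Returns #t if the lexer reports a `#;` token anywhere in `source`.
(define (contains-sexp-comment? source)
  (define in (open-input-string source))
  (port-count-lines! in)
  (let loop ()
    ;; racket-lexer returns: lexeme, token type, paren kind, start, end.
    (define-values (lexeme type paren start end) (racket-lexer in))
    (cond
      [(or (eof-object? lexeme) (eq? type 'eof)) #f]
      [(eq? type 'sexp-comment) #t]
      [else (loop)])))
```

A caller could then skip the file and print the message, for example `(when (contains-sexp-comment? src) (eprintf "skipping ~a: contains #; comments\n" path))`, where `src` and `path` are whatever the caller has on hand.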