Open ericdallo opened 3 years ago
FWIW, there is some discussion of trying to work with "incomplete" forms starting here: https://clojurians-log.clojureverse.org/rewrite-clj/2019-11-12/1573599410.186700
The discussion ends on this day: https://clojurians-log.clojureverse.org/rewrite-clj/2019-11-14
The approach might be described roughly as "automatic closing of delimiters".
Where that ended up was this code: https://github.com/sogaiu/rewrite-cljs-playground/commit/7074cfc222aa65c93020bddeed83c0236e4f86b8
Thank you for the response @sogaiu, yeah, that'd be really useful. Does that branch already work? How could I test it?
@ericdallo It was a while back, but my current recollection is that it did work in limited testing.
There are only 5 lines to change in a single file, so perhaps it's possible to just make the changes manually on your end to try it?
Great! I'll try to change it and release it locally, thank you!
I managed to release a local rewrite-clj with that change, but I'm probably doing something wrong: I could not use binding to change the parse-anyway value, and even after declaring it with a true value and releasing again, the parser throws an exception anyway.
I don't know how readable the following will be, but here is a section of code I was using at one point that I think used the feature in question: https://gist.github.com/sogaiu/b6286feeaefc326a449af109f4a4b77d#file-script-clj-L217-L245
What you described ("use binding to change the parse-anyway value") seems to be basically what I was doing, so I'm not sure my code will be of much use.
I suppose it's also possible that the branch I was using: https://github.com/sogaiu/rewrite-clj/commits/parse-anyway has a few extra bits in it that might be affecting things, but nothing obvious stands out at the moment.
Not sure I am doing anything wrong here, both give me the same exception:
(binding [rpc/*parse-anyway* false]
  (z/of-string "(def a 123) asd/ (def b 345)"))
clojure.lang.ExceptionInfo: Invalid symbol: asd/.
{:type :reader-exception, :ex-kind :reader-error}
at clojure.tools.reader.impl.errors$throw_ex.invokeStatic (errors.clj:34)
clojure.tools.reader.impl.errors$throw_ex.doInvoke (errors.clj:24)
clojure.lang.RestFn.invoke (RestFn.java:442)
...
and
(binding [rpc/*parse-anyway* true]
  (z/of-string "(def a 123) asd/ (def b 345)"))
clojure.lang.ExceptionInfo: Invalid symbol: asd/.
{:type :reader-exception, :ex-kind :reader-error}
at clojure.tools.reader.impl.errors$throw_ex.invokeStatic (errors.clj:34)
clojure.tools.reader.impl.errors$throw_ex.doInvoke (errors.clj:24)
clojure.lang.RestFn.invoke (RestFn.java:442)
...
So, for completion, we don't actually need to parse. If we simply tokenized with something like indexing-push-back-reader, I believe we could avoid a lot of this.
@snoe Do you mean avoid using rewrite-clj at all and tokenize using a normal Clojure reader?
How could we do that @snoe ?
I have an example using edamame for self-repairing code if the code has insufficient closing delimiters:
@ericdallo Sorry, I think I didn't quite understand the original post.
The discussions from before are specifically about handling the case where delimiters are missing. The related code doesn't handle any other cases.
Very sorry to have taken other folks' time on this :(
@borkdude edamame doesn't seem to work here, since it only handles missing closing delimiters. I tested with something like (def a) foo/ (def b) and could not get any info that could help fix that :/
@snoe, the issue with using indexing-push-back-reader manually is that we don't have a single token; we have the whole document plus the line and position, so we would need to "walk" through the safely parsed text and check line and column. I'm not sure how to do that.
@sogaiu, no worries, this issue is indeed about making rewrite-clj parse invalid code. That'd be the best approach if available, IMO. The discussions above are just other ideas to fix the clojure-lsp completion issue, but if we had that in rewrite-clj, it'd be super useful.
@ericdallo You can continue parsing the stream after an exception. So you would just skip over foo/.
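A minimal sketch of "continue after an exception", assuming clojure.tools.reader is on the classpath (it is the reader that appears in the stack traces above). The function name read-all-skipping-errors is hypothetical, and this reads plain Clojure forms rather than rewrite-clj nodes, so it only illustrates the skipping idea:

```clojure
(require '[clojure.tools.reader :as r]
         '[clojure.tools.reader.reader-types :as rt])

(defn read-all-skipping-errors
  "Reads every form from text. When a token fails to read, records the
  error message and keeps reading from where the reader stopped."
  [text]
  (let [rdr (rt/indexing-push-back-reader text)]
    (loop [forms [] errors []]
      (let [result (try
                     ;; eof-error? false: return ::eof instead of throwing at end of input
                     {:form (r/read rdr false ::eof)}
                     (catch clojure.lang.ExceptionInfo e
                       {:error (.getMessage e)}))]
        (cond
          ;; the failing token was already consumed, so just move on
          (contains? result :error) (recur forms (conj errors (:error result)))
          (= ::eof (:form result))  {:forms forms :errors errors}
          :else                     (recur (conj forms (:form result)) errors))))))
```

Note that the skipped token is lost here, which is exactly the limitation for completion: the text of foo/ would still have to be recovered some other way.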
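I managed to make a workaround I'm not proud of:
;; Assumes these aliases: [clojure.string :as string], [rewrite-clj.zip :as z], [rewrite-clj.node :as n]
(defn ^:private safe-zloc-of-string [text]
  (try
    (z/of-string text)
    (catch clojure.lang.ExceptionInfo e
      ;; If the failure is an "Invalid symbol: foo/." error, capture the bad token.
      (if-let [[_ token] (->> e
                              Throwable->map
                              :cause
                              (re-matches #"Invalid symbol: (.*\/)."))]
        ;; Make the token parseable by appending "_", parse, then swap the
        ;; placeholder node back for a token node holding the original text.
        (-> (string/replace-first text token (str token "_"))
            z/of-string
            (z/edit->
              (z/find-next-value z/next (symbol (str token "_")))
              (z/replace (n/token-node (symbol token)))))
        (throw e)))))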
But I just realized that it's not enough to handle completion, since we depend on clj-kondo analysis, and when the code is not parseable, clj-kondo returns no analysis for that file 😔 @borkdude Is there any way to lint code and ignore non-parseable tokens?
Anyway, I think it's worth leaving this issue open for a better fix on the rewrite-clj side @sogaiu. Thanks all for the help.
@ericdallo Feel free to post a minimal repro on the clj-kondo issue tracker and I will see what can be done.
Done @borkdude https://github.com/clj-kondo/clj-kondo/issues/1146 Also, I created an issue on clojure-lsp side to track the whole fix: https://github.com/clojure-lsp/clojure-lsp/issues/270
@sogaiu is there any way to fix that on rewrite-clj? This workaround is not totally reliable :/
@ericdallo Sorry, I haven't yet thought of anything.
Maybe it's worth consulting @lread about this?
What I did in clj-kondo: I made a tweak to the parser where it parses a token. In case of an exception, it logs this exception to an atom in a dynamic var (this was done so I didn't have to change a lot of code) and just continues parsing as if the token wasn't there at all.
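A rough sketch of that pattern, not clj-kondo's actual code: all names here (*parse-errors*, parse-token-safely, parse-tokens) are hypothetical, and parse-fn stands in for whatever function parses a single token.

```clojure
(def ^:dynamic *parse-errors*
  "Hypothetical dynamic var; bind it to an atom to collect token errors."
  nil)

(defn parse-token-safely
  "Calls parse-fn on token. On failure, records the problem in *parse-errors*
  (when it is bound to an atom) and returns ::skipped instead of throwing."
  [parse-fn token]
  (try
    (parse-fn token)
    (catch Exception e
      (when *parse-errors*
        (swap! *parse-errors* conj {:token token :message (.getMessage e)}))
      ::skipped)))

(defn parse-tokens
  "Parses all tokens, dropping the ones that failed as if they were absent."
  [parse-fn tokens]
  (binding [*parse-errors* (atom [])]
    ;; `into` with a transducer is eager, so all work happens inside `binding`
    (let [parsed (into []
                       (comp (map #(parse-token-safely parse-fn %))
                             (remove #{::skipped}))
                       tokens)]
      {:parsed parsed :errors @*parse-errors*})))
```

For example, (parse-tokens #(Long/parseLong %) ["1" "x" "3"]) keeps 1 and 3 and records one error for "x". The dynamic var means callers deeper in the parser do not need an extra argument threaded through.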
Yeah, that's similar to the workaround I did, but in this case we need the token for completion, etc. Since it is possible to have a node of unparseable code (like I did in my workaround), IMO rewrite-clj should handle that somehow and allow parsing it.
Heya @ericdallo! I am trying to stay focused on rewrite-clj v1 priorities at this time.
I am getting an idea for what you'd like, but details always help. Add anything you can think of.
I'll lurk/follow this discussion for now.
Thanks @lread, I'm quite excited about that v1 :)
I think the issue description and my workaround are detailed, LMK if you need any extra info.
Reading the original post, it seems plausible to me that an alternative solution could be conceived, that would be just as useful to clojure-lsp, and more convenient to rewrite-clj (because nothing would have to be changed!)
For completing a/ in file.clj:
- file.clj in a saved state, from disk
- a/ from the buffer (in emacs lingo) without rewrite-clj
This has the additional advantage of being resilient to multiple, unsaved errors.
For instance for the saved file file.clj:
(ns file)
and the non-saved current buffer for it:
(ns foo)
]] ;; an error
a/ ;; the completion
]];; another error!
...a/ completion will still work, even in the presence of the other faulty bits.
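As a sketch of getting a/ from the buffer without rewrite-clj, the token before the cursor can be pulled out with plain string operations. The function name token-before-cursor is hypothetical, and the 1-based row/col convention is an assumption about how the editor reports the cursor:

```clojure
(require '[clojure.string :as string])

(defn token-before-cursor
  "Returns the run of non-whitespace, non-delimiter characters that ends at
  the cursor (1-based row and col), or nil when there is no such token."
  [text row col]
  (when-let [line (nth (string/split-lines text) (dec row) nil)]
    (let [end    (-> (dec col) (max 0) (min (count line)))
          prefix (subs line 0 end)]
      ;; match the longest trailing run of characters that can be part of a token
      (not-empty (re-find #"[^\s()\[\]{}\",;]*$" prefix)))))
```

With the unsaved buffer from the example above, (token-before-cursor buffer 3 3) gives "a/" regardless of the stray ]] lines around it, because nothing is parsed.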
So, if I understand, we have different technical challenges that might be treated separately and perhaps incrementally:
foo/
:keyword:
"foo\xbar" - invalid escaped char in string (already parses today and throws on sexpr)
#::[a] - namespaced map without namespace
^private - meta without target
{:a 1 :b} - unbalanced maps (already parses today, and throws on sexpr)
{:a 1 :a 2} - map with duplicate keys (already parses today, does NOT throw on sexpr)
#{1 1} - set with duplicate value (already parses today, does NOT throw on sexpr)
(hey jude dont
(make [it bad)
{:take [a #{sad (song]
I've been looking at parsing code for other work. I am tempted to move 3 (handling unclosed sequences) to a separate issue.
I think we might be able to handle 1 and 2 by loosening any parsing exceptions rewrite-clj imposes and making sexpr lazy where it is not.
The throw that happens on parse would instead occur on sexpr.
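To illustrate the "throw on sexpr instead of on parse" idea, here is a minimal sketch. It is not rewrite-clj's node implementation: the protocol and record names are hypothetical, and clojure.core/read-string stands in for whatever conversion a real node would use.

```clojure
(defprotocol Node
  (node-string [this] "The original source text of the node.")
  (node-sexpr [this] "The Clojure value of the node; may throw for invalid source."))

;; Holds the raw text only. Nothing is validated when the node is created,
;; so invalid tokens such as foo/ can still live in the tree.
(defrecord RawTokenNode [source]
  Node
  (node-string [_] source)
  ;; Conversion happens on demand, so this is where invalid text throws.
  (node-sexpr [_] (read-string source)))
```

Here (->RawTokenNode "foo/") succeeds and node-string returns "foo/", while node-sexpr on that node throws. Tools that only need the text and position, such as completion, would never trigger the error.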
I'll experiment sometime soon to see if this strategy is actually practical and then come back to work out any details like:
- throwing on sexpr of #{1 1} and {:a 1 :a 2} (we don't today)
Somewhat related issue in edamame: https://github.com/borkdude/edamame/issues/106
Thanks @borkdude!
And it is so nice to see @sogaiu, even if it is just :eyes: on a comment!
Yeah, it sounds weird, let me give some context:
On clojure-lsp, we have a code completion feature, where the user types something and we try to complete the code. For example, if the user writes foo/b, we suggest completing with foo/bar if that function exists.
For that to work, we parse the whole file (following the LSP spec) and search for a node at the given row/col. The issue is that we can't parse the code if it's wrong (not valid Clojure code), so if the user tries to complete code like the following, it will not work, since
(z/of-string "(def abc 123)\nfoo/")
will throw a reader exception because foo/ is not valid Clojure code.
If it were somehow possible to parse that, we could find that node and get a token node of foo/, then extract the alias and know what functions are available :)
LMK if you need more info on anything
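The last step (extract the alias, then look up what is available) could look roughly like this once the token text is known. This is only a sketch: completions is a hypothetical name, and alias->names is a hypothetical map from alias to exposed names, standing in for the clj-kondo analysis that clojure-lsp actually uses.

```clojure
(require '[clojure.string :as string])

(defn completions
  "Given a token such as \"foo/\" or \"foo/b\", returns the qualified names
  under that alias whose name starts with the typed prefix."
  [alias->names token]
  (when (string/includes? token "/")
    ;; limit 2 keeps the (possibly empty) part after the first slash
    (let [[alias prefix] (string/split token #"/" 2)]
      (->> (get alias->names alias)
           (filter #(string/starts-with? % prefix))
           (mapv #(str alias "/" %))))))
```

For instance, with {"foo" ["bar" "baz" "qux"]}, the token "foo/b" yields foo/bar and foo/baz, and the bare "foo/" yields all three.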