benedekfazekas / mranderson

Dependency inlining and shadowing
Eclipse Public License 1.0
151 stars 14 forks source link

Mr Anderson does not correctly rewrite defrecord references with dashes #73

Open r0man opened 2 years ago

r0man commented 2 years ago

When rewriting dependencies with Mr Anderson and the rewrite prefix namespace contains dashes, defrecord instances are not referenced correctly.

Cider uses the "cider.nrepl.inlined-deps" prefix for it's rewritten dependencies.

Mr Anderson rewrites a record reference such as:

instaparse.gll.Failure

to this:

cider.nrepl.inlined-deps.instaparse.gll.Failure

This is however not the right way to refer to a fully qualified record. The correct way would be to replace the dash in the name with an underscore (but only when referencing classes/records, not functions) :

cider.nrepl.inlined-deps.instaparse.gll.Failure

I worked around this for now by using the "cider.nrepl.inlined.deps" namespace in Cider which avoid this issue.

benedekfazekas commented 2 years ago

I am guessing the problem here is that mranderson does not do the - to _ replace in the prefix only in the actual namespace. should be relatively easy to fix

benedekfazekas commented 2 years ago

happy to look at the PR for this too or at least if you could put together a test case demonstrating it, thx

r0man commented 2 years ago

@benedekfazekas This would be a failing test case:

https://github.com/r0man/mranderson/commit/4a37fcb0e5818d3cdc57ff5d90d64efc965d925e

Unfortunately, not too minimal, since it's also involving Instaparse.

benedekfazekas commented 2 years ago

hm.. this is trickier than I thought originally. mranderson does replacements with java pkg style prefix (with _) in the ns body but in this case nothing triggers using the underscored version, see here: https://github.com/benedekfazekas/mranderson/blob/master/src/mranderson/move.clj#L184-L218

since we just do a simple string based replacement in the ns body (for performance reasons) it is also tricky to check that the word before the sym is instance? but that is what we need to do I guess...

r0man commented 2 years ago

Hi @benedekfazekas,

when I looked at this the last time I also was reading this section. And I was wondering if we could detect the difference between a fully qualified function call and a class usage if we would use the regex from the LispReader.java. There's a comment here:

https://github.com/benedekfazekas/mranderson/blob/master/src/mranderson/move.clj#L211

saying it's too broad, but it would tell us the difference between:

my-ns/fn-call

my-ns.MyClass

in the replace function I believe. I didn't try that yet though.

benedekfazekas commented 2 years ago

after looking at the instaparse.abnf namespace I thought the problematic line (after wrong mranderson replace) looked like this: (instance? mranderson-test27717.instaparse.v1v4v12.instaparse.gll.Failure tree) assuming the original looked like this (instance? instaparse.gll.Failure tree) the only hint here is the instance? fn call before, what am i missing?

r0man commented 2 years ago

Yes, and if we would see mranderson-test27717.instaparse.v1v4v12.instaparse.gll.Failure in the source-replacement function, we could detect that this is a class reference (all namespace parts are separated by a dot, where as a function call would have a slash before the last namespace). If we detect it is a class we replace all dashes in mranderson-test27717.instaparse.v1v4v12.instaparse.gll.Failure with underscores. For function calls we do nothing. But maybe I am missing something. I'm not too sure if I completely understood source-replacement and all the edge cases. But that was at least my initial idea.

benedekfazekas commented 2 years ago

oh, got you now I think... maybe it is just the order of preds in the cond... the big question of course if changing the order breaks other things. will have a play with this

benedekfazekas commented 2 years ago

ok so this is instaparse.gll.Failure vs (require '[orchard.java.parser :as src]) (from https://github.com/clojure-emacs/orchard/blob/master/src/orchard/java.clj#L99) both with the symbol with a dot at the end. you need the context (require vs instance?) to be able to decide if you need java style package with _ or clojure style ns with -

r0man commented 2 years ago

Yeah, ok. Btw. what about rewriting via tools.analyzer or something more clever in mr anderson. I guess you have experience with this, because of clojure-refactor. Would that be too slow in practice? Any ideas? I guess it would be quite an effort but way more reliable. Wdyt?

benedekfazekas commented 2 years ago

the defacto analyzer now is clj-kondo which is rewrite-clj based (just like mranderson btw). I could use rewrite-clj or kondo to process the ns body but not sure how performant would that be... the idea between splitting processing the ns macro and the ns body is that ns body processing should be relatively simple -- this assumption deffo shows cracks here

lread commented 2 years ago

Sounds like we might be talking about some explorations I was wondering about too.

Please let me know if anybody is diving into this as it might affect stuff I'll be looking into sometime soon.

lread commented 2 years ago

I've read the above but I want to make sure I understand.

Is the crux of this issue that MrAnderson does not handle a requires which appear outside of the ns macro? As in: (require '[orchard.java.parser :as src])?

Or is there also another separate issue?

r0man commented 2 years ago

@lread The issue is that if you choose a rewrite prefix with a dash in its name (Cider uses cider.nrepl.inlined-deps for example) Mr Anderson uses this as a prefix to replace fully qualified symbols.

For symbols that reference a function for example this works fine, but it does not work for records. Record references need to be made in Java style, where dashes are replaced with underscores.

In the case of Instaparse Mr Anderson produced this record reference:

cider.nrepl.inlined-deps.instaparse.gll.Failure

But this is wrong. It should be:

cider.nrepl.inlined_deps.instaparse.gll.Failure

Since this is a Java class under the hood.

lread commented 2 years ago

Ah, thanks @r0man, I saw some conversation around require and thought it might be the crux.

But it is simply what you originally reported, it seems!

benedekfazekas commented 2 years ago

well, it is just a tiny bit more tricky than that ;)

the one thing to keep in mind that mranderson analyzes the ns macro (with rewrite clj) but does simple regexp based replace in the ns body.

if the prefix has a dash in it mranderson has no way to decide if it needs to use dashed or underscored version of it in the body IF the symbol itself does not have dash or underscore in it. it would need to look at the surroundings (ie. understand the context) to be able to do it.

lread commented 2 years ago

Thanks for the clarification @benedekfazekas, that helps too.