running tns/refresh interferes with AST generation

benedekfazekas commented 8 years ago

the reloaded workflow’s refresh can interfere with AST generation in refactor-nrepl as if you run refresh or reset it can break AST generation as refresh actively unloads some namespaces. That causes some namespaces not found. Might be related to #133

expez commented 8 years ago

This is definitely not related to #133. In that issue the user didn't have an ns form (the entire file was commented out) so that error was to be expected.

It's starting to become clear that the AST building we're doing now isn't OK.

Here are some solutions:

Investigate an isolated bootloader again.
Write our own symbol indexer

I think we could do 2. quite easily if all we want to do is find stuff. I'm already doing this to find macros. The problem with 2. is that we'd still have to build ASTs for some stuff, like finding used and unused local vars.

We should probably create another issue, if we decide to work on getting rid of tools.analyzer.

benedekfazekas commented 8 years ago

re. ns form: point is I can reproduce the no ns error during AST building if I run tns/refresh during AST cache warming.

more importantly we abs. agree that current AST building needs to be changed. some other options:

@Malabarba 's idea (in my own understanding) to only build AST for those nses which are loaded either by cider or by the reloaded workflow or in any other way in the given environment where refactor-nrepl is used. I don't think that would need changes/tweaks in tools.analyzer we just need to use the API which does not eval the code, if it is already loaded we can assume (I think) that we will get the same result as with evaling API.
only eval first level defs, defmethods, protocols, macros etc in other words we make sure that we don't eval first level invokes

re. your point 1. not sure that is a real solution as we would still launch missiles if (launch missiles) is a first level form even if in a separate classloader re. your point 2. that is deffo an option, perhaps the cleanest of all, that would be the cursive way of solving this really. it may give other kind of headaches tho (what about protocols, deftypes, multimethods an other tricky macros creating defs on the fly)

would be really interesting to hear @Bronsa's opinion on all this.

benedekfazekas commented 8 years ago

Patches to demonstrate option 4: strict-eval.zip

Bronsa commented 8 years ago

I understand and share the frustration with having to re-eval everything in order to get accurate ASTs. I myself have to turn-off clj-refactor when editing tools.analyzer because of http://dev.clojure.org/jira/browse/CLJ-1870.

I agree with @benedekfazekas that using an isolated environment for evaluation wouldn't really solve much. @expez's second proposal is what cursive does AFAIU, it has its own problems, namely you can't support arbitrary macros that expand into let bindings (@cursive-ide might want to expand on this)

I like much better the two proposals by @benedekfazekas, and I think that his first proposal is probably the better solution (If i remember correctly, this is what SLIME does aswell -- some functionalities are available only if the code has been loaded)

I'm not sure the second proposed solution by @benedekfazekas would work in all cases, runtime operations on the namespace system (intern, ns-unmap, alias ..) would still need to be evaluated in order to have a valid AST, and it might be really hard to figure out which top-level forms should be evaluated and which one shouldn't.

Bronsa commented 8 years ago

@benedekfazekas I'm not opposed to take the proposed patch for tools.analyzer.jvm, however I'd much rather have needs-eval? to be a function taken from opts and with no default implementation, taking the AST rather then the form. The form can still be accessed by either :form or :raw-forms depending on whether you need to access the macroexpanded or original form

benedekfazekas commented 8 years ago

thanks @Bronsa for your thoughts. happy to make those changes on needs-eval? if the refactor and the cider team is happy to go ahead with this solution.

do you think the eval-white-list is enough extension point to cover most of the use cases? Can you think of any specific details of

it might be really hard to figure out which top-level forms should be evaluated and which one shouldn't.

Bronsa commented 8 years ago

There are a number of libraries that do stuff like:

(defn foo [x y] (intern *ns* y (resolve x y)))
(foo a b)

I.e. interning vars at runtime rather than compile time using def* forms.

benedekfazekas commented 8 years ago

wonder why to do this...

this might be solvable with a well maintained white list tho

cursive-ide commented 8 years ago

@Bronsa is correct that Cursive doesn't eval at all - this means that I never have any problems like the ones described in this issue, and I never have any problems with nses that side-effect during load. It also means that everything works without a REPL running. However it means that I have problems with macros that define symbols, and it means that I've had to re-implement a lot of the language semantics in Cursive (which IntelliJ does as well for e.g. Java).

I still think it's a net win though. I have an extension API which I currently use to add support for forms, and which I'll open up when I finally dedicate enough time to it. Doing this allows me to add sophisticated support for source-level forms (i.e. implementing stubs for proxy/reify etc, type calculation for type inference etc). Importantly, all this works at the source level, not with the expanded code, so there are no issues trying to map a huge blob of expanded code back to the original source which is what the user wants to work with. This means that I can easily support rename for records which also renames the ->Constructor forms (a generated symbol which appears nowhere in the source, and which cannot be renamed independently of the original symbol).

Racket does actually expand everything in a sandbox, but they have a very deep integration with their macroexpander which allows them to handle cases like this relatively cleanly. It's also hookable, so that editing modes for e.g. typed Racket can add information to be shown in the editor. This might be possible using tools.analyzer's macro expander, I'm not sure - it's definitely not with Clojure's.

Malabarba commented 8 years ago

Thanks for the write-up. :-)

Importantly, all this works at the source level, not with the expanded code, so there are no issues trying to map a huge blob of expanded code back to the original source which is what the user wants to work with.

FWIW, the cider debugger maps macroexpanded code back to original source, so at least refactor-nrepl might be able to tackle this part (if it ever comes to that).

cursive-ide commented 8 years ago

Interesting, I'd like to see how that works - do you have a link to the source?

Malabarba commented 8 years ago

@cursive-ide It's in instrument.clj in cider-nrepl. IIRC, it's in the last section of the file.

benedekfazekas commented 8 years ago

also on the tools.anakyzer AST there is a :raw-forms key containing a sequence of the original forms if the code was macroexpanded but the AST is built for the expanded code

vemv commented 3 years ago

Revisiting the OP:

the reloaded workflow’s refresh can interfere with AST generation in refactor-nrepl as if you run refresh or reset it can break AST generation as refresh actively unloads some namespaces. That causes some namespaces not found.

refresh doesn't unload random namespaces though: it unloads namespaces it plans to load immediately thereafter.

Of course, this unload->load process can fail at any point of its execution. In that case the end user should fix that (typically due to a syntax error) and try again.

e.g. cider-nrepl won't work properly after a failed refresh either; that's perfectly natural for the approach that we follow (namely: no static analysis). So refactor-nrepl doesn't necessarily have to work after a failed refresh.

tldr, I wonder if there's a specific problem to be solved. "ASTs caches can contain references to unloaded namespaces" would sound like a problem in refactor-nrepl's approach to caching, and not in t.n.

If not, perhaps we could close this issue - obviously the discussion is a great reference to have but it's very old by now.

vemv commented 3 years ago

btw starting with https://github.com/clojure-emacs/refactor-nrepl/pull/298, we got a slightly better integration with t.n so a few kinds of gotchas disappeared:

analyzing files not meant to be loaded (e.g. example .clj files)
changing the t.n refresh-dirs over time

It might also have decreased the chance for AST cache issues but I cannot assure that.

expez commented 3 years ago

"ASTs caches can contain references to unloaded namespaces" would sound like a problem in refactor-nrepl's approach to caching, and not in t.n.

What does it mean for an ns to be 'unloaded'? AST generation will work even if the ns isn't loaded in the repl (t.a. will will just load it), right? If the ns form contains a reference to a deleted namespace, then the code is invalid (not our fault). If you edit the ns and remove the reference to the deleted ns then the hash of the file content will change and thus invalidate the cache.

FWIW, I think we can close this too until we get a specific bug report.

vemv commented 3 years ago

Unloading a ns removes the ns object (find-ns will return nil for it) and all its vars:

https://github.com/clojure/tools.namespace/blob/e9a40e7db15919bb91cb20f7ad33a76d54a68946/src/main/clojure/clojure/tools/namespace/reload.clj#L15-L19

If the ns form contains a reference to a deleted namespace, then the code is invalid (not our fault). If you edit the ns and remove the reference to the deleted ns then the hash of the file content will change and thus invalidate the cache.

👍!

clojure-emacs / refactor-nrepl

running tns/refresh interferes with AST generation #134