meta-data to identify macros that introduce new symbols

hlship commented 10 years ago

In reference to letfn, etc., would it be possible to introduce a meta-data value that would identify which parameter, in a macro, was to be introduced as a symbol or binding form?

(defmacro let-like [^:binding-form bindings & body] .... )

When Cursive is analyzing this code:

  (let-like [foo bar
              baz biff] (x foo baz))

It could use the macro definition to identify that foo and baz are symbols that can be referenced int he body of the form.

Possibly, we'd need an additional meta-data value for the body itself.

It's a thorny problem.

sfnelson commented 10 years ago

As @cursiveclojure has pointed out elsewhere, this would have wider impact than simply IntelliJ users -- it would require library authors to follow the convention (or accept patches that do), and if widely deployed, it should be adopted by emacs' clojure mode too. It should probably be a clojure community decision whether to introduce this notation, then added to the standard libraries. Perhaps, as users of the superior Clojure IDE we could come up with a proposal and start a discussion on the clojure mailing list?

Personally, I'm in favour of implementing this approach in cursive now, and letting others play catchup later :-)

cursive-ide commented 10 years ago

Yeah, this is the million dollar question. Personally I'm in favor of a non-intrusive solution rather than metadata because I'd like to be able to use third-party libraries now rather than whenever they get around to adding the metadata (or accepting my patch, or whatever). For reference, here's more or less what I use internally now.

Support for a particular library publishes functions which calculate the available symbols for a particular form, and specifies their scope. There is some additional complexity in that I have to calculate public symbols separately from local bindings, because the public ones are indexed and the local ones are not. I can't combine them because frequently local symbols require indices to calculate, particularly in the case of more complicated forms like the protocol family or proxy. Indices cannot be used when calculating other indices (that's the part at the start of IntelliJ startup when you can't do much), so public symbol calculation is necessarily more restrictive. Publishing support looks like this:

(resolve/register-locals :clojure.core/ns ns-symbols)

ns-symbols is just a function, so they can be reused or abstracted:

(resolve/register-globals :def (global-name-symbol true false false))
(resolve/register-globals :clojure.core/defn- (global-name-symbol true true true))

The parameters to global-name-symbol are with-docstring? with-params? always-private?. My hope is that a library of these functions would be built up over time such that adding support for a new library is relatively trivial.

These functions return a map, which looks like this:

{<scope> {<sym> <target>
          <sym> <target>
          ...
 <scope> {<more syms>...}}

Here scope can be either something like :public, :private, :ns or it can be a PsiElement (basically an element in the IntelliJ AST) in which case the scope is limited to that element. It can even be a map, supporting things like {:scope <element> :after <element>} which means the scope is limited to the first element but only after the second element. This is used, for example, for function parameters and let bindings. sym can be either a symbol name or an actual symbol element, in which case only that exact symbol resolves - otherwise any symbol with the correct name resolves.

Internally I'm using something more complicated than that, but that's the basic idea. I'm still working out a lot of the details, which is why I haven't published the API yet - I do plan to.

I don't think simple declarative metadata is expressive enough to cover all cases, based on my experience writing these functions - macros can and do do some crazy things. Destructuring is just a macro, for example. We could hard-code some meta-data value for destructuring but then what about the destructuring in Compojure, which AFAIK is subtly different? As it is there are some reference types I'm struggling to support elegantly (e.g. type/record constructor functions, which don't resolve to anything in the source code). Storm, to name one example, gets a class and creates symbols for all the public fields - those symbols aren't explicitly represented in the code either. I'm also having a lot of difficulty with forward declarations and redefinitions (see #108) since a reference in IntelliJ should resolve to one thing but here it sort of resolves to several. I could go on and on with edge cases.

I realise there's a good argument for solving the 95% case, but I'd like to make sure that there's at least a path to the 100% case, and I think this sort of support is the only viable solution for it.

cursive-ide commented 10 years ago

BTW comments on the above more than welcome. This is also related to the discussion on #18.

fbellomi commented 10 years ago

I really like @cursiveclojure proposal. I agree that the most general case is a map from the macro symbol to a function that resolves the reference for the specific form. I would suggest to implement the API and then publicly open a discussion to iron out the small details and the corner cases.

tslocke commented 10 years ago

Would it be possible to macroexpand the source file and then run the analyser on the expanded version? Some post-processing could then be used to map the symbol definition information back to the correct location in the real source file.

Given that macros can be arbitrarily complex, something that relies on Clojure's macroexpand would seem to be the only viable 100% solution.

p.s. in the mean time, is it possible to disable IntelliJ's highlighting of symbols that cannot be resolved?

cursive-ide commented 10 years ago

Yes, you can disable this in Settings→Clojure→Highlight unresolved symbols.

Automatically expanding macros in the editor isn't really feasible for a bunch of different reasons. I think the only really feasible solution is the extension API described above - that should hopefully be coming soon.

tslocke commented 10 years ago

Could you expand (ha ha!) a bit on why macro expansion is not feasible?

fbellomi commented 10 years ago

The problem is exactly the fact that, as you say, macros are arbitrary code. That is, when you run a macro

1) it might never terminate (and of course you cannot decide beforehand if it's terminating -- that's the Halting problem)

2) it could use a large amount of resources (memory, time)

3) it could format your HD

You can implement a sandbox, but it's a complex matter. There is clojail, and the JVM sandbox, but essentially they are based on a blacklisting model: you specify what behaviours you wand excluded, which is long and error prone.

CrossClj (http://crossclj.info) uses macoexpansion+analysis+evaluation (you also need actual evaluation, because the successful analysis of the n-th form in a file might need a var defined by the evaluation of the (n-1)th form). I've seen both cases 1) and 2) above, on libraries published on clojars. I've never witnessed case 3) but I run the whole analysis process in a separate VM instance -- just in case.

I think Cursive is currently doing an amazing job in being so accurate without using actual evaluation.

Francesco

On Sun, Jun 22, 2014 at 3:49 PM, Tom Locke notifications@github.com wrote:

Could you expand (ha ha!) a bit on why macro expansion is not feasible?

On Sun, Jun 22, 2014 at 2:45 PM, cursiveclojure notifications@github.com

wrote:

Yes, you can disable this in Settings→Clojure→Highlight unresolved symbols. Automatically expanding macros in the editor isn't really feasible for a bunch of different reasons. I think the only really feasible solution is

the extension API described above - that should hopefully be coming soon.

Reply to this email directly or view it on GitHub:

https://github.com/cursiveclojure/cursive/issues/147#issuecomment-46780219

— Reply to this email directly or view it on GitHub https://github.com/cursiveclojure/cursive/issues/147#issuecomment-46781561 .

Francesco Bellomi

cursive-ide commented 10 years ago

There are also other interesting cases - for example, the Storm macros previously mentioned, which create symbols from the fields of a Java class to represent configuration. This means that you need that class compiled and loaded in order for the macro to work.

tslocke commented 10 years ago

The beauty of a LISP is that the core language is very small, and the rest is defined in terms of itself. This strength should apply to tooling as well as the language -- we should have better tools, because they only need to support the small core.

With macro expansion, it should be possible to resolve symbol definitions with zero extra configuration. This would be really nice, and is what we should expect from a LISP environment. I don't think any of the problems raised so far are insurmountable.

The concerns raised by @fbellomi could be addressed by using an nREPL connection into a VM for the expansion. On the other hand, anyone not using a VM for their dev code in is taking the same risk already; there is no additional concern.

Of course, the suggested solution could be a very good compromise and I certainly look forward to a future version with this capability : )

fbellomi commented 10 years ago

Another option is while-listing a (small) set of known macro that are known to be both safe and well-behaved in terms of resources, and giving the user the ability to white-list other macros under their own responsibility (maybe using a quick shortcut, like class import). I think covering the top 15 most used libraries would account for 90% of users' code bases.

IMHO, this is viable only in Cursive can do is magic by relying only on macroexpansion and intellij-optimized syntactic analysis of the result; using the full process in the clojure toolchain (macroex+analysis+eval) in the IDE's background would be quite heavy.

lucascs commented 10 years ago

Since we can already define indent options for a macro, isn't it ok to add an intention for the user define which macros define symbols, as pointed out in #242?

Perhaps something like: "Behaves like a def", "Behaves like a defn", "Behaves like a let" I use a bunch of macros from libraries and I've written a couple of macros myself that would fall into one of these categories.

wilkerlucio commented 10 years ago

I was thinking the same thing as @lucascs.

@cursiveclojure what you think about that idea? Being able to "mark" behavior, in same way we do for indentation right now.

cursive-ide commented 10 years ago

Yeah, I go back and forth on this, but I'm thinking about adding something like this as a workaround until the extension API is done. There are some issues, principally that after adding new forms Cursive has to reindex everything so it won't be immediate but I know this is a real pain point for people.

wilkerlucio commented 10 years ago

yeah, the major problem is when you have a lot of definitions being created with macros, def* like functions... because once you use then, a big chunk of you project get's out of the index and all those issue alerts pops up... in a project here I'm actually using a gigantic (declare) statement in order to have the symbols available for Cursive, not ideal, but it's doing until we have something better, hehe.

danielcompton commented 9 years ago

Would it be possible in the interim to introduce a relaxed inspection mode which doesn't highlight unresolved symbols when they are inside a macro expression (excluding defn and other clojure.core fns)? The use case I'm thinking of is when I want to run an inspect to check if there are any unresolved symbols, but symbols inside macros make too much noise to pick out the one or two actual problems.

atroche commented 9 years ago

Any update on what the workaround's going to be?

jugimaster commented 9 years ago

Can you introduce some sort of "stop-gap solution"? People have had this problem ever since Cursive came out.

a613 commented 9 years ago

Does emacs resolve symbols defined in macros? If so, Cursive should absolutely be able to. If not, then let's give up early.

cursive-ide commented 9 years ago

@trimtab613 Emacs doesn't do anything like this, no, but that's no reason to give up.

aboy021 commented 8 years ago

Is the solution to this related to the talk "Improving Clojure's Error Messages with Grammars - Colin Fleming - YouTube"?

If so, perhaps there could mechanism for registering macro specific grammars with cursive?

You could have a couple of base cases that support things like the aforementioned "behaves like defn" and "behaves like let".

Ideally this would be stored in some sort of standard, stand alone, format so that if library contributors wanted to add the functionality it would be simple and non-invasive to do so. Potentially other tools could make use of the grammars then too.

guv commented 8 years ago

Is there any (planned) work on a solution, yet? A quickfix where the Cursive user can specify the macro semantics would be a good start (even if it had to be manually in a separate configurable file per Cursive installation) . Currently, there is a lot of noise from "cannot be resolved" messages in projects using custom "def*" macros and other very useful features of Cursive don't work with functions defined via custom macros (documentation view, auto-completion, ...).

a613 commented 8 years ago

My company is pretty much split between Emacs and Cursive, so inline indicators probably wouldn't go over well in the Emacs crowd. I put my vote behind a global or per-project configuration file that can be generated bit by bit by a command in the context menu. Something like "Manually Define Symbol", which would display a dialog asking for details then write the info to that config file.

guv commented 8 years ago

Just as a clarification: I vote for a macro semantics definition outside the actual code as well. Ideally, a per library configuration file (maybe distributed with the library). A global configuration (per Cursive install) as a quickfix.

jzwolak commented 8 years ago

fbellomi, if I'm including a library in my project and expanding the macro while editing formats my hard drive then that just moves the problem from when I run my code to when I edit my code. I trust the libraries I import in to my project, so there is no need to sandbox them while macro expanding to discover defined symbols. I'm going to run the code anyway, might as well run it while editing and get the benefit of symbol resolution.

The halting problem is a real one, but most IDEs already have functions to address things like this during analysis and other background tasks. NetBeans for instance automatically detects slowness in running tasks and reports this to the user. Such an approach could be used to give this symbol resolution feature x amount of time to complete and if it doesn't complete then a warning is issued to the user and the analysis is either cancelled or the user is given the option to cancel/continue. The x amount of time can be per macro expansion.

I really like the idea of solving this problem without adding an API and meta data.

With meta data the burden is passed to humans... let's not burden ourselves to adding meta data for eternity (every new macro that defines symbols will need someone to type the metadata) if we can write a solution once and the computer can work on it for us.

Yes, it is a challenge to write an automated solution in Cursive. That's exactly the kind of challenge I believe the Clojure community is up for and exactly what sets Clojure apart: good solutions to tough problems instead of lazy solutions that are almost good.

cursive-ide commented 8 years ago

@guv Yes, I'm planning to try to implement this for the next major release, hopefully within a month.

@jzwolak It's not a lazy solution, it's a pragmatic one. Even expanding macros isn't sufficient in some cases, for example Storm creates a bunch of symbols from the public members of a class. This means that in order to do that, the analysis process requires the whole project to be compiled and the classpath set correctly - it's far from trivial and would lead to terrible performance. Creating inverted indexes is hard from runtime inspection. The API is not just for symbol resolution but will be required to provide information for other aspects of Cursive's functionality (type inference, method implementation creation etc) which cannot be determined by macro expansion alone.

gnl commented 8 years ago

@cursive-ide Is the implementation you're planning going to work in ClojureScript as well?

cursive-ide commented 8 years ago

Yes, definitely.

cursive-ide commented 8 years ago

@aboy021 Sorry, I missed this message earlier. Yes, this is related to my talk, and the current internal solutions use grammars as shown there.

guv commented 8 years ago

@cursive-ide That is great news! :)

jzwolak commented 8 years ago

@cursive-ide, thanks for your thoughts on this. I hope I wasn't sounding too critical by using the words "lazy solution". I do also see the need for pragmatic solutions.

The whole project only needs to be compiled once in the example with Storm that you mention. IntelliJ takes about that amount of time building indices either at startup or when something significant changes in the project. Once this initial compilation is performed and all symbols are resolved only the changed file and files depending on it need recompiling. For most projects and most cases this will only be a handful of files. If the developer is working on Storm itself and modifying those files with macros that scan classes for methods to make Clojure functions from then presumably in this case the developer would have the option of turning off this feature for this project... or use this API we're talking about. I was reading somewhere about the recommendation for restraint in using macros... perhaps it was in the "Joy of Clojure". There are developers of libraries and developers of applications, my intuition is that macros will and should appear less in applications and are more useful in libraries. I understand the performance issues and I get this is a challenging feature to automatically scan for these dependencies... my current opinion is leaning towards a hybrid after reading your message and thinking about it. The API would be very useful in the Storm case, but the automatic analysis is not impossible and would be very useful in the majority of cases where macro expansion and recompiling a couple files is involved.

cursive-ide commented 8 years ago

I've made the first steps towards macro support in the latest Cursive release (1.3.0-eap1, out now). Details are here, basically you can select a problematic macro form and say "this works like let" or "this works like def". It's pretty limited so far, but just with those simple cases I hope to solve perhaps 70-80% of the pain of this problem. Feedback welcome on whether this achieves that. Full support is on the way.

sfnelson commented 8 years ago

Thanks for this Colin, it's a big improvement. One major problem with the current incarnation is that these settings are stored in a user-specific location, rather than project specific. This means these settings cannot be easily shared amongst team members.

It would be extremely helpful if these settings could be configured per project, and stored in the project's .idea directory, e.g. codeStyleSettings.xml.

guv commented 8 years ago

How is the progress of the general solution for macros?

guv commented 7 years ago

Since it has been a while: Has the general solution remained a goal or has it been abandoned?

cursive-ide commented 7 years ago

The general solution remains a goal, definitely. I've made some steps towards this such as the "Resolve as..." functionality and the stubs generation. The full solution is a lot of work, but I'm working towards it bit by bit.

a613 commented 7 years ago

Still awaiting this. We use some libraries that generate symbols with macros (e.g. soda-ash), and we've got warnings about undefined symbols all over our code because of it. The emacs crowd is winning because their IDE actually recognizes that the symbols are used, and can even jump to the macro they're defined in.

cursive-ide commented 7 years ago

@a613 Libraries like soda-ash are exactly what the stub generation was designed for. I'll add that to the list, and try to bump up making the list configurable so you can add that sort of lib yourself.

sfnelson commented 7 years ago

While you're looking at macros, please take pity on https://github.com/cursive-ide/cursive/issues/953

MrEbbinghaus commented 6 years ago

Anything new on the topic of custom resolving? Maybe just a simple solution like a check against a regex for every symbol? Or the option to disable the resolving for a particular macro?

Additionaly: I have something like this:

(defaction my-action
  :contexts [a b]
  :parameters [c d]
  (do-something-with a b c d)

This may be a complicated matter as the symbols themselves are not known for the specific macro. I wonder If something like this could be done with spec.

guv commented 6 years ago

This is stalled for quite some time. Might there be a more pragmatic approach compared to your current envisioned solution for custom macro support?

p-himik commented 4 years ago

Another, and IMO the best, approach is to have a section similar to Settings | Editor | Language Injections that would allow customizing the way Cursive resolves unknown symbols. Ideally, such a mechanism would:

Allow explicitly specifying whether a particular resolution is project-specific or not (i.e. global)
Save its settings into a file that could be easily tracked with git (so, along with similar files in .idea)
Allow import/export so that people could share the settings for their library

aneilbaboo commented 4 years ago

Given that this has been a very long running issue, could we get a fix that simply allows turning OFF symbol resolution for the body of a specific macro? Perhaps this is already possible through the "resolve as" feature, but it's not obvious. It would be good if that were a top-level choice.

E.g., In this menu,

Allow the user to select "disable symbol resolution"

I know this particular macro could be addressed with defn, but others, for example, which use letfn in the macroexpansion won't have a similar form. (It's also really hard to figure out which existing form might match yours if it's not a common pattern). I really would like a way to just turn it off without losing the highlighting in other parts of my code.

tekacs commented 4 years ago

@aneilbaboo this was just resolved in #2417!

cursive-ide / cursive

meta-data to identify macros that introduce new symbols #147

the extension API described above - that should hopefully be coming soon.

Allow the user to select "disable symbol resolution"