Implement new middleware op: `suggest-libspecs`

https://github.com/clojure-emacs/refactor-nrepl/tree/v3.5.4#namespace-aliases is currently used for cljr-slash. As the op name says, it returns the namespaces aliases contained in the current project.

However, a more sophisticated (and context-aware, for .cljc files) cljr-slash needs more than simply a project map. We want to compute suggestions, based on logic, besides from project data.

The best place to perform this logic is refactor-nrepl (vs. clj-refactor), so we'd need a new piece of middleware. This prevents namespace-aliases from being bloated and possibly inflicting breaking changes.

As an additional benefit, the middleware responses for cljr-slash will be much smaller (just the useful info vs. a huge map, 99% of which will be unused).

The `suggest-libspecs` middleware op

suggest-libspecs takes as inputs:

a lib prefix (e.g. io, representing what the user just typed io/, triggering cljr-slash)
a language context
- one of: clj, cljs, cljc
- Represents the extension of the user's filename
A set of preferred aliases
- Represents the whole value of the cljr-magic-require-namespaces defcustom
- See also: https://github.com/clojure-emacs/clj-refactor.el/issues/530

out of those inputs, and of project analysis, it decides which libspec(s) can be suggested to the user.

The output format is as follows:

["[namespace :as alias]"
 "[another-namespace :as alias]"]

i.e. it is a simple sequence of choices, ordered by inferred likelihood of choice.

There is no need for metadata e.g. {:context :clj}, since the middleware already returns choices that are valid for the provided inputs.

Choices might be expressed as reader conditionals:

["[#?(:clj foo.clj :cljs foo.cljs) :as alias]"]

...such reader conditionals would reflect the already-used reader conditionals from a given project, or be synthetically built when it makes sense.

.choices can include :refer clauses in a couple cases:

for clojure.test in particular, where often one always wants the same :refer clause
when it would enable a valid/useful reader conditional

other than that, a generalized inclusion of :refers is not foreseen.

...and similarly, :include-macros true clauses can be included.

When nothing can be suggested, an empty sequence [] will be returned.

Suggestions are suject to caching.

Example stories

1

Given the user intent `set/`:
When [clojure.set :as set] is specified in clj-refactor's cljr-magic-require-namespaces as OK in both contexts: clj, cljs
And there are no overlapping mappings (e.g. no `[foo :as set]`) in the project's namespaces
Then returns [clojure.set :as set] for set

2

Given the user intent `set/`:
When [clojure.set :as set] is specified in clj-refactor as OK in both contexts: clj, cljs
And there is an overlapping mapping (e.g. `foo :as set`) in the project's namespaces
And [clojure.set :as set] is not used in the codebase
Then returns only foo, because it's already used in the project, and individual projects' practices override Emacs preferences.

3

Given the user intent `set/`:
When [clojure.set :as set] is specified in clj-refactor as OK in both contexts: clj, cljs
And there is another mapping (e.g. `foo :as set`) in the project's namespaces
And [clojure.set :as set] is also used in the codebase
Then returns both

4

Given the user intent `io/`:
When [clojure.java.io :as io] is specified in clj-refactor as jvm-only
And there is another mapping (e.g. `foo :as io`), but only defined in a .cljs file (and/or is determined to be a cljs-only ns i.e. it cannot be `require`d)
And the requested context is .clj
Then returns only [clojure.java.io :as io], because cljs-only namespaces don't influence logic for .clj files
  * The same is true for the opposite direction: clj-only namespaces don't influence logic for .cljs files
    * Can be determined by calling `io/resource` instead of `require`

5

Given the user intent `io/`:
When [clojure.java.io :as io] is specified in clj-refactor as jvm-only
And there is another mapping (e.g. `foo :as io`), defined in a .clj/.cljc file within the project
And the requested context is .clj
Then returns only [foo :as io], because it's already used in the project, and individual projects' practices override Emacs preferences.

6

Given the user intent `io/`:
When [clojure.java.io :as io] is specified in clj-refactor as jvm-only
And there is another mapping (e.g. `foo :as io`), but only defined in a .cljs file, which however can be required in jvm clj (because that libspec is defined in a .cljc file)
And the requested context is .cljc
And `foo :as io` is never required jvm-side in this project 
Then returns 2 things: non-reader-cond form (foo as io) and reader-cond form (clj: java.io :as io, cljs: foo :as io)
  * in that order

7

Given the user intent `io/`:
When [clojure.java.io :as io] is specified in clj-refactor as jvm-only
And there is another mapping (e.g. `foo :as io`), but only defined in a .cljs file, which however can be required in jvm clj
And the requested context is .cljc
And `foo :as io` is required jvm-side in this project in at least one other ns 
Then returns [foo :as io], because it's already used in the project, and individual projects' practices override Emacs preferences.

8

Other test cases found in the GH thread below.

General guidelines for suggestion logic

Assume that project consistency for aliases is the right thing, and nudge users into that direction
- ref, ref
However, don't make wild guesses about what the user meant: offer N choices when there's a reasonable place for doubt
- this can be seen in the difference between examples 6 and 7 from Example stories (in 6 we cannot so aggressively guess what is meant; in 7 the user has already implicitly expressed his intent by using certain customs in other existing namespace files)

Thank you for carefully going through all this and writing it up. I think overall the API spec makes a lot of sense in terms of inputs and the shape of the outputs. I'm also excited about the potential that future recommendations could handle refers, include-macros and reader conditionals more easily.

I have a few comments and concerns though:

I agree we should nudge the users towards consistent namespaces aliases, however, I believe we should do that by ranking suggestions, not by removing available options. I think if a user specifically chose to add a magic require namespace, and it's valid for that context, then they would be surprised if it was not included in the candidate list, even if it's overlapping an existing usage of that alias. If you are convinced we need to elide those suggestions that's your prerogative, but I wanted to voice this perspective.
For the language context parameter, it says the context is computed by the filename. What should it do if the suggestion is executed inside of a specific context block within a cljc file? Ie if the user types: #?(:clj (io/ in a cljc file, should the context to the API be a cljc context because of the filename, a clj context because of the reader conditional, or some combination? My inclination would be towards a combination like cljc,cljs or cljc,clj to allow future recommendations to suggest the correct reader conditionals on the new referral. I don't think we need to support that now, but something to consider for expected API inputs.
As I mentioned in the issue for recommended requires, cljs, clj and cljc are not the only valid language contexts for reader conditionals. The Clojure docs list :cljr and :default, and Babashka supports :bb. Again, I think it's fine to limit the options for now, but worth considering for the future.
I see you mention multiple return values right near the end for multiple matching aliases, but I'm not seeing any of the 7 examples listed showing a case where a project already contains multiple namespaces mapping to one one alias? I think a worst case of this might be something like:

Given the user intent `io/`:
When [clojure.java.io :as io] is specified in clj-refactor as jvm-only
And there is another mapping (e.g. `foo :as io`), but only defined in a .cljs file, which however can be required in jvm clj,
  and a mapping `bar :as io`, only defined .clj files, which can also be used in clojurescript.
And the requested context is .cljc
And `foo :as io` is never required jvm-side in this project, and `bar :as io` is never required on the CLJS side.
Returns `bar :as io`, `foo :as io` and some combination of reader macros?

Even without the overlap with the clj-refactor magic namespaces feature, I think it would be helpful to a consider a few cases with duplicate alias namespace usage like:

foo :as io is jvm only, bar :as io is cljs only, context is cljc foo :as io is cljs only, bar :as io is cljs only, context is cljc foo :as io is valid for jvm & cljs, bar :as io is cljs only, context is cljc foo :as io is valid for jvm & cljs, bar :as io is jvm & cljs, context is cljc foo :as io is valid for jvm & cljs, bar :as io is jvm & cljs, context is clj foo :as io is valid for jvm & cljs, bar :as io is jvm & cljs, context is cljs etc

I don't think we need to enumerate all of them in the detailed way you have above, but I'm think it would be helpful to understand what you think should be returned in a few of those cases to ensure we are understanding each other.

Thanks again for carefully writing this up. While I see that you closed the existing PR, I think the majority of that code should support using this new API as the input. Once we have discussed some of these issues above (in particular question 4), I'll take a look at trying to use that new API with the old code.

For point `1`

I see your point, and it makes sense. This is the kind of decision for which one could as well flip a coin :)

The "decider", for me, is that if we return 1 choice instead of two, that will result in no prompt at all, which is a more streamlined experience.

Also, while one's preferences might be, say, [clojure.java.io :as io], that would be the user's "base preference". That user might work on a team setting where io means something else. So I don't think we're, say, deliberately ignoring users - instead we're observing that declared preferences can be overriden by de-facto usages (on a per-project basis). And users might actively appreciate that!

Note that we're not always agressively nudging users, the last paragraph in OP explains a nuance.

For point `2`

This ended up being the actual expectation: https://github.com/clojure-emacs/refactor-nrepl/blob/74dada947f0f2e1775719eb3d8c0a96b837e65f0/src/refactor_nrepl/ns/suggest_libspecs.clj#L14

i.e. it's compatible with the use case you describe. language-context will simply mean that, "language context". refactor-nrepl won't care how is it gathered.

If you hit cljr-slash within #?(:clj |), the language-context (to be sent from clj-refactor.el) would be clj.
If you hit cljr-slash within #?(:cljs |), the language-context (to be sent from clj-refactor.el) would be cljs.
If you hit cljr-slash within #?(|) (which should be a quite extreme edge case), the language-context (to be sent from clj-refactor.el) would be cljc.
If you hit cljr-slash outside a reader conditional in a .clj file, the language-context (to be sent from clj-refactor.el) would be clj.
If you hit cljr-slash outside a reader conditional in a .cljs file, the language-context (to be sent from clj-refactor.el) would be cljs.
If you hit cljr-slash outside a reader conditional in a .cljc file, the language-context (to be sent from clj-refactor.el) would be cljc.

For point `3`

Yes, we won't place artificial limitations, and the middleware can always be improved iteratively.

For point `4`

I see you mention multiple return values right near the end for multiple matching aliases, but I'm not seeing any of the 7 examples listed showing a case where a project already contains multiple namespaces mapping to one one alias?

An easy case would be when I type foo/ and the project namespaces contain [a :as foo] and [b :as foo]. Then the middleware suggests both choices.

I think a worst case of this might be something like [...]

I'll check it out carefully at some point this week. Either way, the final feature will have all these stories as unit tests (in table form via are - no yucky user stories :) ). Which is to mean, you'll be able to check them out and raise concerns. Such tables tend to be great for iteration.

I think it would be helpful to a consider a few cases with duplicate alias namespace usage like:

I will in the mentioned table. Thanks!

I think the majority of that code should support using this new API as the input.

Would be happy about that 🍻

Thanks for the careful response, I look forward to further details on point 4 when you get the time. I wanted to leave one more comment about point 2 though, as I think there may be some nuance that was lost. If we only send the current context and ignore if the filetype is cljc vs clj or cljs, I believe it will make it harder to recommend a require with an appropriate language context block.

More concretely if operating in a cljc file, and the user types something like #?(:clj (io/, I believe the correct libspec to insert in the require would be something like #?(:clj [clojure.java.io :as io]), as clojure.java.io is not applicable in cljs, so it shouldn't be required in that context. However, if io/ is executed in a clj file it should only insert [clojure.java.io :as io]. Finally, if io/ is executed in a cljc file, outside of a language context, but there are cljs, and clj specific usages in the project, then I think the correct require is #?(:clj [clojure.java.io :as io] :cljs [foo :as io]).

My concern is, if the language context only reports the current active context, how can the middleware differentiate between a top level language context or a nested one? That was why I was inclined to send cljc,cljs or cljc,clj for a nested context as I believe that gives sufficient information to isolate those cases. Again, I'm fine with only supporting top level contexts initially, but I think it's worth keeping this nuance in mind while designing the API.

More concretely if operating in a cljc file, and the user types something like #?(:clj (io/, I believe the correct libspec to insert in the require would be something like #?(:clj [clojure.java.io :as io]),

Alright, got it. Thanks much!

Probably a good API would be to accept these two:

buffer-language-context
- represents the file extension (or more rarely, major-mode, repl type, etc)
- one of: cljs, clj, cljc
input-language-context
- represents the context of what the user is typing.
- outside a reader conditional, its value should be identical to that of buffer-language-context
- inside a reader conditional, its value should be typically one of clj, cljs (not cljc)
- for the edge case of typing #?(io/), you can default to buffer-language-context

LMK if it sounds good to you.

Since the middleware is just a POC I won't immediately add this refinement (it will be avaliable within a couple weeks). You are free to send these two keys in addition to the now-legacy language-context key.

Thanks, that makes sense to me. I'll take a look at re-implementing the completing-read prompt using this API later this week.

@dgtized : we now have refactor-nrepl 3.7.0 / clj-refactor 3.7.0 (to be visible in MELPA soon).

Would be happy to hear how it works. After that and a few refinements (like ensuring a stable sort order for the suggestions) we can change the cljr-slash-uses-suggest-libspec default.

clojure-emacs / refactor-nrepl