Closed bellini666 closed 2 years ago
For cwd
, a good solution might be to use the same function nvim-lspconfig uses to resolve root_dir
:
local sources = {
null_ls.builtins.diagnostics.flake8.with({
cwd = function(params)
return require("lspconfig")["pyright"].get_root_dir(params.bufname)
end,
}),
}
This would remove the dependency on pyright
launching, and you could reuse the same function for multiple sources for the same language (or even make a factory to handle different language servers).
Prioritizing project executables is a bit tricky. You can see here the logic I used to do this for Node in a separate plugin that integrates with null-ls. The idea was to traverse the current file's parents and check if node_modules/.bin/$CMD
is executable at each parent. I took a stab at making that logic a bit more generic and came up with this:
local lsputil = require("lspconfig.util")
-- for node, the prefix would be "node_modules/.bin"
local resolve_local_executable = function(cmd, prefix)
cmd = prefix and lsputil.path.join(prefix, cmd) or cmd
return function(params)
local client = require("null-ls.utils").get_client()
local cwd = client and client.root_dir or vim.fn.getcwd()
local local_bin
lsputil.path.traverse_parents(params.bufname, function(dir)
local_bin = lsputil.path.join(dir, cmd)
if vim.fn.executable(local_bin) > 0 then
return true
end
local_bin = nil
-- use cwd as a stopping point to avoid scanning the entire file system
if dir == cwd then
return true
end
end)
return local_bin or cmd
end
end
If that seems useful (it solved things for Node, at least) we could streamline it into a config option to avoid having to fill in the command, ideally something like this:
local sources = {
-- no prefix
null_ls.builtins.diagnostics.flake8.with({
prefer_local = true,
}),
-- prefix? unless this is confusing
null_ls.builtins.diagnostics.prettier.with({
prefer_local = "node_modules/.bin",
}),
}
I don't work with monorepos, so I'm not sure if this solves your issue. I'm open to other ideas to improve support here as long as they aren't language-specific (in which case I think they belong in a separate plugin).
Hey @jose-elias-alvarez (btw, by your name it looks like to me you are brazilian. If so, cheers from another brazilian myself here :) )
Thanks for the explanation. I still have to test it here, but if I understood correctly the code, params.bufname
is the absolute path to the file currently opened at the buffer right? So it would transverse starting from it and upwards, until it finds the my custom local bin.
If so, then yes, I think it totally works. And the implementation with prefer_local = ".venv/flake8"
would be very easy to configure and handle a lot of cases.
Also, even though the solution you proposed for the cwd
works fine, could there be a way to integrate it with prefer_local
. For example, if you have prefer_local = "node_modules/.bin"
that means that it is very likely that you will have a .prettierc
in that same directory (or in flake8's case, a .flake8
file). In that case, since null-ls already found the local bin, the path where that local bin is should be the same as the cwd
that we want null-ls to run from.
So, we could think of 3 possibilities to solve the whole problem with only the prefer_local
option:
1) If prefer_local
is defined and it is found, it will default the cwd to the path where local_bin was found (e.g. somefolder/node_modules/.bin
the cwd
is somefolder
, otherwhise is the default one. Obviously one can still pass a cwd
function to customize it.
2) Don't force the cwd
by prefer_local
, but pass as a param to the custom cwd
function that the local bin was found in that specific dir (which could default to root_dir if it wasn't found or be set to nil). Then it could choose to return that.
I like the first approach more, but both are suggestions :)
What do you think?
Haha, not Brazilian but Venezuelan! Your understanding is correct. One small thing that I didn't mention is that when command
is a function, it's only called the first time that the source runs. (This happens when a file is first opened for diagnostics sources and when a file is first formatted for formatting sources.) This means that if a project contains multiple local copies of an executable, command
will stay "stuck" on the first copy it finds. Again, I don't work with monorepos, so do you think this case is common enough to be an issue?
I also like the first approach where we automatically set cwd
based on the path of the local executable. From what this user mentioned on a related issue, it wouldn't handle every case, but as you said we can use it as a default only if the user doesn't manually define a cwd
function.
Let me know once you're able to test out the code and verify that it solves your case, and let me know about the command
issue above. It should be fairly straightforward to implement this, so I can start working on it once I have your confirmation (unless we need to make command
completely dynamic, which would be tricky).
Hey @jose-elias-alvarez ,
Venezuelan, close enough hermano :)
Just tested and the solution works fine! I would say that, if implementing that prefer_local
, it would be nice if it could receive a list of possibilities, so it would check each one, starting from the first one, up until the ls-null root and, if none are found, than the default one is returned.
Now, regarding the command
issue... It is not a problem for the projects I'm working on now but I can see it being a problem in some cases. E.g. I created another directory with a python file in my repository and, if I open that file first, since it is not a descendant of the main python directory, it will stick with the default flake8
and use it for when I open other python files from that main directory.
Maybe each buffer should have its own command
, just like cwd
? And based on the suggestion on linking both, where cwd would use the one found by prefer_local, it actually makes sense, or else it would stick to that too.
Let me know of what you think and if you need any help with that. I'm not very proficient with lua (just ported my 10 years old vim config to lua 3 days ago, never used it before =P), but it is a very easy language to work with
Thanks for the feedback! I can see that having a dynamic command could be important for some monorepo structures.
I think a good starting point would be to avoid changing the current command
field and add a new field (something like dynamic_command
) that runs every time the source runs. Setting prefer_local
will automatically create a dynamic_command
to find the nearest copy of command
in the project. Also, if the user hasn't specifically defined a custom cwd
, it'll also cause cwd
to inherit the base path of the resolved executable.
The time it takes to run the search is small, but in the future we can improve performance by saving the last resolved command
and cwd
to a table indexed by buffer number. In cases where the scan finds multiple possibilities, we could then ask the user which one they'd like to use for the current buffer and save that to the table to avoid having to ask each time.
How does that sound?
Well, this seems fine to me! :)
I'll commend on some things that I notice, they might or might not make sense (I'm just looking through the code, I'm no expert in it yet), so take them with a grain of salt =P
1) I saw in this line that the cwd function gets called at the exact moment when the source is going to run. Wouldn't just changing the command
to run there too basically fix the issue?
Yeah, I know that you probably don't want to do too much every time the source is going to run for performance reasons, which leads me to my next point:
2) I think it is safe to argue that although command
and cwd
can be different for each buff (the whole idea of this issue), once you find them for that buff, they should not change.
So, considering that even cwd
gets called everytime I type something at the same buff (I tested here by putting a print("foobar")
inside my custom cwd function), both could see a good performance improvement if they get called once for the buff.
So, if my lsp-config understand is correct, the on_attach
function gets called once (and only once) everytime a new buffer gets opened so the language server can "connect" to that buffer. This seems like a good opportunity to call the cwd
function and the command
function and cache them for that buff in question. That way, no need to even call cwd
everytime the source is going to run
For example, I do that for pyright
to set my custom python interpreter when it inits:
nvim_lsp.pyright.setup({
before_init = function(_, config)
local p
if vim.env.VIRTUAL_ENV then
p = lsp_util.path.join(vim.env.VIRTUAL_ENV, "bin", "python3")
else
p = find_cmd("python3", ".venv/bin", config.root_dir)
end
config.settings.python.pythonPath = p
end,
})
I actually use before_init
instead of on_attach
because I want to set the pythonPath
before it starts, but the idea is the same. It still gets called once per buffer. For example, with that if I open a file in backend/foo.py
and it finds python at backend/.venv/bin/python3
it will use that for that buffer. If I open another one, for example another/bar.py
and it finds another venv (or even don't find one), it will use that other one and start a separated ls for it.
obs. for curiosity, that find_cmd
is a slightly modified version of your function that used to there. It is defined here in my dotfiles. The one I pass to null-ls
is this one, which wraps that find_cmd
So, if that all makes sense, by doing that caching in before_init
or on_attach
we could just define the prefer_local
(as a list of possibilities would be awesome) which would get called right away to retrieve the command and a better cwd (if it find the local bin, of course)
What do you think? Like I said above, I'm no expert in this and you know your code much better than I ever will. Maybe my thoughts doesn't apply to null-ls for some reason that I can't see without further investigation.
Really appreciate the thoughtful feedback. For your first point, the issue that I ran into when first working on this is that command
is displayed and serialized in a few different places, so we need it to be a string once the source runs. Also, the current structure lets us validate that the resolved command is actually executable, which in my experience is a very common cause of issues. That's why I think using a different config key for dynamic commands might be the right call.
For your second point, wrapping before_init
or on_attach
to do buffer-specific things is a really interesting idea that I hadn't considered before, but I think that assuming that command
and cwd
are static for the lifetime of the buffer might be too inflexible. For example, I can imagine wanting to use a different command based on the presence of a config file or something else about the project's state that might change while the buffer is loaded. I think it's a sensible default for the prefer_local
option, though, and I agree that it could improve performance, especially for diagnostic sources that are spawned frequently. We could keep the current structure and maintain a cache table in the built-in's closure to achieve a similar result.
I can imagine wanting to use a different command based on the presence of a config file or something else about the project's state that might change while the buffer is loaded
Is this something common? Like, even in my case, I would understand opening a file and flake8
sticking to the system's binary even after I create a virtualenv inside the main dir, and to get it to use it I would need to "reload" neovim, or at least reload the LSP. Just like neovim, after I open it, if I modify my init.lua
file and add a mapping there I don't expect that it will register it unless I reload that source or close and open it again, but there might be some specific case that you have in mind that I'm not familiar with.
Having said that, either going full cache on before_init
/on_attach
or giving the possibility of those being changed on each run, if that means that each file can have its own command
/cwd
then the problem this issue is bringing to the table is totally solved!
Just to give my some more thoughts about the issue, maybe the caching thing can be decided like this (just some thoughts):
1) Someone uses the builtin directly, without providing a function to command or cwd. Then those can be cached at before_init/on_attach (cwd will basically be the found root_dir) and null-ls don't ever need to ask for them again
2) Someone uses the builtin but asks to prefer_local
as a list of possibilities. At before_init/on_attach null-ls checks for those and, if any of those match (giving priority to the order of the list) then that bin is used as command and the cwd where it was found is used as the cwd. Those are cached and null-ls don't ever need to ask for them again
Note here that 1 and 2 should, AFAIK, cover most user's use cases, including all that I pointed in this issue. And they should be very performatic since null-ls should not have to ask for them ever again.
Now for the extra cases:
3) If the user provides a cwd
function it will be used in preference to the root_dir cwd and also the prefer_local
cwd. null-ls asks for it at before_init/in_attach and don't ever need to ask for it again.
4) Just like for cwd
, providing a custom command
would do the same as above for it, caching it too.
5) Now, here comes the part where everything changes: A new option, cache_options
(or something like that), which could be initially set to true (I believe that most would want the cache) where, if one needs cwd
or even command
to be asked every time, than he could set this to false
and the function passed to one or both cwd
and/or command
would get called at all runs.
Obviously the command
one would need more time to really be implemented, like you said, but the cwd
one is just flagging it as it already acts like cache_options=false
when there's a function there.
That cache
could also be separated into parameters for each one, or even be a table mapping each kind (e.g. cache = {command: true, cwd: false}
)
Again, just some ideas! Feel free to ignore them. At the end of the day, adding the prefer_local
and retrieving the command and cwd (when it is not a function provided by the user) for it for each different file/buffer, than this issue is totally solved (with the small gain in performance by caching cwd at before_init/on_attach time). The rest are improvements that solves other issues and improves the overall solution, but doesn't prevent it from starting as simple as this.
Please tell me if you need any help testing or even with the development of this!
If it's of any use, I have a similar ish case (which overlaps with https://github.com/jose-elias-alvarez/null-ls.nvim/issues/270):
If a Ruby project has a Gemfile.lock
, I want to run RuboCop with bundle exec
but only if the Gemfile.lock
includes RuboCop. If it doesn't include it, I don't want to run RuboCop at all.
I sort of managed to achieve it like this:
nls.builtins.diagnostics.rubocop.with({
command = 'ruby-exec',
args = vim.list_extend(
{ 'rubocop' },
nls.builtins.diagnostics.rubocop._opts.args
),
runtime_condition = function(params)
return not rubocop_with_bundler(params.client_id)
end,
}),
nls.builtins.diagnostics.rubocop.with({
command = 'ruby-exec',
args = vim.list_extend(
{ 'bundle', 'exec', 'rubocop' },
nls.builtins.diagnostics.rubocop._opts.args
),
runtime_condition = function(params)
return rubocop_with_bundler(params.client_id)
end,
}),
Here rubocop_with_bundler
is a function that takes the LSP client ID (to get the root directory), then determines if bundle exec
should be used. The ruby-exec
command is just a small shell wrapper I use to ensure the correct Ruby version is used.
Since runtime_condition
is called every time the linter is invoked, I do make sure the result of the function is cached. This does lead to a bit of boilerplate, but it's manageable.
Probably the most annoying part is having to list the RuboCop linter twice in the sources list, rather than once with a dynamic command and arguments.
@bellini666 Thank you again for the feedback! I don't have a specific case for dynamic commands in mind, but working on this project has taught me that there is an incredible variety of use cases and project structures out there, so I prefer to keep things flexible and leave it to users / plugins to solve language-specific issues.
For now, I think it's best to keep things simple, and the simplest option I can think of is to have prefer_local
always cache command
and cwd
per buffer on the first run. Fundamentally, this is just a wrapper for functionality that users can already implement, so I don't think we need to handle every possible config option, but I'm always open to improving the user experience and we can definitely keep working on this in the future.
@YorickPeterse Maybe I'm misunderstanding, but would setting args
dynamically solve the issue?
nls.builtins.diagnostics.rubocop.with({
command = "ruby-exec",
args = function(params)
return vim.list_extend(
rubocop_with_bundler(params.client_id) and { "bundle", "exec", "rubocop" } or { "rubocop" },
nls.builtins.diagnostics.rubocop._opts.args
)
end,
})
@jose-elias-alvarez hahaha yea, I know what you mean :)
And that option seems like the best one for me too. It solves this issue and also improves the overall performance by caching command/cwd for people that have those defined.
@jose-elias-alvarez Dynamic arguments is something I found out about after writing my comment. It indeed cleans things up. However, it only works because my command
is a static wrapper. Without that, I'd need a dynamic command
as well.
Ah, I see now. That case should be handled by dynamic_command
(or whatever we end up calling it), which will also be added in the process of handling everything else mentioned here.
FYI lspconfig defaults to cwd == resolved_root for servers now
@bellini666 Could you (and anyone else interested in this) try #349?
Hey @jose-elias-alvarez ,
Just tested here and the selection of the local bin worked! But not the cwd.
To give a proper explanation, I have this structure here:
So, when I set prefer_local = ".venv/bin"
on flake8/black/isort they were retrieved fine from the .venv dir and they were used to run the commands instead of the global ones on some_file.py
. This worked perfectly!
Now, the cwd
they ran from was the root_dir
and not the backend
dir, which was the expected one and where the config files for those bins are.
So, for me that is the only thing missing! When the prefer_local
bin is found, use that cwd where it was found as the cwd to run the command. In that case, since prefer_local
is set to .venv/bin
the cwd would be backend/
.
@bellini666 Whoops, totally forgot about the cwd
issue. I just pushed another commit to the PR that should take care of it, I think. Let me know.
Hey @jose-elias-alvarez
Now it worked! :)
I have a small corner case, which will not affect me, but want to make sure to comment here. Take this as an example:
When I opened some_file.py
now it worked perfectly! cwd as set to backend and it used flake8 from .venv/bin directory.
Then I opened at another buffer the bar.py
file. It also worked perfectly as it used my global flake8 and the root_dir as cwd.
When exchanging buffers, each one uses the correct found binary (the global one for bar.py and the local one for some_file), which is correct! But after opening bar.py
, the cwd
changed not only for it, but also for from some_file.py
, so it started to use the wrong cwd too.
Like I said, it is a very specific corner case that doesn't actually affect me, but you might want to workaround that too.
Other than that, everything working really well! :)
@bellini666 Right, I see that we should also index cwd
by bufnr
to handle cases like this. I just pushed a commit that should help with that, too. Could you let me know if that takes care of it?
@jose-elias-alvarez when trying the last change I got this error (don't know how to get it as plain text):
@bellini666 Sorry, there was a typo when returning the cached command. Could you try again?
@jose-elias-alvarez np! :)
Now it is perfect! No remaining issues.
Hey @jose-elias-alvarez ,
Eagerly waiting for this to be released! Anything else that you need to test that I can help with?
@bellini666 Just merged it, thanks again for all the feedback and ideas! Let me know if you run into any issues.
@jose-elias-alvarez awesome, it is me that thank you for the attention :)
I'll surely let you know if I run into any issues
@jose-elias-alvarez btw, the change to prefer_local instead of having it running my custom function everytime made everything a lot faster. Like, to the point that I can notice it linting and formatting faster than before. Thank you very much for this!
@bellini666 Awesome, great to hear!
To @jose-elias-alvarez:
Hi, I'm using PDM. It will install pyright
, autopep8
, flake8
locally inside the folder myproject/__pypackages__/
, and I will have to prepend commands like pdm run autopep8 ...
(i.e. with pdm run ...
) to use these local packages. How should I set this up with null-ls
? My current idea is that I will have to modify some field of the setup { }
function to prepend the pdm run
to the cmd
(e.g. as indicated in the nvim-lspconfig
), but I found it very unclear on how to do so currently :/
[...] and I will have to prepend commands like
pdm run autopep8 ...
(i.e. withpdm run ...
) to use these local packages.
good news: I just resolved this with these settings, e.g. for flake8
:
diagnostics.flake8.with({
command = 'pdm',
args = { 'run', 'flake8', '--stdin-display-name', '$FILENAME', '-' },
timeout = 10000
}),
bad news: I have to save twice after the async linting:
I did try BufWritePre
and BufWritePost
, the former seems to be faster. But I still expect that the save will happen after everything is done.
null_ls.setup {
sources = sources,
on_attach = function (client, bufnr)
require('lsp_config.common_setup')(client, bufnr)
-- TODO: have to save twice for async formatting, since it's too slow.
vim.cmd("autocmd BufWritePre <buffer> lua vim.lsp.buf.formatting()")
end
}
@nyngwang By default vim.lsp.buf.formatting()
will not save the file after it runs, so what you are seeing is expected behavior, regardless of whether you're using BufWritePre
or BufWritePost
. You'd either have to override the handler or use something like this custom function to apply results and save.
@jose-elias-alvarez: I'm using tsserver
and eslint
(setup via nvim-lspconfig
), and their on-save behaviors are very smooth(it looks like there is only one save, but I'm not sure) without additional settings. Is it possible for null-ls to be as fast like them in the future? (Or at least the linked setup could be somehow simplified.)
@jose-elias-alvarez: I'm using
tsserver
andeslint
(setup vianvim-lspconfig
), and their on-save behaviors are very smooth(it looks like there is only one save, but I'm not sure) without additional settings. Is it possible for null-ls to be as fast like them in the future? (Or at least the linked setup could be somehow simplified.)
It's likely that something else in your config is affecting this, because absent other factors, the way you set up formatting in the snippet above should require saving twice if the file is actually formatted (once to trigger vim.lsp.buf.formatting()
and again to write any changes made). It's not really a speed issue, and I also don't plan on significantly deviating from the LSP formatting API.
Anyways, this is getting away from the original issue, so if we want to continue this discussion I think we should move it elsewhere.
Issues
Feature description
Hi,
Where I work we use a lot of monorepos, and those doesn't even contain the same language (e.g. in a project the
frontend
directory will be a typescript project while thebackend
will be python)The way null-ls works has 2 issues for monorepos like those:
1) Some linters/formatters needs to run with their
cwd
set to the directory where the configuration is defined. For example, inside thebackend
folder we have a.flake8
file which configures flake8, but if null-ls runs it from the root (the directory above, where .git is) it will not use those configs.I could get around this by passing a custom
cwd
function, but I'm not sure it is the proper way (I'll explain why bellow)2) The major issue: We need to run the specific installed linters/formatters inside those projects. For example, we have some
flake8
plugins installed in the project which will not be included in the global flake8 binary. Idially theflake8
installed in the project should be used.To workaround that I did the something like this:
What is the problem there? That
ROOT_DIRS.python
is defined like this:So, I'm depending that the python LSP runs first because it will find the properly root dir for that specific file in the buffer, so that when null_ls is going to retrieve the command, I already have the python root dir.
I cannot simply set the root dir to something else because each one will have its own root and its own formatters/linters (in my example, I want to run
flake8
,isort
andblack
iside thebackend
dir andprettier
andeslint
inside thefrontend
dir). I did the same for each source, I'm not showing everything here for simplicity =PIn summary, this works and I got everything running. But I'm afraid of that "if null-ls asks for the cmd first?" and also it doesn't seem to be a clean and elegand solution.
Maybe there's a way to improve that kind of workflow?
Help
No
Implementation help
No response