luabud opened 1 year ago
Just as feedback, I personally have been using multi-root satisfactorily. I've hit a couple of bugs here and there over 1.5 years of usage (one where the Python extension went haywire and hogged the machine for no apparent reason; recreating the multi-root workspace fixed it), and now this problem with the Black extension that you cite above (though I believe it's not a problem with the Python extension, which works correctly and provides autocompletion and everything as expected).
Happy to provide feedback, I use this at work on a daily basis :)
@rdbisme For our use case multi-root workspaces don't work, since we have tooling in our monorepo (Pants) which requires top-level files to be present in the root folder to define "metadata" about how each project in the monorepo should function (e.g. specifying Python virtual envs, building and running tests selectively). So our monorepo is not just a bunch of folders opened in a single VSCode window. If it were, there would be no issue and we'd gladly use multi-root workspaces.
See my post from which this issue is linked https://github.com/microsoft/vscode/issues/181845
Recently, I've published an Nx (monorepo tool) plugin, @nxlv/python, to enable Python poetry projects in Nx workspaces, and the issue is very similar to what @yegorski described in his post https://github.com/microsoft/vscode/issues/181845.
Each project has a virtual environment. The ideal scenario is for the microsoft/vscode-python extension to switch to the sub-folder's virtual environment automatically, without manually selecting the interpreter or changing the settings.json file.
Unlike @yegorski's use case, I can use a project.code-workspace file and add a .vscode/settings.json for each Python project, but that's not the ideal scenario, and it might not be a good configuration if you have an Nx workspace with both Python and Node.js projects in it.
I agree with @yegorski's post: it would be awesome if the extension supported a .vscode/settings.json on each sub-folder.
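For reference, a minimal sketch of that workaround, with placeholder folder names and assuming each project keeps an in-project .venv:

```jsonc
// project.code-workspace — folder names are placeholders
{
  "folders": [
    { "path": "apps/api" },
    { "path": "libs/shared" }
  ]
}
```

```jsonc
// apps/api/.vscode/settings.json — repeated in every Python project
{
  // Point the Python extension at this project's own virtual environment.
  "python.defaultInterpreterPath": "${workspaceFolder}/.venv/bin/python"
}
```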
We are experiencing similar issues as @lucasvieirasilva and @yegorski. It boils down to the following in our case:
I need to programmatically set up virtual environments and assign workspace interpreters for users to simplify onboarding. As far as I know, there is no API exposed to update the workspace Python interpreter setting. Manually updating the SQLite database in which these settings are stored does not seem like a good solution, given that everything is stored under the ms-python key as a JSON blob. If it were more cleanly separated I would feel more comfortable doing this without fear of corrupting the rest of the user's state. However, an optimal solution would not require altering this database at all.
If we create a separate .vscode/settings.json in each workspace folder, we experience issues with IntelliSense not picking up the correct interpreter unless we close and re-open files. The button used to select an interpreter will say that no interpreter is currently selected until I close and re-open the file in question. Compared to setting the workspace interpreter(s), we also get terrible performance with this solution: on a medium-to-large codebase we have to wait almost a full minute before we can start following references etc., and that wait repeats for each new project I open files in. It would be cleaner and more maintainable if the Python extension supported assigning interpreters for sub-folders via a single settings.json file at the root of the project.
We, too, have additional meta-files and tools at the top-level and need these to be visible in the monorepo. As a workaround I have created the following structure:
- base-folder
- project-folder-1
- project-folder-n
- libs-folder-1
- libs-folder-n
This works, but the actual structure of base-folder is similar to this:
```
base_folder/
    meta_file_1
    meta_file_n
    libs/
        lib-1/
        lib-n/
    projects/
        project-1/
        project-n/
```
This means that all the project and library folders have to be repeated in the workspace below the base-folder. Opening files via Ctrl + P will open the actual folder, not the directory inside the base-folder, so I can't just add these folders and keep them closed at all times to prevent clutter. There are a few solutions that could help alleviate these issues:
1. Make it possible to add additional folders to workspaces, set their interpreter, and then hide them from the workspace. From my own testing it appears that the interpreter changes are still picked up correctly when navigating to individual projects inside the base-folder in the workspace. However, I have not been able to hide the folders added via "Add Folder to Workspace..." to remove clutter, even using third-party extensions.
2. Make it possible to assign a folder as a workspace, giving us the benefit of selecting a separate interpreter etc. while keeping the directory structure otherwise intact.
3. Allow users to create meta-structures inside a workspace. If the solutions above are not feasible, our experience would also be improved if we could create a workspace similar to:
- base-folder
  - projects/ (this is a meta-folder, for structure only)
    - project-1
    - project-n
  - libs/ (this is a meta-folder, for structure only)
    - lib-1
    - lib-n
I just wanted to start by thanking you all so much for taking the time to provide more context and feedback around your experiences with mono repos, it's super appreciated!
I don't often work with mono repos so I'm trying to get a better understanding of the typical workflow involved. Please correct me if I'm wrong, but from what I'm understanding there are two common approaches to how folks have been managing mono repos:
1. Using one shared virtual environment
2. Using individual virtual environments per project/subfolder
If a mono repo is set up in a way that one can use a shared virtual environment (i.e. there's no dependency conflicts between the projects in the mono repo), my understanding is that the current experience is good enough as one can simply open the root/base folder in VS Code, create a virtual environment on the project root and install all the dependencies on that same venv. The extension can automatically activate that venv and all actions can be performed inside it.
When there's a mono repo with individual virtual environments per project (because e.g. their dependencies can conflict with each other), but there's nothing on the root directory that is relevant/needed for all projects in the mono repo, then multi-root workspaces work well enough (except for the occasional bugs as called out by @rdbisme 😅). The extension will switch the selected interpreter to match the corresponding virtual environment inside the workspace folder where the file you open lives.
The problem is really when the root folder has metadata or files that are needed for the entire project, and each one of the projects has its own virtual environment. Ideally, you want to stay at the project root level and perform any/all actions from there, so opening the projects as workspace folders is not ideal. One could do as @Moortiii suggested and open the root folder as a workspace and then add the remaining project folders as their own workspace folders. There are a couple of issues with that:
"files.exclude"
to hide the project folders (which would be the opposite of what @Moortiii wants, I guess)Are there any other issues that I'm missing?
One more question that showed up as we discussed this issue today: IntelliSense seems to be a key feature that is missing when opening a mono repo as a single folder in VS Code. Are there any other features that people would also like to use but can't because the Python extension doesn't switch to the right environment? Or is it really just IntelliSense?
I think it's just IntelliSense for us. There might be other Python features. I guess ultimately the success criterion is that we can run the apps and debug tests in the correct virtual env.
I think IntelliSense support alone would suffice for us as well. However, I wanted to clarify that opening each project as its own workspace folder is far from an ideal solution. In a perfect world I would be able to open the root folder of my monorepo (not as a workspace) and assign separate interpreters to each subfolder as I please.
I also think it's important to ensure that this is easy to configure and can be done programmatically. Otherwise it is difficult to guarantee that all contributors to a monorepo are using the same setup and can start working right away.
Regarding dependency management and single interpreter vs. multiple interpreters I can chime in with my own experience. In a large project I find that it's easy to end up in a situation where transitive dependencies conflict with one another. Sometimes this is workable, but it can also lead to a situation that cannot be resolved without removing or downgrading certain dependencies. In the worst case this could prevent us from installing important security patches.
It also conflicts with the general strategy I've chosen, where the monorepo still builds and versions individual images which are composed into a larger platform. Service A will never depend on Service B, and as such I find it weird to introduce a strict limitation that their individual dependencies must be compatible at all times when this is not the case at runtime.
@luabud writes:
> 1. Using one shared virtual environment
> 2. Using individual virtual environments per project/subfolder
> If a mono repo is set up in a way that one can use a shared virtual environment (i.e. there's no dependency conflicts between the projects in the mono repo), my understanding is that the current experience is good enough as one can simply open the root/base folder in VS Code, create a virtual environment on the project root and install all the dependencies on that same venv. The extension can automatically activate that venv and all actions can be performed inside it.
We are in situation (1) above, and everything works fine with regards to the venv... but that doesn't end up being sufficient.
We have a common module (called, simply enough, "common") with code used by all the other modules (call them "A", "B", "C", etc).
We have installed "common" to the venv - for local building in all the other modules - but in the editor, we find that chasing down function definitions and the like takes us to the code installed in venv/lib/site-packages instead of to the common module in our code.
Maybe we have set it up wrong somehow, but I've been chasing documentation for a while now, and haven't found a better way. If I've missed something, I'd love to hear about it; if not, then this is another thing that one should be able to get working in a mono-repo.
@nkronenfeld have you tried installing the common module as an editable package? That would be like pip install -e ./common rather than pip install ./common. Doing that will create a link in your site-packages to the local source rather than (essentially) copying the files over. Once the files are copied over there's not really any way for an editor to know that the local source and the installed source are the same thing (and indeed they may in fact not be the same).
Pyright (the type checker upon which Pylance is built) includes support for an alternative way to work with monorepos. The feature is called "execution environments", and it's documented here. You can configure it using a pyrightconfig.json or pyproject.toml file within your monorepo's root directory. When using execution environments, you can work with a single-root VS Code workspace but specify different subdirectories within your project that represent different "executables". Each execution environment can have a different pythonVersion, pythonPlatform, and extraPaths. The feature assumes that there is a shared venv for the entire monorepo, which is an assumption that may not hold for some teams.
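For concreteness, here is a minimal pyrightconfig.json sketch of the feature; the directory names, versions, and paths are placeholders, not taken from anyone's repo above:

```json
{
  "executionEnvironments": [
    { "root": "packages/service_a", "pythonVersion": "3.11", "extraPaths": ["packages/common/src"] },
    { "root": "packages/service_b", "extraPaths": ["packages/common/src"] },
    { "root": "packages" }
  ]
}
```

As I understand it, entries are matched in order (the first root that contains a given file wins), and files outside every listed root fall back to the top-level settings.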
I'm curious whether this feature addresses some or all of the issues you have with the multi-root workspace approach. If so, it's perhaps something that pylance could further build upon as a solution for monorepos.
Both of these approaches (execution environments and multi-root workspaces) have pros and cons, and neither is a perfect solution for all use cases.
Sorry, I've been waiting to answer until I had time to sort everything out (which I still haven't).
Installing common as an editable package seems to have worked... but has messed up our build in other ways that I still have to sort out. I think they aren't major problems, I just haven't had time to deal with them lately, because it's been good enough. I have been able to follow code back to the correct source in VSCode, and I have been able to run without separately rebuilding common every time - both of which are big wins. So @PeterJCLaw thank you very much for that.
Execution environments also look like a good match - I will take a look at them too. Thank you too for that, @erictraut
We are using a monorepo with individual poetry venvs for each project, and editable installs. The experience has been pretty great: setting "python.defaultInterpreterPath": "./.venv/bin/python" in the workspace file ensures that the correct interpreter loads for the correct Python module. There is (or used to be) a bug where Pyright would load two linters/type checkers per Python file, one from the project folder and one from the root folder, creating weird flickering squiggles in the code. Disabling Pyright for the root folder fixed that.
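A rough sketch of a workspace file for that kind of setup (folder names are placeholders; the relative interpreter path resolves against each workspace folder, so every project picks up its own in-project venv):

```jsonc
// monorepo.code-workspace — folder names are placeholders
{
  "folders": [
    { "path": "." },                  // repo root, kept for top-level metadata
    { "path": "projects/service-a" },
    { "path": "projects/service-b" }
  ],
  "settings": {
    // Each project has its own in-project poetry venv (virtualenvs.in-project),
    // so the same relative path resolves to a different interpreter per folder.
    "python.defaultInterpreterPath": "./.venv/bin/python"
  }
}
```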
There is one minor detail that would be nice to improve, and that is the name of the interpreters when they are listed by VS Code. Changing the interpreter in a notebook currently shows a list of entries that are hard to tell apart (screenshot omitted), which is not super helpful. Would it be possible to include the name of the venv, like the terminal prompts do?
We'd love to see improvements in this area as well. We're using a monorepo with editable package installs. We use one project folder and venv per package (i.e. we don't add the root itself, just each package, as a project folder).
I would describe the IntelliSense behavior we see as erratic/nondeterministic. Sometimes it works and then stops working, without us changing the venvs, the "selected interpreter" settings in VSCode, or anything else I could reasonably imagine should affect IntelliSense.
Restarting VSCode often fixes the problem. Sometimes re-opening the file fixes the problem. Sometimes "Clear all interpreters" followed by resetting them again for each folder works.
It's really hard to pinpoint the cause because, as I said above, we get changing behavior without changing any configuration. I suspect it might be related to opening a file in one project directory followed by opening a file in another (thus "switching" interpreters).
Junior dev here. Just came across this thread and thought I would throw in my limited experience with monorepos and VSCode.
TLDR: I think vscode should add some way to modify/influence venv discovery.
I started a sample monorepo just testing stuff out to see how it would work. I haven't set up Pants or any other monorepo specific tools. I'm using Makefiles for managing builds, and setting up venvs.
It seems some people in this thread ran into issues with managing virtual environments in subdirectories. I ran into issues with that too, but found some solid workarounds that seem pretty flexible to me. The only tool I found I really needed was the Python Environment Manager extension: https://marketplace.visualstudio.com/items?itemName=donjayamanne.python-environment-manager
The extension makes switching environments effortless. Getting the MS Python extension to detect environments in subdirectories requires entering the path to the interpreter in the venv's bin directory, as the extension won't discover environments in subdirectories on its own AFAIK. Once the environment is detected, though, it can be easily switched to, and IntelliSense immediately works with the selected environment.
Environment discovery is definitely one of the places where I really feel there could be some easy improvements on VSCode's side. A workspace/user scoped configuration option to direct the python extension to look in specific directories would make the experience much better IMO.
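(For what it's worth, the closest existing knobs I'm aware of are the user-scoped python.venvPath / python.venvFolders settings, which point environment discovery at a central directory of venvs; a sketch with a placeholder path is below. They don't cover venvs scattered through workspace subdirectories, which is what's being asked for here.)

```jsonc
// User-level settings.json — placeholder path; this only helps if the venvs
// are collected in one central directory outside the workspace.
{
  "python.venvPath": "~/monorepo-venvs"
}
```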
One other thing I noticed while writing this is that some of the venvs I pointed to previously have disappeared from the Environment Manager's list. They seem to disappear after restarting VSCode unless you have the virtual environment active in some way, i.e. in a terminal. My guess is that the extension doesn't keep track of virtual environments across restarts and only displays what the MS Python extension is aware of. Oh well, it's definitely not perfect.
Cheers for the pointer to this issue @luabud. We've got a similar setup to a few people above: separate virtual environments per subfolder, using poetry and with editable installs.
Using multi-root didn't work out great for us because of the file duplication issue (when opening root and also subfolders), but also the need to manually add each subfolder to the workspace isn't ideal when you have lots of subfolders that change often and lots of people on the team.
I'm using virtualenvs.in-project, so scanning for the closest venv up the directory tree would work in my case to find which venv should be activated for each active file. A subfolder-specific settings.json would work too.
I've put this simple extension together to do exactly what @thomelane suggests, so please let me know if this solves the problem and feel free to contribute to it (https://github.com/teticio/python-envy). https://marketplace.visualstudio.com/items?itemName=teticio.python-envy
On the Pylance/Pyright side, I filed something related around walking upwards to find the nearest configuration for the analyses: https://github.com/microsoft/pylance-release/issues/5564
Otherwise, I think the scenarios described above are similar to issues I ran into, which actually pushed me towards using one Python project/dependency manager over another, given its ability to produce a top-level workspace .venv (something I mind less than having a top-level pyproject.toml).
Still, I think that because the .venv directory is not co-located with the pyproject.toml, the Python extension might not immediately suggest using it upon creation.
It would be nice to be able to explore the tests of the sub-module I'm working on, similar to how @teticio's python-envy works, by updating "python.testing.pytestArgs": ["..."] to point at the root of the current module.
@alita-moore That's an interesting idea. Would it make sense to add it (as an option) to python-envy? Would it just set python.testing.pytestArgs to the same directory where it activates the .venv? So the structure of your monorepo would have a tests directory and a .venv directory in every sub-module directory? Let me know and I can make the changes.
@teticio yeah, I think it makes sense to add to python-envy (awesome package, btw, thank you very much). It would set python.testing.pytestArgs to the root directory of wherever .venv is. For example, if you had a project structure of

```
folder/
    .venv/
    src/
        something.py
```
Then when you focus something.py, it would update python.testing.pytestArgs to reflect the path to the folder (in this case just folder), where perhaps it initially started out as . or the root directory of the VS Code workspace.
Note that I actually use individual test directories in my code, at the same level as the code they're testing, so I think having it not rely on a top-level test folder would be great. Thank you!
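For illustration, the value python-envy would write in that example (the path comes from the structure above; this is the proposed behavior, not something the extension does today) might look like:

```jsonc
// Workspace .vscode/settings.json while a file under folder/ has focus
{
  "python.testing.pytestArgs": ["folder"]
}
```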
hey all, just wanted to give an update that we are working on a prototype that would allow a better experience for mono repos, such as allowing interpreters to be associated with folders and individual files, without relying on multi-root workspaces. No ETAs, but hopefully this will be available soon :) I also wanted to link a somewhat related issue from upstream VS Code here: https://github.com/microsoft/vscode/issues/32693
Just noticed that the notebook env selection now shows the paths so they can be distinguished per submodule. Huge thanks!
When using python-envy, if you switch between multiple interpreters quickly it sometimes causes the CPU to throttle.
We've recently started trying a monorepo approach in vscode with mixed results. We're using a rye-style approach with a single venv that the packages share, each installed into it as editable.
If we use a non-multiroot workspace, i.e. just opening the monorepo at the root directory and letting the tooling work from there, most things work very well. Refactoring, import suggestions, and all of that work as well as we could hope. The primary problem we run into in this configuration is the test explorer. Some of our test modules have the same name, e.g. test_foo.py shows up in the test suites for multiple packages, and this causes pytest to complain. There's a tension, then, between (a) treating the entire body of code as one for some operations, like refactoring, and (b) treating the packages independently for others, like test discovery and running. Ideally, the test explorer would be able to handle several independent test suites, but I haven't found a way to do that yet. (And while we could rename the conflicting test modules, this seems like putting the onus on the wrong party to fix the problem.)
The other approach we've tried is using multi-root workspaces, with each of our packages added as a folder to the workspace. In this configuration, the testing problem is fixed; the test explorer treats each package's tests separately, so naming conflicts go away. Unfortunately, we lose all of the benefits of a single language server handling all of the code. Things like refactoring and import suggestions simply don't work across packages. My understanding is that much of this is due to the fact that there's a separate language server for each "folder", and they don't cooperate in any way.
Anyhow, hopefully that's another useful data point as you work on the tooling.
> If we use a non-multiroot workspace, i.e. just opening the monorepo at the root directory and letting the tooling work from there, most things work very well. Refactoring, import suggestions, and all of that work as well as we could hope. The primary problem we run into in this configuration is the test explorer.
Could you elaborate a bit on how you make this setup work? I tried the same, i.e. having a root folder that I opened in VSCode, and inside that folder I have a few folders containing different Python packages, using a single venv and editable dependencies. In my case no test cases show up whatsoever.
> Could you elaborate a bit on how you make this setup work?
The main trick is to tell pytest where to find the tests. Following the rye approach, have a top-level pyproject.toml with something like this:
```toml
[tool.pytest.ini_options]
addopts = ["--import-mode=importlib"]
testpaths = [
    "packages/app/tests",
    "packages/domain/tests",
    "packages/infrastructure/tests",
    "packages/web/tests",
]
```
Each of 'app', 'domain', 'infrastructure', and 'web' is a Python package with its own pyproject.toml and its tests in the associated 'tests' directory. This seems to tell VSCode where everything is.
Incidentally, I mentioned that my main problem with this approach was test name collision. The --import-mode=importlib bit makes this problem go away, and is the recommended way of importing tests in any event. So this setup is working very well for us.
@abingham I've found that a multi-root monorepo using poetry's relative packages + pylance retains refactoring and type checking across projects, have you tried this? see https://github.com/microsoft/pylance-release/issues/5995 for a link to a codesandbox that demonstrates this workflow.
Have you tried this? The biggest drawbacks I've found are that Pylance starts to bug out once you get more than roughly 3-5k files in the monorepo, and Pylance switching between project interpreters can be slow and sometimes bug-prone, as in the case of a Jupyter notebook discussed in the linked issue.
> @abingham I've found that a multi-root monorepo using poetry's relative packages + pylance retains refactoring and type checking across projects, have you tried this? see microsoft/pylance-release#5995 for a link to a codesandbox that demonstrates this workflow.
Could you elaborate on this? We also have a multi-root mono repo and e.g. refactoring doesn't work across folders. Did you mean this codesandbox link?
Our setup looks as follows (with each top-level directory as a VSCode multi-root folder):

```
pkg_a/
    pyproject.toml
pkg_b/
    pyproject.toml
```
We use a "relative" dependency e.g. {path = "../pkg_b", develop = true}
in pkg_a/pyproject.toml
. I've tried to have a virtualenv at the top-level or inside pkg_a
. What's your setup? Do you have some kind of top-level pyproject.toml?
Yes, that's the correct CodeSandbox link.
For each project, we use a separate pyproject.toml file, following the same relative dependency structure you mentioned. We do not have a single top-level pyproject.toml. Are you using Pylance?
In the CodeSandbox you linked, refactors will propagate across projects if you open it in VSCode, except for Jupyter notebooks.
Please note that we use Python Envy to switch the virtual environment to the one associated with the currently open file.
> For each project, we use a separate pyproject.toml file, following the same relative dependency structure you mentioned. We do not have a single top-level pyproject.toml. Are you using Pylance?
I am using Pylance. My issue was using a venv in each package directory. I guess that made VSCode switch venvs when I opened e.g. pkg_b, and that venv didn't include pkg_a, and thus cross-package stuff didn't work (e.g. I couldn't find all references of identifiers in pkg_b).
It seems that the best approach for a multi-root monorepo is to have as few virtual envs as possible (e.g. only for the "top-level" stuff, which depends on the other packages). We can't have just one, because of dependency conflicts (that's why we have multiple packages to begin with), but fewer definitely seems better.
We talked to a lot of folks at PyCon that were complaining about our support for mono repos. This issue is to track feedback about this experience. Current hypothesis is that our multiroot support isn't great, so solving that might solve the issue for mono repos.