Allow custom location for .venv

PhilipVinc commented 1 year ago

I wish there was a way to specify a folder where rye should keep all virtual environments, and provide a simple command to allow activating that environment.

For example, this could be an env variable or configuration flag such as

RYE_VIRTUALENV_DEPOT="~/Documents/rye-envs"

and they could be stored in a directory named NAME-HHH, where NAME is the name of the project and HHH is some hash of the path, for example. (I am aware that this means that changing the path of the project would make rye create a new venv... but I don't have great alternatives in mind).
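A minimal sketch of the NAME-HHH naming scheme described above, not anything rye implements. The function name `venv_dir_for`, the choice of SHA-256, and the 8-character digest length are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def venv_dir_for(project: Path, depot: Path, hash_len: int = 8) -> Path:
    """Return depot/NAME-HHH for a project directory (hypothetical helper).

    NAME is the project folder name; HHH is a short SHA-256 digest of the
    project's absolute path, so two projects with the same folder name
    map to different environment directories.
    """
    project = project.expanduser().resolve()
    digest = hashlib.sha256(str(project).encode("utf-8")).hexdigest()[:hash_len]
    return depot.expanduser() / f"{project.name}-{digest}"
```

As noted above, the hash depends on the project path, so moving the project yields a different directory and the old environment is left behind in the depot.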

Why?

I keep all of my clones of repositories that I work on in a Dropbox folder that is synced between my laptop and my workstation. That way I don't need to remind myself to commit my work every time I switch computers, and I have my stashes with me at all times.

However, my workstation and laptop have different architectures (Mac and linux) so I cannot keep the virtual environments in the same folder.

cnpryer commented 1 year ago

I'm curious about this. What would be needed to convince you that local .venv is better? (I'm not saying it is)

PhilipVinc commented 1 year ago

Maybe let me say something more about my setup. I use direnv + pyenv to set up my projects. In short, I have an .envrc file in my project directories that specifies the version of Python I want to use in this folder, and the name of the virtual environment (which I have to specify by hand). This normally all goes to a .venv folder in the current workspace, but you can override a global env variable so that all such environments are located in a specific folder. I set it up so that they are located in ~/Documents/python_envs/{env-name}/{env-version}.

cat .envrc
layout pyenv-venv 3.11.2 netket

This allows me to cd into the workspace and direnv will activate the correct environment (and create it if needed). I then simply install dependencies with poetry, pip or another package manager. That's where I would like to use a more advanced package manager such as rye to make this process smoother.

What I need is to keep my workspace in sync across computers with different architectures, with minimal friction.
If there is a better approach than what I am doing that fits with rye, I would be happy to know it, but I feel like there is no point in discussing whether I should connect to a machine remotely or whatnot.

By the way, about storing 'binary' blobs and installed packages somewhere other than the project folder: this is not terrible for isolation. This is exactly what Julia's package manager does, which is designed such that all package data is kept in a global folder (~/.julia/depot) and only the Project and Manifest files (in Python terms, only the pyproject and lock file) are kept in the working directory.

mitsuhiko commented 1 year ago

One thing I was considering was adding a way to have .venv be a symlink to a managed location. I am not sure if getting rid of .venv is a good idea in general, as it means that you need to consider different setups again, which makes documentation and everything around it harder. But I do understand the general desire.

One way to square this would be to have a flag in the rye config that turns on out-of-tree virtualenvs where only a symlink is placed and the actual virtualenv is placed elsewhere.
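A rough sketch of what such an out-of-tree layout could look like, using only the standard library `venv` module. This is not rye's behavior or any existing rye flag; the function name `create_out_of_tree_venv` and the idea of passing the target location explicitly are assumptions for illustration. A later comment in this thread reports that rye failed with a manually symlinked .venv at the time.

```python
import os
import venv
from pathlib import Path


def create_out_of_tree_venv(project: Path, target: Path) -> Path:
    """Create a virtualenv at `target` and link `project/.venv` to it.

    Hypothetical helper: the real environment lives outside the project
    tree and only a symlink named .venv is placed inside it.
    """
    link = project / ".venv"
    if link.exists() or link.is_symlink():
        raise FileExistsError(f"{link} already exists")
    target.parent.mkdir(parents=True, exist_ok=True)
    # Standard library venv; pip is left out to keep the sketch fast.
    venv.create(target, with_pip=False)
    # On Windows, creating symlinks may require elevated privileges.
    os.symlink(target, link, target_is_directory=True)
    return link
```

Tools that only look for `.venv` in the project root would still find the environment through the link, which is the property the comment above is trying to keep.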

PhilipVinc commented 1 year ago

A symlink would not be compatible with my personal setup, because I sync those folders with dropbox (which will kill the symlink). In general, I would prefer to keep the virtual environment outside of folders I sync on the internet but of course that's just my personal point of view.

Out of curiosity, what would be the added complication of having a global depot for the virtual environments, beside having to add and maintain a sort of rye activate function that activates the local virtual environment ?

mitsuhiko commented 1 year ago

> because I sync those folders with dropbox (which will kill the symlink)

Does Dropbox kill symlinks? I thought they are retained.

> Out of curiosity, what would be the added complication of having a global depot for the virtual environments, beside having to add and maintain a sort of rye activate function that activates the local virtual environment?

If rye goes down the path of too many different incompatible setups, it means every user and script interacting with a rye-managed environment needs to start learning about all of those peculiarities. I'm not a huge fan of needing virtualenvs at all, but the goal is to have a consistent setup as much as possible. I do realize though that it might be entirely impossible to avoid.

Zander-1024 commented 1 year ago

Why don't you try to get dropbox to ignore this folder?

PhilipVinc commented 1 year ago

I can't, because I have one such folder in every python project I work on (~several tens) and would have to do this on every device.

doolio commented 1 year ago

> I use direnv + pyenv to setup my projects.
> cat .envrc
> layout pyenv-venv 3.11.2 netket

Shouldn't it just be layout pyenv 3.11.2? Why pyenv-venv? In any case, rye removes the need for pyenv. FYI, as an alternative to pyenv there is asdf or rtx, which use pyenv under the hood for Python but can be used for other languages/tools.

Alternatives to rye that allow both centrally located environments and environments in the project root, and that you did not mention above, are hatch and pdm. Unlike rye, though, they don't install different Python versions for you, so you would still need something like pyenv. Unlike Poetry and Hatch, but like rye, PDM is not limited to a specific build backend; users have the freedom to choose any PEP 621 compliant build backend they prefer.

+1 for rye to allow centrally located environments (perhaps as the default) and environments in the project root.

mitsuhiko commented 9 months ago

I'm closing this. I have no desire to add support for other locations into the tool in an attempt to standardize what I believe the correct behavior is (the one true location).

simonw commented 9 months ago

Just dropped by to request this same feature, for the same reason: I keep hundreds of my Python projects in a Dropbox folder (as insurance against someone stealing my laptop before I've run git commit on them) and I use pipenv to manage virtual environments purely because I want those environments to live outside of Dropbox, so I don't end up backing up hundreds of environments to Dropbox unnecessarily.

mitsuhiko commented 9 months ago

@simonw I wonder if it would make more sense if rye had an option where it automatically sets the necessary flags on .venv to prevent syncing, e.g. com.dropbox.ignored in the case of Dropbox. Then you can leave those files where they are, but .venv won't end up replicated to Dropbox.
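A hedged sketch of setting that attribute by hand from Python. The helper names are hypothetical, and the command forms (`xattr -w` on macOS, `attr -s` on Linux) are assumptions based on Dropbox's help pages for ignoring files; newer macOS Dropbox installs may need a different attribute name, so check the current Dropbox documentation before relying on this.

```python
import subprocess
import sys
from pathlib import Path


def dropbox_ignore_command(path: Path, platform: str = sys.platform) -> list[str]:
    """Build the command that sets Dropbox's ignore attribute on `path`.

    Hypothetical helper; the command forms are assumptions to verify
    against Dropbox's own documentation.
    """
    if platform == "darwin":
        return ["xattr", "-w", "com.dropbox.ignored", "1", str(path)]
    if platform.startswith("linux"):
        return ["attr", "-s", "com.dropbox.ignored", "-v", "1", str(path)]
    raise NotImplementedError(f"no ignore command known for {platform!r}")


def ignore_venv(project: Path) -> None:
    """Mark project/.venv as ignored (needs xattr or attr on PATH)."""
    subprocess.run(dropbox_ignore_command(project / ".venv"), check=True)
```

Because the attribute is set on each machine's local copy, this would have to run once per project per device, which is the per-device cost raised earlier in the thread.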

dedeswim commented 8 months ago

I would also appreciate this feature. For my set-up, I have limited storage in my $HOME directory, where we usually store code, as it is on a mounted disk not managed by me, and I usually store virtual environments on the local SSD of the machine. I also tried to move .venv on the local SSD and then soft-link the real location to .venv in the project directory, but I get

error: failed to canonicalize path `/home/.../.venv/bin/python`
  Caused by: No such file or directory (os error 2)
error: could not write production lockfile for project

I would also be happy to not have this feature fully supported, and to manually symlink .venv every time I sync a project for the first time, but this doesn't seem to work at the moment.

mitsuhiko commented 8 months ago

@dedeswim how do you do this with node_modules?

dedeswim commented 8 months ago

Thanks for the reply.

I'm not a node user, so it's the first time I've faced this issue.

What I do with e.g., Conda, is to set CONDA_HOME to some place on the local SSD, but of course this is a bit different from what happens with rye.

mitsuhiko commented 8 months ago

I wonder how much of that type of usage is coming from the fact that people historically did it. Imagine you were never given that opportunity in the first place, would you still set up your computer this way? A lot of the performance benefits of modern tools require the virtualenv and the cache to be co-located on the same disk. Which in this case means that ~/.rye and where your code lives should be on the same mount.

While we do not leverage this much in rye today, uv already leverages this to some degree. So while there might be some flag in the future to auto-relocate some backing information of a virtualenv to a different mount, it will always come with disadvantages.

dedeswim commented 8 months ago

Unfortunately I wasn't the one setting up the computer (as it is half-managed by the university I work at), but, if it were my choice, I agree that I would set things up so that everything would be on the same disk.

The advantage of the non-local disk where I usually store code is that it is synchronised across machines and is redundantly backed up. This makes it easy to jump between machines (in case resources on one are busy to run the experiments I need), and at the same time I don't need to worry about losing my code (in case I forget to commit). Instead, on the local disk I can store "disposable" files, like a virtual environment.

Anyways, thanks for your time thinking about this, I will think of another way to use rye even without this feature because I honestly really like it!

tomerha commented 6 months ago

In case it matters, I unfortunately find rye unusable for some of my setups where I have the project workdir mounted on a remote server over sshfs and then the .venv realpath moves between local and remote executions :(

It would be really nice to support this feature.

mitsuhiko commented 6 months ago

@tomerha how do you deal with this in node?

tomerha commented 6 months ago

> @tomerha how do you deal with this in node?

I don't use node, just Python 😇

With Python, for now I use hatch which works for me since the venv is external

mitsuhiko commented 6 months ago

I understand that it works, but supporting it has a lot of downsides, since it requires explicit configuration to discover the venv. It means that every tool needs to start learning where that venv is.

Since the same issue must exist in node they either must have found a solution there, the issue is not pressing enough or it’s unresolved and there must be some discussion around it.

PhilipVinc commented 6 months ago

What do you mean by "every tool needs to start learning where that venv is"?

Isn't it enough for rye to know where the venv is, with rye's Python shims sorting out the rest? Or do you mean that the VIRTUAL_ENV variable must also be correctly set?

mitsuhiko commented 6 months ago

It's not enough if just rye knows where the venv is, because the goal is that your stuff "just works" across tooling. If we establish .venv as the canonical location of a virtualenv, then any tooling can just magically work for as long as the venv is in a synced state. This means editors can just rely on being able to auto-discover it.

Today a lot of the bad DX from python comes from tools accidentally falling back to global python or requiring manual configuration for which virtualenv to use resulting in a lot of extra cost to be paid everywhere.
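To illustrate why a canonical location makes discovery cheap, here is a minimal sketch of how an editor or script might locate the environment with no configuration at all. It is not taken from rye or from any particular editor; `find_project_venv` is a hypothetical name, and checking for `pyvenv.cfg` is one assumed way of recognizing a virtualenv.

```python
from pathlib import Path
from typing import Optional


def find_project_venv(start: Path) -> Optional[Path]:
    """Walk up from `start` and return the first `.venv` that looks like
    a virtualenv (contains pyvenv.cfg), or None if there is none."""
    start = start.resolve()
    for directory in (start, *start.parents):
        candidate = directory / ".venv"
        if (candidate / "pyvenv.cfg").is_file():
            return candidate
    return None
```

With environments stored in an external depot, the same tool would instead need to know the depot location and the naming scheme, which is the extra configuration being argued against here.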

tomerha commented 6 months ago

I'm not that familiar with npm or node, but from what I understood the similar issue is with node_modules?

I didn't try myself but it looks like with node it can be manually worked around using bind mounts (see https://github.com/npm/npm/issues/7120#issuecomment-166586939).

However, this solution doesn't work with rye, since when the venv is created it tries to create a symlink to the Python interpreter, and this fails because symlinks on sshfs don't behave as expected (e.g. with -o follow_symlinks they're translated to regular file mirrors from the server side).

mitsuhiko commented 6 months ago

Rye should not manipulate owners so in theory that step is not necessary to begin with. What however might be an issue is that the same trick will run into rye’s behavior to re-create the venv. That might require further changes.

Maybe the right way forward here would be to make an issue for precisely the situation we need to solve for and to see what solutions we can come up with.

phromo commented 5 months ago

> @tomerha how do you deal with this in node?

FYI, in node there is an alternative to npm, pnpm, and it solves this by creating symlinks inside myproject/node_modules to ~/.pnpm/

Personally I think creating a .venv symlink is good enough. There are plenty of sync scenarios besides dropbox. I use the unison file synchronizer and it has the follow option that controls whether links are followed or copied as-is. That said, unison also has ignore directives, so I can just ignore .venv.

For me the main rationale for using symlinks would be to be able to nuke all forgotten venvs, to free up disk space, in a central place once in a while... and also I like that I can du -sh ./myproject and see the "true size" and not the including-venv-size.

astral-sh / rye

Allow custom location for .venv #371