Closed PhilipVinc closed 9 months ago
I'm curious about this. What would be needed to convince you that a local .venv is better? (I'm not saying it is)
Maybe let me say something more about my setup.
I use direnv + pyenv to set up my projects.
In short, I have an .envrc file in my project directories that specifies the version of Python I want to use in this folder, and the name of the virtual environment (which I have to specify by hand). This normally all goes to a .venv folder in the current workspace, but you can override a global env variable and set it up such that all such environments are located in a specific folder. I set it up such that it is located in ~/Documents/python_envs/{env-name}/{env-version}.
cat .envrc
layout pyenv-venv 3.11.2 netket
This allows me to cd into the workspace and direnv will activate the correct environment (and create it if needed). I then simply install dependencies with poetry, pip or another package manager.
That's where I would like to use a more advanced package manager such as rye to make this process smoother.
What I need is to keep my workspace in sync across computers with different architectures, with minimal friction.
If there is a better approach than what I am doing that marries with rye, I would be happy to know it, but I feel like there is no point in discussing whether I should connect to a machine remotely or whatnot.
By the way, about storing 'binary' blobs and installed packages not in the project folder but somewhere else, this is not terrible for isolation. This is exactly what Julia's package manager does, which is designed such that all package data is kept in a global folder (~/.julia/depot) and only the Project and Manifest files (in Python terms, only the pyproject and lock file) are kept in the working directory.
One thing I was considering was adding a way to have .venv be a symlink to a managed location. I am not sure if getting rid of .venv is a good idea in general as it means that you need to consider different setups again, which makes documentation and everything around it harder. But I do understand the general desire.
One way to square this would be to have a flag in the rye config that turns on out-of-tree virtualenvs where only a symlink is placed and the actual virtualenv is placed elsewhere.
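The out-of-tree idea above could look roughly like the following. This is only a minimal sketch, not something rye implements: the function name `create_out_of_tree_venv`, the `store_dir` argument, and the one-directory-per-project-name layout are all assumptions made for illustration.

```python
import venv
from pathlib import Path


def create_out_of_tree_venv(project_dir: Path, store_dir: Path) -> Path:
    """Create the real virtualenv under store_dir and link project_dir/.venv to it.

    Hypothetical layout: store_dir/<project directory name>.
    """
    project_dir = project_dir.resolve()
    target = store_dir.resolve() / project_dir.name
    link = project_dir / ".venv"

    # Only build the environment if it is not there yet (pyvenv.cfg marks a venv).
    if not (target / "pyvenv.cfg").exists():
        venv.create(target, with_pip=False)

    # Replace a stale symlink, but never clobber a real in-tree .venv directory.
    if link.is_symlink():
        link.unlink()
    elif link.exists():
        raise FileExistsError(f"{link} exists and is not a symlink")
    link.symlink_to(target, target_is_directory=True)
    return target
```

Tools that look for `.venv` in the project root would still find it through the symlink, while the actual files live under the store directory.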
A symlink would not be compatible with my personal setup, because I sync those folders with dropbox (which will kill the symlink). In general, I would prefer to keep the virtual environment outside of folders I sync on the internet but of course that's just my personal point of view.
Out of curiosity, what would be the added complication of having a global depot for the virtual environments, besides having to add and maintain a sort of rye activate function that activates the local virtual environment?
> because I sync those folders with dropbox (which will kill the symlink)
Does Dropbox kill symlinks? I thought they are retained.
> Out of curiosity, what would be the added complication of having a global depot for the virtual environments, beside having to add and maintain a sort of rye activate function that activates the local virtual environment?
If rye goes down the path of too many different incompatible setups, it means every user and every script interacting with a rye-managed environment needs to start learning about all of those peculiarities. I'm not a huge fan of needing virtualenvs at all, but the goal is to have a consistent setup as much as possible. I do realize though that it might be entirely impossible to avoid.
Why don't you try to get dropbox to ignore this folder?
I can't, because I have one such folder in every python project I work on (~several tens) and would have to do this on every device.
> I use direnv + pyenv to setup my projects.
> cat .envrc
> layout pyenv-venv 3.11.2 netket
Shouldn't it just be layout pyenv 3.11.2? Why pyenv-venv? In any case, rye removes the need for pyenv. FYI, as alternatives to pyenv there are asdf and rtx, which use pyenv under the hood for Python but can be used for other languages/tools.
Alternatives to rye that allow both centrally located environments and environments in the project root, and that you did not mention above, are hatch and pdm. Unlike rye, though, they don't install different Python versions for you, so you would still need something like pyenv. Unlike Poetry and Hatch, but like rye, PDM is not limited to a specific build backend; users have the freedom to choose any PEP 621 compliant build backend they prefer.
+1 for rye to allow centrally located environments (perhaps as the default) and environments in the project root.
I'm closing this. I have no desire to add support for other locations into the tool in an attempt to standardize what I believe the correct behavior is (the one true location).
Just dropped by to request this same feature, for the same reason: I keep hundreds of my Python projects in a Dropbox folder (as insurance against someone stealing my laptop before I've run git commit on them) and I use pipenv to manage virtual environments purely because I want those environments to live outside of Dropbox, so I don't end up backing up hundreds of environments to Dropbox unnecessarily.
@simonw I wonder if it would make more sense if rye had an option where it automatically sets the necessary flags on .venv to prevent syncing. E.g. com.dropbox.ignored in the case of Dropbox. Then you can leave those files where they are, but .venv won't end up replicated to Dropbox.
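As a rough illustration of what setting such a flag involves, the sketch below builds the command that would mark a folder as ignored. The helper name `dropbox_ignore_command` is hypothetical, rye has no such option today, and the commands follow what Dropbox's help pages describe for macOS and Linux as far as I know; newer macOS Dropbox installs may expect a different attribute, so treat this as an assumption to verify.

```python
import sys
from typing import List


def dropbox_ignore_command(path: str, platform: str = sys.platform) -> List[str]:
    """Return the command that sets the com.dropbox.ignored attribute on path.

    Nothing is executed here; pass the result to subprocess.run() if desired.
    """
    if platform == "darwin":
        # macOS: extended attributes are set with xattr.
        return ["xattr", "-w", "com.dropbox.ignored", "1", path]
    if platform.startswith("linux"):
        # Linux: the attr tool stores this in the "user." namespace.
        return ["attr", "-s", "com.dropbox.ignored", "-v", "1", path]
    raise NotImplementedError(f"no known command for platform {platform!r}")
```

For example, `subprocess.run(dropbox_ignore_command(".venv"), check=True)` would apply the flag to the environment in the current project.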
I would also appreciate this feature. In my setup, I have limited storage in my $HOME directory, where we usually store code, as it is on a mounted disk not managed by me, and I usually store virtual environments on the local SSD of the machine. I also tried to move .venv to the local SSD and then soft-link the real location to .venv in the project directory, but I get
error: failed to canonicalize path `/home/.../.venv/bin/python`
Caused by: No such file or directory (os error 2)
error: could not write production lockfile for project
I would also be happy to not have this feature fully supported, and to manually symlink .venv every time I sync a project for the first time, but this doesn't seem to work at the moment.
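A "failed to canonicalize path" error like the one above usually means that some component of the path cannot be followed to a real file. The sketch below is one way to check that from Python, assuming a POSIX layout (`.venv/bin/python`); the helper name `resolve_venv_interpreter` is made up for illustration and this is not how rye itself performs the check.

```python
from pathlib import Path
from typing import Optional


def resolve_venv_interpreter(project_dir: Path) -> Optional[Path]:
    """Return the fully resolved interpreter path, or None if it cannot be resolved.

    Path.resolve(strict=True) follows every symlink and fails when any
    component is missing, which is roughly what canonicalizing a path means.
    """
    interpreter = project_dir / ".venv" / "bin" / "python"
    try:
        return interpreter.resolve(strict=True)
    except FileNotFoundError:
        return None
```

If this returns `None` for a symlinked `.venv`, the link target (or the interpreter the environment points to) is missing on the current machine.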
@dedeswim how do you do this with node_modules?
Thanks for the reply.
I'm not a node user, so it's the first time I've faced this issue.
What I do with e.g., Conda, is to set CONDA_HOME to some place on the local SSD, but of course this is a bit different from what happens with rye.
I wonder how much of that type of usage is coming from the fact that people historically did it. Imagine you were never given that opportunity in the first place, would you still set up your computer this way? A lot of the performance benefits of modern tools require the virtualenv and the cache to be co-located on the same disk. Which in this case means that ~/.rye and where your code lives should be on the same mount.
While we do not leverage this much in rye today, uv already leverages this to some degree. So while there might be some flag in the future to auto-relocate some backing information of a virtualenv to a different mount, it will always come with disadvantages.
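One concrete reason co-location matters is that hard links only work within a single filesystem, so an installer that links files out of a cache has to fall back to copying when the virtualenv sits on another mount. The following is a small sketch of that general idea, not rye's or uv's actual code; both function names are made up for illustration.

```python
import os
import shutil
from pathlib import Path


def same_filesystem(a: Path, b: Path) -> bool:
    """Return True when two existing paths live on the same device."""
    return os.stat(a).st_dev == os.stat(b).st_dev


def link_or_copy(src: Path, dst: Path) -> str:
    """Hard-link src to dst when possible, otherwise fall back to a full copy."""
    try:
        os.link(src, dst)
        return "hardlink"
    except OSError:
        # Typically a cross-device error when src and dst are on different mounts.
        shutil.copy2(src, dst)
        return "copy"
```

When `same_filesystem(cache_dir, venv_dir)` is false, every file has to be copied, which costs both time and disk space.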
Unfortunately I wasn't the one setting up the computer (as it is half-managed by the university I work at), but, if it were my choice, I agree that I would set things up so that everything would be on the same disk.
The advantage of the non-local disk where I usually store code is that it is synchronised across machines and is redundantly backed up. This makes it easy to jump between machines (in case resources on one are busy to run the experiments I need), and at the same time I don't need to worry about losing my code (in case I forget to commit). Instead, on the local disk I can store "disposable" files, like a virtual environment.
Anyways, thanks for your time thinking about this, I will think of another way to use rye even without this feature because I honestly really like it!
In case it matters, I unfortunately find rye unusable for some of my setups where I have the project workdir mounted on a remote server over sshfs and then the .venv realpath moves between local and remote executions :(
It would be really nice to support this feature.
@tomerha how do you deal with this in node?
> @tomerha how do you deal with this in node?
I don't use node, just Python 😇
With Python, for now I use hatch, which works for me since the venv is external.
I understand that it works, but supporting it has a lot of downsides requiring explicit configuration to discover the venv. It means that every tool needs to start learning where that venv is.
Since the same issue must exist in node, either they have found a solution there, the issue is not pressing enough, or it's unresolved and there must be some discussion around it.
What do you mean by "every tool needs to start learning where that venv is"?
Isn't it enough for rye to know where the venv is, with rye's Python shims getting this sorted out? Or do you mean that the VIRTUAL_ENV variable must also be correctly set?
It's not enough if just rye knows where the venv is, because the goal is that your stuff "just works" across toolings. If we establish .venv as the canonical location of a virtualenv, then any tooling can just magically work for as long as the venv is in a synced state. This means editors can just rely on being able to auto-discover it.
Today a lot of the bad DX from python comes from tools accidentally falling back to global python or requiring manual configuration for which virtualenv to use resulting in a lot of extra cost to be paid everywhere.
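To make the auto-discovery argument concrete: with a fixed `.venv` location, an editor or script can find the environment without any configuration by walking up from the current file. The sketch below shows one way this could work; `find_project_venv` is a hypothetical helper, and real tools may use different rules.

```python
from pathlib import Path
from typing import Optional


def find_project_venv(start: Path) -> Optional[Path]:
    """Walk from start up to the filesystem root and return the first .venv found.

    A directory counts as a virtualenv when it contains pyvenv.cfg,
    the marker file defined by PEP 405.
    """
    start = start.resolve()
    for directory in (start, *start.parents):
        candidate = directory / ".venv"
        if (candidate / "pyvenv.cfg").is_file():
            return candidate
    return None
```

With environments stored elsewhere, the same lookup would need to know the storage location and naming scheme of whichever tool created them.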
I'm not that familiar with npm or node, but from what I understood the similar issue is with node_modules?
I didn't try myself but it looks like with node it can be manually worked around using bind mounts (see https://github.com/npm/npm/issues/7120#issuecomment-166586939).
However, this solution doesn't work with rye, since when the venv is created it tries to create a symlink to the Python interpreter, and this fails since symlinks on sshfs don't behave as expected (e.g. with -o follow_symlinks they're translated to regular file mirrors from the server side).
Rye should not manipulate owners so in theory that step is not necessary to begin with. What however might be an issue is that the same trick will run into rye’s behavior to re-create the venv. That might require further changes.
Maybe the right way forward here would be to make an issue for precisely the situation we need to solve for and to see what solutions we can come up with.
> @tomerha how do you deal with this in node?
FYI in node there is an alternative to npm, pnpm, and it solves this by creating symlinks inside myproject/node_modules to ~/.pnpm/
Personally I think creating a .venv symlink is good enough. There are plenty of sync scenarios besides Dropbox. I use the unison file synchronizer and it has the follow option that controls whether links are followed or copied as-is. That said, unison also has ignore directives, so I can just ignore .venv.
For me the main rationale for using symlinks would be to be able to nuke all forgotten venvs, to free up disk space, in a central place once in a while... and also I like that I can du -sh ./myproject and see the "true size" and not the including-venv-size.
I wish there was a way to specify a folder where rye should keep all virtual environments, and provide a simple command to allow activating that environment.
For example, this could be an env variable or configuration flag such as
and they could be stored in a directory named NAME-HHH, where NAME is the name of the project and HHH is some hash of the path, for example. (I am aware that this means that changing the path of the project would make rye create a new venv... but I don't have great alternatives in mind.)
Why?
I keep all of the clones of repositories that I work on in a Dropbox folder that is synced between my laptop and my workstation. That way I don't need to remind myself to commit my work every time I switch computers, and I have my stashes with me at all times.
However, my workstation and laptop have different architectures (Mac and Linux), so I cannot keep the virtual environments in the same folder.
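The NAME-HHH naming scheme suggested above could be computed along these lines. This is only a sketch of the proposal: the function name `central_venv_dir`, the choice of SHA-256, and the 8-character hash length are assumptions for illustration, not anything rye provides.

```python
import hashlib
from pathlib import Path


def central_venv_dir(project_dir: Path, store_dir: Path, hash_len: int = 8) -> Path:
    """Return store_dir/NAME-HHH for a project.

    NAME is the project directory name and HHH is a short hash of its
    absolute path, so two projects with the same name do not collide.
    """
    project_dir = project_dir.resolve()
    digest = hashlib.sha256(str(project_dir).encode("utf-8")).hexdigest()
    return store_dir / f"{project_dir.name}-{digest[:hash_len]}"
```

As noted in the proposal, moving the project changes the hash, so a new environment would be created and the old one would remain in the store until it is cleaned up.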