Open pelson opened 1 month ago
It seems to me there are two separate use cases related to symlinks (as well as #2684, which is related to canonicalisation of filenames, rather than symlinks).
In #2682, the use case is running /foo/bar/bin/python virtualenv.py env_name, where /foo/bar/bin/python is a symlink to /base_python/bin/python (in particular, note that there's no Python environment in /foo/bar, there's just the symlink in bin). In that case, you have to resolve the symlink to find the actual base environment, where the lib directory and other parts of the environment are situated.
In the case here, though, you are running /foo/bar/bin/python virtualenv.py env_name, where /foo/bar is itself a symlink to a full Python environment. In that case, the environment can be referenced either by its base name, or via the symlink /foo/bar/{bin,lib,...}. For your use case, you want the symlink to be retained, to allow portability when the symlink is constant, but the linked-to environment changes.
To be honest, I don't think there's a clear "one size fits all" solution here. We could say that if the python executable itself is a symlink, then resolve it, but if a component is a symlink, then don't. But that seems fairly hacky, and prone to errors if there are situations we haven't considered.
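The rule floated above (resolve the executable if it is itself a symlink, but leave symlinked parent directories alone) can be illustrated with a minimal sketch. This is not what virtualenv or venv actually implements; the function names and the temporary directory layout are hypothetical and exist only to contrast the rule with os.path.realpath, which resolves every path component.

```python
import os
import tempfile


def resolve_executable_only(path):
    """Follow symlinks on the final path component only (hypothetical helper).

    Parent directories that are symlinks are kept as written, unlike
    os.path.realpath, which resolves every component of the path.
    """
    path = os.path.abspath(path)
    seen = set()
    while os.path.islink(path):
        if path in seen:
            raise OSError(f"symlink loop at {path}")
        seen.add(path)
        target = os.readlink(path)
        # A relative target is relative to the directory holding the link.
        # normpath collapses ".." textually, which is a simplification.
        path = os.path.normpath(os.path.join(os.path.dirname(path), target))
    return path


def build_demo(root):
    """Create both layouts from the discussion under root (made-up names)."""
    base = os.path.join(root, "base_python")
    os.makedirs(os.path.join(base, "bin"))
    base_exe = os.path.join(base, "bin", "python")
    open(base_exe, "w").close()  # stand-in file, not a real interpreter

    # Case 1: only the executable is a symlink (the #2682 layout).
    exe_link = os.path.join(root, "exe_only", "bin", "python")
    os.makedirs(os.path.dirname(exe_link))
    os.symlink(base_exe, exe_link)

    # Case 2: a parent directory is a symlink to the whole environment.
    os.symlink(base, os.path.join(root, "dir_link"))
    via_dir_link = os.path.join(root, "dir_link", "bin", "python")
    return base_exe, exe_link, via_dir_link


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        base_exe, exe_link, via_dir_link = build_demo(root)
        cases = [("executable symlink", exe_link), ("directory symlink", via_dir_link)]
        for label, path in cases:
            print(label)
            print("  final component only:", resolve_executable_only(path))
            print("  os.path.realpath:    ", os.path.realpath(path))
```

In the first case both approaches end up at the base executable; in the second, only os.path.realpath rewrites the path, which is the behaviour the symlinked-prefix use case wants to avoid.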
My personal experience with symlinks is on Windows, where if an exe loads a DLL saved alongside it, a symlink won't run as the OS doesn't follow the symlink to find the DLL. So by analogy with that, I'd have been inclined to say that #2682 wasn't a bug, but was simply a consequence of how symlinks work when trying to find dependent files relative to the executable. But I'm clearly in a minority here, as #2682 was accepted and fixed.
I don't have a good answer here, except to say that we should probably follow the behaviour of the core venv module, simply because in the absence of an obviously correct rule, consistency is key. Then, if anyone wants a change to the rules, they should start with venv rather than here.
I don't have a good answer here, except to say that we should probably follow the behaviour of the core venv module, simply because in the absence of an obviously correct rule, consistency is key. Then, if anyone wants a change to the rules, they should start with venv rather than here.
This is interesting, I thought it was the other way around and we do novel stuff here which venv may or may not choose to incorporate. One such example is virtualenv being way faster.
I'd say that behaviour should be consistent, but quality of life details (like performance, additional shell integrations) are what distinguishes virtualenv. So venv defines the base mechanism, we make it more user friendly.
I thought it was the other way around and we do novel stuff here which venv may or may not choose to incorporate
I thought the same.
I don't have a good answer here, except to say that we should probably follow the behaviour of the core venv module, simply because in the absence of an obviously correct rule, consistency is key. Then, if anyone wants a change to the rules, they should start with venv rather than here.
Issues were filed in both cpython & virtualenv (with a link to the cpython issue). While the cpython issue is still open, the virtualenv one was "fixed", introducing a regression in other use-cases.
If the aim of virtualenv is to be bug-for-bug compatible with venv (i.e. consistent behavior), which might differ depending on the python minor version or the python implementation, then so be it. In that case, the bug report template in virtualenv should probably be updated to reflect just that.
To be honest, I don't think there's a clear "one size fits all" solution here. We could say that if the python executable itself is a symlink, then resolve it, but if a component is a symlink, then don't. But that seems fairly hacky, and prone to errors if there are situations we haven't considered.
As of now, there are 3 identified use-cases related to symlinks that might be addressed by the hack mentioned (I don't know enough about the Windows one to be sure nothing else would be required).
IMHO, living with the status quo of one or the other being broken does not seem beneficial to the ecosystem as a whole. At least those 3 are identified and can be unit-tested (but, per consistent behavior with venv, this might have to wait on the cpython issue being addressed first).
Issue
When creating a virtual environment, the base prefix of the virtual environment should be the same as the base prefix of the Python that is creating the virtual environment - there should be no additional resolve steps with regard to symlinks.
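One way to inspect what a created environment actually recorded is to read the home key from its pyvenv.cfg, which is the directory of the base interpreter. The sketch below is a hypothetical helper pair, not part of virtualenv or venv; it deliberately does no symlink resolution, so it reports whether the recorded path still goes through a given prefix as written.

```python
import os
import tempfile


def read_pyvenv_cfg(venv_dir):
    """Parse the 'key = value' lines of <venv_dir>/pyvenv.cfg into a dict."""
    values = {}
    with open(os.path.join(venv_dir, "pyvenv.cfg"), encoding="utf-8") as handle:
        for line in handle:
            key, sep, value = line.partition("=")
            if sep:
                values[key.strip().lower()] = value.strip()
    return values


def home_is_under(venv_dir, prefix):
    """Return True if the recorded home lies under prefix, compared as text.

    Neither path is resolved, so a symlinked prefix only matches when the
    environment recorded the symlinked form of the path.
    """
    home = os.path.normpath(read_pyvenv_cfg(venv_dir)["home"])
    prefix = os.path.normpath(prefix)
    return os.path.commonpath([home, prefix]) == prefix


if __name__ == "__main__":
    # Fake environment with made-up paths, purely for illustration.
    with tempfile.TemporaryDirectory() as venv_dir:
        with open(os.path.join(venv_dir, "pyvenv.cfg"), "w", encoding="utf-8") as cfg:
            cfg.write("home = /project/my-project/python/bin\n")
        print(home_is_under(venv_dir, "/project/my-project"))
        print(home_is_under(venv_dir, "/nfs/some-machine"))
```

Comparing the paths as text is a deliberate choice here: resolving either side would hide exactly the difference this issue is about.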
This behaviour is seen in venv, and was implemented as expected until https://github.com/pypa/virtualenv/pull/2686, where symlinks were resolved.
Why is this important?
In big (scientific) institutions, it is common to have a network-mounted filesystem which can be accessed from all managed machines. This could be to mount a homespace, or to mount some data, etc. (I've seen both). To scale this up, it is necessary to have multiple filesystem servers all using the same underlying storage. Machines are then clustered to point to different servers, but the user doesn't know which server a machine is talking to. When combining this with something such as autofs, you find that the server being contacted is in the path (e.g. /nfs/some-machine/), and to smooth this out across machines, the managed machines get a canonical symlink (e.g. /project/{my-project}, which symlinks to the specific machine mountpoint). Essentially:
With this setup, you can create a virtual environment in /project/my-project which works on both machines IFF the symlink is not resolved. This is the behaviour of CPython, and also the behaviour of venv and virtualenv<20.25.1.
Reproducer
Written in the form of a pytest (with validation against venv also):
The result is a pass in 20.25.0 and a fail since:
It is worth noting that the implementation of https://github.com/pypa/virtualenv/pull/2686 has a bug in the fact that the symlink at dest_venv / 'bin' / 'python' points to sys.base_prefix / 'bin' / 'python', and not the resolved symlink location of sys.base_prefix. Therefore the home value is inconsistent with the symlink that is created by virtualenv. The test validates this.
Implications
There are two issues which https://github.com/pypa/virtualenv/pull/2686 closed:
To be honest, I'm not sure what https://github.com/pypa/virtualenv/issues/2682 is asking for. Perhaps it is a request for behaviour that is different from venv and also the std library (wrt. sys.base_prefix) @mayeut.
For https://github.com/pypa/virtualenv/issues/2684, I also don't fully understand the reason for this being a virtualenv issue (this is my problem, not a problem with the issue itself) - and somebody who knows what the correct behaviour should be would need to chime in (perhaps @ofek?).