nteract / jupyter-paths

:city_sunrise: Pure JavaScript implementation of jupyter --paths --json
BSD 3-Clause "New" or "Revised" License
9 stars 11 forks source link

dataDirs() and configDirs() return incorrect data in virtualenvs with symlinked executables #32

Closed nils-werner closed 3 years ago

nils-werner commented 4 years ago

I believe guessSysPrefix() causes an issue in Hydrogen in virtualenvs where python is a symlink.

If you create two virtualenvs, one with symlinked and one with copied executables:

virtualenv --symlinks /path/to/venv_links
virtualenv --copies /path/to/venv_copies

and double-check the executables

ln -alh /path/to/venv_links/bin/python  # <--- this should be a symlink
ln -alh /path/to/venv_copies/bin/python

And then run

var jp = require('jupyter-paths')

console.log(jp.dataDirs());
console.log(jp.configDirs());
console.log(jp.runtimeDir());

in the two respective virtualenvs you will see two different results.

Two paths are different or missing in the linked virtualenv

source /path/to/venv_links/bin/activate
jupyter --paths
node test.js

which yields

config:
    /home/user/.jupyter
    /path/to/venv_links/etc/jupyter     # <---- this line is different in the JS output below
    /usr/local/etc/jupyter
    /etc/jupyter
data:
    /home/user/.local/share/jupyter
    /path/to/venv_links/share/jupyter   # <---- this line is missing in the JS output below
    /usr/local/share/jupyter
    /usr/share/jupyter
runtime:
    /home/user/.local/share/jupyter/runtime

and

[
  '/home/user/.jupyter',
  '/usr/etc/jupyter',                               # <---- this line is incorrect
  '/usr/local/etc/jupyter',
  '/etc/jupyter'
]
[
  '/home/user/.local/share/jupyter',
  '/usr/local/share/jupyter',                       # <---- there is a line missing here
  '/usr/share/jupyter'
]
/run/user/1000/jupyter

While the lines are correct in the copied virtualenv

source /path/to/venv_copies/bin/activate
jupyter --paths
node test.js

which yields

config:
    /home/user/.jupyter
    /path/to/venv_copies/etc/jupyter
    /usr/local/etc/jupyter
    /etc/jupyter
data:
    /home/user/.local/share/jupyter
    /path/to/venv_copies/share/jupyter     # <---- this line is important for kernel discovery
    /usr/local/share/jupyter
    /usr/share/jupyter
runtime:
    /home/user/.local/share/jupyter/runtime

and

[
  '/home/user/.jupyter',
  '/path/to/venv_copies/etc/jupyter',
  '/usr/local/etc/jupyter',
  '/etc/jupyter'
]
[
  '/home/user/.local/share/jupyter',
  '/path/to/venv_copies/share/jupyter',     # <---- this line is important for kernel discovery
  '/usr/local/share/jupyter',
  '/usr/share/jupyter'
]
/run/user/1000/jupyter

I believe this behaviour due to fs.realpathSync() in this line

sysPrefixGuess = path.dirname(path.dirname(fs.realpathSync(exe)));

It "escapes" the virtualenv by following the symlink to /usr/bin/python and then searching for data and config dirs in /usr and /.

captainsafia commented 4 years ago

Thanks for taking the time to research this, @nils-werner!

What do you think is a good approach for solving this? It seems like the catch-all else statement is probably working against us here.

Alternatively, the guessSysPrefix function has been giving us a lot of trouble in the nteract desktop app so we might consider replacing this part of our code.

nils-werner commented 4 years ago

I don't have too deep an insight into the remainder of the code, or how other OSes besides Linux behave here, but my first guess is that

 fs.realpathSync(exe)

is not really needed at all. To my understanding, this function does two things:

  1. it resolves . and .. to return an absolute path
  2. it resolves symlinks

My thought is: We probably need the first, but do we need the second? In some cases there are multiple interpreters with their version names present, where most of them are symlinks:

*python
*python3 -> python
*python3.6 -> python

In this case, resolving the symlink is not really necessary. These files are all in the same directory, and we're interested in the path.dirname() of the executable anyways. Besides that I cannot think of another scenario in which symlink resolution is necessary.

So if we allow to get rid of that, we could replace

fs.realpathSync(exe)

with

path.resolve(exe)

which only does ., .. resolution, but no symlink following.

captainsafia commented 4 years ago

Thanks for posting the explainer. I'm fine with replacing path.resolve over fs.realpathSync. The use of realPath has been in the codebase for around four years and I'm not sure if there is strong reasoning for it. @rgbkrk might be able to chime in here.

Absent that, we can cut a PR with this change then validate it locally on the nteract desktop app and the Atom extension.

Would you be interested in submitting a PR with this change so we can start testing it?

rgbkrk commented 4 years ago

No idea why the code was using realpath instead of resolve. I'll go check out your PR @nils-werner!

aminya commented 3 years ago

Closing this as it seems it is fixed.