pyutils / line_profiler

Line-by-line profiling for Python
Other
2.56k stars 118 forks source link

Normalize path before comparison #261

Closed amirebrahimi closed 2 months ago

amirebrahimi commented 2 months ago

Fixes issue #152

The is_ipython_kernel_cell was failing on my local machine because my path was:

C:\Users\USER~1\AppData\Local\Temp/ipykernel_12048/3575731997.py

and os.path.join(tempfile.gettempdir(), 'ipykernel_') was returning:

C:\Users\USER~1\AppData\Local\Temp\ipykernel_

With this change paths will now be normalized.

Erotemic commented 2 months ago

I'm wondering if it makes sense to test both cases:

    norm_filename = os.path.normpath(filename)
    return (
        filename.startswith('<ipython-input-') or
        filename.startswith(os.path.join(tempfile.gettempdir(), 'ipykernel_')) or
        filename.startswith(os.path.join(tempfile.gettempdir(), 'xpython_')) or
        norm_filename.startswith('<ipython-input-') or
        norm_filename.startswith(os.path.join(tempfile.gettempdir(), 'ipykernel_')) or
        norm_filename.startswith(os.path.join(tempfile.gettempdir(), 'xpython_'))
    )

Is the patch with only the normed variant equivalent? I.e. are all of the following false?

Will the above be false on any platform? I think the answer is yes, but if you could double check these cases.

It looks like there is always a windows CI failure, not sure if its on my side or not yet.

amirebrahimi commented 2 months ago

I don't think it is necessary to check both cases.

Is the patch with only the normed variant equivalent? I.e. are all of the following false?

  • Will normpath ever mess with a string starting with <ipython-input- ?
  • Will tempfile.gettempdir() ever be a non-normed version of a path?

For the first two:

import os
os.path.normpath("<ipython-input-")
Out[3]: '<ipython-input-'
import tempfile
tempfile.gettempdir()
Out[5]: 'C:\\Users\\USER~1\\AppData\\Local\\Temp'
os.path.normpath(tempfile.gettempdir())
Out[6]: 'C:\\Users\\USER~1\\AppData\\Local\\Temp'

For the latter, the docs will take a directory from an environment variable, so I checked whether tempfile.gettempdir will normalize the path separators even if I set something different, which it does.

(line-profiler-bug-py3.9) C:\Users\USER\dev\line_profiler_bug\line_profiler>set TMPDIR=C:/tmpdir

(line-profiler-bug-py3.9) C:\Users\USER\dev\line_profiler_bug\line_profiler>echo %TMPDIR%        
C:/tmpdir

(line-profiler-bug-py3.9) C:\Users\USER\dev\line_profiler_bug\line_profiler>python
Python 3.9.12 (tags/v3.9.12:b28265d, Mar 23 2022, 23:52:46) [MSC v.1929 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> import tempfile
>>> tempfile.gettempdir()
'C:\\tmpdir'
>>> os.path.normpath(tempfile.gettempdir())
'C:\\tmpdir'
>>>
  • Will os.path.join ever return a pattern that normpath might destroy?

The docs for os.path.normpath mention that "string manipulation may change the meaning of a path that contains symbolic links". So, I did a test with a file, world, that symlinks to another file hello. On my machine it does not resolve the symbolic link. So, I don't think this is an issue.

(line-profiler-bug-py3.9) C:\Users\USER\dev\line_profiler_bug>type hello
hello world

(line-profiler-bug-py3.9) C:\Users\USER\dev\line_profiler_bug>type world
hello world

(line-profiler-bug-py3.9) C:\Users\USER\dev\line_profiler_bug>python
Python 3.9.12 (tags/v3.9.12:b28265d, Mar 23 2022, 23:52:46) [MSC v.1929 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.path.normpath("./hello")
'hello'
>>> os.path.normpath("./world") 
'world'
>>> os.getcwd()
'C:\\Users\\USER\\dev\\line_profiler_bug'
>>> os.path.join(os.getcwd(), "hello")
'C:\\Users\\USER\\dev\\line_profiler_bug\\hello'
>>> os.path.join(os.getcwd(), "world") 
'C:\\Users\\USER\\dev\\line_profiler_bug\\world'
>>> os.path.normpath(os.path.join(os.getcwd(), "world"))
'C:\\Users\\USER\\dev\\line_profiler_bug\\world'
>>>

Will the above be false on any platform? I think the answer is yes, but if you could double check these cases. I have Ubuntu 22.04.2 LTS installed on WSL2, so I tried the same examples:

Python 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.path.normpath("<ipython-input-")
'<ipython-input-'
>>> import tempfile
>>> tempfile.gettempdir()
'/tmp'
>>> os.path.normpath(tempfile.gettempdir())
'/tmp'
>>>

and

USER@P1-PF3HX471:~$ echo "hello world" > hello
USER@P1-PF3HX471:~$ ln -s hello world
USER@P1-PF3HX471:~$ cat hello
hello world
USER@P1-PF3HX471:~$ cat world
hello world
USER@P1-PF3HX471:~$ python3
>>> import os
>>> os.path.normpath("./hello")
'hello'
>>> os.path.normpath("./world")
'world'
>>> os.getcwd()
'/home/USER
>>> os.path.join(os.getcwd(), "hello")
'/home/USER/hello'
>>> os.path.join(os.getcwd(), "world")
'/home/USER/world'
>>> os.path.normpath(os.path.join(os.getcwd(), "hello"))
'/home/USER/hello'

So, the behavior looks to be the same.

Since the fix is mainly to normalize path separators, if you prefer, we can use os.path.normcase instead of os.path.normpath.

Erotemic commented 2 months ago

Thank you for the analysis (and the PR!). Normcase seems safer to me. I've been burned by normpath corner cases in the past. My intuition is that either is probably fine, but I err on the side of caution when making changes to this library given that it's widely used.

It looks like the CI error is happening on my latest branch as well. I'll want to get that fixed before merging in anything, but overall switching to normcase looks good to me (if you want to accelerate merger of this branch, any help with debugging the CI error would be appreciated, but if not I will get to it eventually).

amirebrahimi commented 2 months ago

Thank you for the analysis (and the PR!). Normcase seems safer to me. I've been burned by normpath corner cases in the past. My intuition is that either is probably fine, but I err on the side of caution when making changes to this library given that it's widely used.

It looks like the CI error is happening on my latest branch as well. I'll want to get that fixed before merging in anything, but overall switching to normcase looks good to me (if you want to accelerate merger of this branch, any help with debugging the CI error would be appreciated, but if not I will get to it eventually).

Looks like this is fixed in a later version: https://github.com/pypa/cibuildwheel/issues/1740

amirebrahimi commented 2 months ago

Thoughts, @Erotemic? I think if you approve the workflow change that it will pass the build.

Erotemic commented 2 months ago

Looks like CI needs fixes. Almost certainly they are unreleated to this, but I'm reluctant to merge until I see a green dashboard.

amirebrahimi commented 2 months ago

Looks like a few more bugs were fixed with cibuildwheel, so I updated the version. Can you run the workflow again, @Erotemic? image

amirebrahimi commented 2 months ago

Okay, so looks like this issue is occurring in other PRs, too: https://github.com/pyutils/line_profiler/pull/256 Not sure what to do to fix it atm.

Erotemic commented 2 months ago

I will eventually look into this if nobody else does. This is one of many open source projects I maintain, so I have less time to delve into details than I would like.

Perhaps I can summon @joerick and ask if this pytest error is familiar?

  ________________________ ERROR collecting test session _________________________
  /private/var/folders/c9/jqpw9nhs7jj7vm5185nyw05h0000gn/T/cibw-run-ntpjv04u/cp38-macosx_x86_64/venv-test-x86_64/lib/python3.8/site-packages/pluggy/_hooks.py:513: in __call__
      return self._hookexec(self.name, self._hookimpls.copy(), kwargs, firstresult)
  /private/var/folders/c9/jqpw9nhs7jj7vm5185nyw05h0000gn/T/cibw-run-ntpjv04u/cp38-macosx_x86_64/venv-test-x86_64/lib/python3.8/site-packages/pluggy/_manager.py:120: in _hookexec
      return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
  /private/var/folders/c9/jqpw9nhs7jj7vm5185nyw05h0000gn/T/cibw-run-ntpjv04u/cp38-macosx_x86_64/venv-test-x86_64/lib/python3.8/site-packages/_pytest/python.py:212: in pytest_collect_directory
      if pkginit.is_file():
  /Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/pathlib.py:1439: in is_file
      return S_ISREG(self.stat().st_mode)
  /Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/pathlib.py:1198: in stat
      return self._accessor.stat(self)
  E   PermissionError: [Errno 13] Permission denied: '/private/var/agentx/__init__.py'
  =========================== short test summary info ============================
  ERROR ../::private::var - PermissionError: [Errno 13] Permission denied: '/private/var/agentx/__init__.py'
  =============================== 1 error in 0.17s ===============================
  Restoring cwd = '/private/var/folders/c9/jqpw9nhs7jj7vm5185nyw05h0000gn/T/cibw-run-ntpjv04u/cp38-macosx_x86_64/test_cwd'
joerick commented 2 months ago

I don't recognise it, sorry!

Erotemic commented 2 months ago

I've started to look into the CI issue. I think it is an issue in pytest, which I've noted here: https://github.com/pytest-dev/pytest/issues/11904

Erotemic commented 2 months ago

I think I fixed (worked around) the issue by pinning pytest to 7.4.4 in cibuildwheel tests. The tests outside of cibuildwheel still use the latest pytest, but they don't seem to cause issues outside of cibuildwheel.

If you rebase on main the tests will likely pass.

Erotemic commented 2 months ago

I attempted to rebase this branch, but somehow I ended up clobbering it and causing it to close. I opened a new PR https://github.com/pyutils/line_profiler/pull/264 with the correct rebased version