python / cpython

The Python programming language
https://www.python.org
Other
63.37k stars 30.33k forks source link

Unexpected exception with zip importer #89346

Closed ronaldoussoren closed 3 years ago

ronaldoussoren commented 3 years ago
BPO 45183
Nosy @Yhg1s, @brettcannon, @ronaldoussoren, @pablogsal, @miss-islington
PRs
  • python/cpython#28435
  • python/cpython#28437
  • python/cpython#28438
  • Files
  • repro.py
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields: ```python assignee = 'https://github.com/brettcannon' closed_at = created_at = labels = ['type-bug', '3.10', '3.11'] title = 'Unexpected exception with zip importer' updated_at = user = 'https://github.com/ronaldoussoren' ``` bugs.python.org fields: ```python activity = actor = 'brett.cannon' assignee = 'brett.cannon' closed = True closed_date = closer = 'brett.cannon' components = [] creation = creator = 'ronaldoussoren' dependencies = [] files = ['50279'] hgrepos = [] issue_num = 45183 keywords = ['patch', '3.10regression'] message_count = 8.0 messages = ['401698', '401793', '401794', '401935', '402109', '402111', '402114', '402115'] nosy_count = 5.0 nosy_names = ['twouters', 'brett.cannon', 'ronaldoussoren', 'pablogsal', 'miss-islington'] pr_nums = ['28435', '28437', '28438'] priority = 'normal' resolution = 'fixed' stage = 'resolved' status = 'closed' superseder = None type = 'behavior' url = 'https://bugs.python.org/issue45183' versions = ['Python 3.10', 'Python 3.11'] ```

    ronaldoussoren commented 3 years ago

    The attached file demonstrates the problem:

    If importlib.invalidate_caches() is called while the zipfile used by the zip importer is not available the import system breaks entirely. I found this in a testsuite that accedently did this (it should have updated sys.path).

    I get the following exception:

    $ python3.10 t.py
    Traceback (most recent call last):
      File "/Users/ronald/Projects/modulegraph2/t.py", line 27, in <module>
        import uu
      File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
      File "<frozen importlib._bootstrap>", line 1002, in _find_and_load_unlocked
      File "<frozen importlib._bootstrap>", line 945, in _find_spec
      File "<frozen importlib._bootstrap_external>", line 1430, in find_spec
      File "<frozen importlib._bootstrap_external>", line 1402, in _get_spec
      File "<frozen zipimport>", line 168, in find_spec
      File "<frozen zipimport>", line 375, in _get_module_info
    TypeError: argument of type 'NoneType' is not iterable

    This exception is not very friendly....

    This particular exception is caused by setting self._files to None in the importer's invalidate_caches method, while not checking for None in _get_modules_info.

    I'm not sure what the best fix would be, setting self._files to an empty list would likely be the easiest fix.

    Note that the script runs without errors in Python 3.9.

    pablogsal commented 3 years ago

    I bisected this to:

    3abf6f010243a91bf57cbf357dac33193f7b8407 is the first bad commit commit 3abf6f010243a91bf57cbf357dac33193f7b8407 Author: Desmond Cheong \desmondcheongzx@gmail.com\ Date: Tue Mar 9 04:06:02 2021 +0800

    bpo-14678: Update zipimport to support importlib.invalidate_caches() (GH-24159)
    
    Added an invalidate_caches() method to the zipimport.zipimporter class based on the implementation of importlib.FileFinder.invalidate_caches(). This was done by adding a get_files() method and an _archive_mtime attribute to zipimport.zipimporter to check for updates or cache invalidation whenever the cache of files and toc entry information in the zipimporter is accessed.

    Doc/library/zipimport.rst | 9 + Lib/test/test_zipimport.py | 41 + Lib/zipimport.py | 10 + .../2021-01-07-21-25-49.bpo-14678.1zniCH.rst | 3 + Python/importlib_zipimport.h | 1896 ++++++++++---------- 5 files changed, 1020 insertions(+), 939 deletions(-) create mode 100644 Misc/NEWS.d/next/Library/2021-01-07-21-25-49.bpo-14678.1zniCH.rst bisect run success

    Which is https://bugs.python.org/issue14678

    pablogsal commented 3 years ago

    Brett, can you take a look when you have some time?

    ronaldoussoren commented 3 years ago

    I just noticed that I'm unnecessarily obtuse in my description of a possible fix, the diff (without test update):

    % git diff Lib/zipimport.py                                                                                        (main)cpython
    diff --git a/Lib/zipimport.py b/Lib/zipimport.py
    index c55fec6aa1..43ac6cbe57 100644
    --- a/Lib/zipimport.py
    +++ b/Lib/zipimport.py
    @@ -334,7 +334,7 @@ def invalidate_caches(self):
                 _zip_directory_cache[self.archive] = self._files
             except ZipImportError:
                 _zip_directory_cache.pop(self.archive, None)
    -            self._files = None
    +            self._files = {}
    
         def __repr__(self):

    With that change the exception should not happen, and the now stale zipimporter would be ignored when flushing the cache while the zipfile referenced by the zip importer instance has been removed.

    That said, I haven't tested this and won't create a PR because my local tree is (still) a mess.

    brettcannon commented 3 years ago

    The proposed fix seems to be the right one based on my reading of the code.

    brettcannon commented 3 years ago

    New changeset 209b7035f714dcc41df054b0b023e0b955d7e1a2 by Brett Cannon in branch 'main': bpo-45183: don't raise an exception when calling zipimport.zipimporter.find_spec() when the zip file is missing and the internal cache has been reset (GH-28435) https://github.com/python/cpython/commit/209b7035f714dcc41df054b0b023e0b955d7e1a2

    brettcannon commented 3 years ago

    New changeset e1bdecb6dc7ac33256d5fa875d45634512d2a90e by Brett Cannon in branch '3.10': [3.10] bpo-45183: don't raise an exception when calling zipimport.zipimporter.find_spec() when the zip file is missing and the internal cache has been reset (GH-28435) (bpo-28438) https://github.com/python/cpython/commit/e1bdecb6dc7ac33256d5fa875d45634512d2a90e

    brettcannon commented 3 years ago

    I decided that find_spec() saying something wasn't available in a finder made sense even though it was because a zip file no longer existed as the loader would still fail as appropriate.