python / cpython

The Python programming language
https://www.python.org

Add data_offset field to ZipInfo #89169

Open 8f73a989-2ab5-40e0-be5b-22c682871df2 opened 3 years ago

8f73a989-2ab5-40e0-be5b-22c682871df2 commented 3 years ago
BPO 45006
Nosy @ammaraskar, @zhangxp1998
PRs
  • python/cpython#27961
    Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    ```python
    assignee = None
    closed_at = None
    created_at =
    labels = ['type-feature', 'library', '3.11']
    title = 'Add data_offset field to ZipInfo'
    updated_at =
    user = 'https://github.com/zhangxp1998'
    ```

    bugs.python.org fields:

    ```python
    activity =
    actor = 'ammar2'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation =
    creator = 'zhangxp1998'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 45006
    keywords = ['patch']
    message_count = 2.0
    messages = ['400306', '403316']
    nosy_count = 2.0
    nosy_names = ['ammar2', 'zhangxp1998']
    pr_nums = ['27961']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue45006'
    versions = ['Python 3.11']
    ```

    8f73a989-2ab5-40e0-be5b-22c682871df2 commented 3 years ago

    Currently, Python's zipfile module does not provide a way to query the starting offset of a member's compressed data. This would be handy when the user wants to copy the compressed data as is, without decompressing it. Therefore I propose adding a data_offset field to zipfile.ZipInfo, which stores the offset to the beginning of the compressed data.
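For context, the offset described above can already be derived from public fields, at the cost of parsing the local file header by hand. The following is a minimal sketch, not the patch from the linked PR: `compute_data_offset` is a hypothetical helper name, and the sketch assumes a seekable file object and an unencrypted, DEFLATE-compressed entry. It relies on the ZIP local file header layout (a 30-byte fixed part followed by the variable-length name and extra fields).

```python
import io
import struct
import zipfile
import zlib

LOCAL_HEADER_SIGNATURE = b"PK\x03\x04"
LOCAL_HEADER_FIXED_SIZE = 30  # bytes before the variable-length name/extra fields


def compute_data_offset(fileobj, info):
    """Hypothetical helper: offset of the first compressed byte of `info`."""
    fileobj.seek(info.header_offset)
    header = fileobj.read(LOCAL_HEADER_FIXED_SIZE)
    if len(header) != LOCAL_HEADER_FIXED_SIZE or header[:4] != LOCAL_HEADER_SIGNATURE:
        raise zipfile.BadZipFile("local file header not found")
    # Name and extra-field lengths are little-endian uint16 values at bytes 26-29.
    name_len, extra_len = struct.unpack("<HH", header[26:30])
    return info.header_offset + LOCAL_HEADER_FIXED_SIZE + name_len + extra_len


# Build a small archive in memory so the sketch is self-contained.
payload = b"hello zip " * 100
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("a.txt", payload)

with zipfile.ZipFile(buf) as zf:
    info = zf.getinfo("a.txt")

offset = compute_data_offset(buf, info)
buf.seek(offset)
raw = buf.read(info.compress_size)  # compressed bytes, copied as is

# ZIP members hold raw DEFLATE streams, hence the negative wbits value.
roundtrip = zlib.decompress(raw, -15)
```

The lengths are read from the local header rather than from `info.filename` and `info.extra`, because the local header's extra field may differ from the copy stored in the central directory.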

    ammaraskar commented 3 years ago

    Could you explain your use-case for this feature in a bit more detail? zipfile is meant to be a relatively high level library to do common tasks such as reading/writing/listing files.

    The use case for data_offset proposed here seems to be relatively advanced, and I don't see how it would be useful for the vast majority of users.

    (Without adding to the public API, I think you can achieve pretty similar functionality with the following.)

    # zf is an open zipfile.ZipFile instance; zipinfo is one of its ZipInfo entries
    compressed_data = zf.open(zipinfo)._read2(zipinfo.compress_size)

    Obviously, this relies on undocumented internals, but for a niche use case that might not be the worst thing: https://github.com/python/cpython/blob/61892c04764e1f3a659bbd09e6373687a27d36e2/Lib/zipfile.py#L1042-L1056