python-openxml / python-docx

Create and modify Word documents with Python
MIT License

feat: distinguish "no such file" from "not a ZIP" #1410

Open scanny opened 3 weeks ago

scanny commented 3 weeks ago

Summary

python-docx raises:

docx.opc.exceptions.PackageNotFoundError: Package not found at '/a/b/c.docx'

on Document("/a/b/c.docx") whenever the path provided does not resolve to a ZIP archive, whether the file is missing or simply is not a ZIP. For diagnostic purposes it would be better to distinguish a "No such file or directory" condition from a "file exists but is not a ZIP archive (and so not a DOCX file)" condition.

Proposed

  1. Add a separate os.path.isfile() test on a provided file-path before attempting to open it with zipfile. Give this a focused message like FileNotFoundError: No such file or directory: '/a/b/c.docx' so the problem is unambiguous.
  2. Change the PackageNotFoundError text to indicate specifically that the file exists but is not a ZIP archive.
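
A minimal sketch of how the two checks above might fit together. This is not python-docx's actual code: check_package_path is a hypothetical helper, the PackageNotFoundError class here is a local stand-in for docx.opc.exceptions.PackageNotFoundError so the snippet runs on its own, and the exact message wording is an assumption.

```python
import os
import zipfile


class PackageNotFoundError(Exception):
    """Local stand-in for docx.opc.exceptions.PackageNotFoundError (assumption)."""


def check_package_path(path):
    """Hypothetical helper: raise a focused error for each failure mode.

    Raises FileNotFoundError when `path` is not an existing regular file, and
    PackageNotFoundError when the file exists but is not a ZIP archive.
    """
    # Step 1: the path must resolve to an existing regular file.
    # Note that os.path.isfile() is also False for a directory.
    if not os.path.isfile(path):
        raise FileNotFoundError("No such file or directory: '%s'" % path)

    # Step 2: the file exists, so any remaining failure is about its content.
    if not zipfile.is_zipfile(path):
        raise PackageNotFoundError(
            "File at '%s' is not a ZIP archive, so it cannot be a DOCX file" % path
        )

    return path
```

With this split, a mistyped path surfaces as the built-in FileNotFoundError, while PackageNotFoundError is reserved for a file that is present but has the wrong format.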