Summary
The python-docx error docx.opc.exceptions.PackageNotFoundError arises both when no file exists at the given path and when the file exists but is not a ZIP archive (and so is not a DOCX file).
This ambiguity is unwelcome when diagnosing the error as the two possible conditions generally indicate a different course of action to resolve the error.
Add detailed validation to DocxPartitionerOptions to distinguish these two and provide more precise exception messages.
Additional Context
python-pptx shares the same OPC-Package (file) loading code used by python-docx, so the same ambiguity will be present in python-pptx.
It would be preferable for this distinguished exception behavior to be upstream in python-docx and python-pptx. If we're willing to take the version bump it might be worth considering doing that instead.
Summary The
python-docx
errordocx.opc.exceptions.PackageNotFoundError
arises both when no file exists at the given path and when the file exists but is not a ZIP archive (and so is not a DOCX file).This ambiguity is unwelcome when diagnosing the error as the two possible conditions generally indicate a different course of action to resolve the error.
Add detailed validation to
DocxPartitionerOptions
to distinguish these two and provide more precise exception messages.Additional Context
python-pptx
shares the same OPC-Package (file) loading code used bypython-docx
, so the same ambiguity will be present inpython-pptx
.python-docx
andpython-pptx
. If we're willing to take the version bump it might be worth considering doing that instead.