PabloCastellano / bormeparser

A Python library for parsing BORME files (Boletín Oficial del Registro Mercantil in Spain).
GNU General Public License v3.0

Update pypdf2 to 2.0.0 #77

Closed: pyup-bot closed this 2 years ago

pyup-bot commented 2 years ago

This PR updates PyPDF2 from 1.26.0 to 2.0.0.

Changelog

### 2.0.0
```
and variable-names as well as using properties instead of getter-methods.

Maintenance (MAINT):
- Remove IronPython Fallback for zlib (868)

Full Changelog: https://github.com/py-pdf/PyPDF2/compare/1.27.12...1.27.13

Deprecations (DEP)
* Make the `PyPDF2.utils` module private
* Rename of core classes:
    * PdfFileReader ➔ PdfReader
    * PdfFileWriter ➔ PdfWriter
    * PdfFileMerger ➔ PdfMerger
* Use PEP8 conventions for function names and parameters
* If a property and a getter-method are both present, use the property

Details

In many places:
- getObject ➔ get_object
- writeToStream ➔ write_to_stream
- readFromStream ➔ read_from_stream

PyPDF2.generic
- readObject ➔ read_object
- convertToInt ➔ convert_to_int
- DocumentInformation.getText ➔ DocumentInformation._get_text :
  This method should typically not be used; please let me know if you need it.

PdfReader class:
- `reader.getPage(pageNumber)` ➔ `reader.pages[page_number]`
- `reader.getNumPages()` / `reader.numPages` ➔ `len(reader.pages)`
- getDocumentInfo ➔ metadata
- flattenedPages attribute ➔ flattened_pages
- resolvedObjects attribute ➔ resolved_objects
- xrefIndex attribute ➔ xref_index
- getNamedDestinations / namedDestinations attribute ➔ named_destinations
- getPageLayout / pageLayout ➔ page_layout attribute
- getPageMode / pageMode ➔ page_mode attribute
- getIsEncrypted / isEncrypted ➔ is_encrypted attribute
- getOutlines ➔ get_outlines
- readObjectHeader ➔ read_object_header
- cacheGetIndirectObject ➔ cache_get_indirect_object
- cacheIndirectObject ➔ cache_indirect_object
- getDestinationPageNumber ➔ get_destination_page_number
- readNextEndLine ➔ read_next_end_line
- _zeroXref ➔ _zero_xref
- _authenticateUserPassword ➔ _authenticate_user_password
- _pageId2Num attribute ➔ _page_id2num
- _buildDestination ➔ _build_destination
- _buildOutline ➔ _build_outline
- _getPageNumberByIndirect(indirectRef) ➔ _get_page_number_by_indirect(indirect_ref)
- _getObjectFromStream ➔ _get_object_from_stream
- _decryptObject ➔ _decrypt_object
- _flatten(..., indirectRef) ➔ _flatten(..., indirect_ref)
- _buildField ➔ _build_field
- _checkKids ➔ _check_kids
- _writeField ➔ _write_field
- _write_field(..., fieldAttributes) ➔ _write_field(..., field_attributes)
- _read_xref_subsections(..., getEntry, ...) ➔ _read_xref_subsections(..., get_entry, ...)

PdfWriter class:
- `writer.getPage(pageNumber)` ➔ `writer.pages[page_number]`
- `writer.getNumPages()` ➔ `len(writer.pages)`
- addMetadata ➔ add_metadata
- addPage ➔ add_page
- addBlankPage ➔ add_blank_page
- addAttachment(fname, fdata) ➔ add_attachment(filename, data)
- insertPage ➔ insert_page
- insertBlankPage ➔ insert_blank_page
- appendPagesFromReader ➔ append_pages_from_reader
- updatePageFormFieldValues ➔ update_page_form_field_values
- cloneReaderDocumentRoot ➔ clone_reader_document_root
- cloneDocumentFromReader ➔ clone_document_from_reader
- getReference ➔ get_reference
- getOutlineRoot ➔ get_outline_root
- getNamedDestRoot ➔ get_named_dest_root
- addBookmarkDestination ➔ add_bookmark_destination
- addBookmarkDict ➔ add_bookmark_dict
- addBookmark ➔ add_bookmark
- addNamedDestinationObject ➔ add_named_destination_object
- addNamedDestination ➔ add_named_destination
- removeLinks ➔ remove_links
- removeImages(ignoreByteStringObject) ➔ remove_images(ignore_byte_string_object)
- removeText(ignoreByteStringObject) ➔ remove_text(ignore_byte_string_object)
- addURI ➔ add_uri
- addLink ➔ add_link
- getPage(pageNumber) ➔ get_page(page_number)
- getPageLayout / setPageLayout / pageLayout ➔ page_layout attribute
- getPageMode / setPageMode / pageMode ➔ page_mode attribute
- _addObject ➔ _add_object
- _addPage ➔ _add_page
- _sweepIndirectReferences ➔ _sweep_indirect_references

PdfMerger class
- `__init__` parameter: strict=True ➔ strict=False (the PdfFileMerger still has the old default)
- addMetadata ➔ add_metadata
- addNamedDestination ➔ add_named_destination
- setPageLayout ➔ set_page_layout
- setPageMode ➔ set_page_mode

Page class:
- artBox / bleedBox / cropBox / mediaBox / trimBox ➔ artbox / bleedbox / cropbox / mediabox / trimbox
- getWidth, getHeight ➔ width / height
- getLowerLeft_x / getUpperLeft_x ➔ left
- getUpperRight_x / getLowerRight_x ➔ right
- getLowerLeft_y / getLowerRight_y ➔ bottom
- getUpperRight_y / getUpperLeft_y ➔ top
- getLowerLeft / setLowerLeft ➔ lower_left property
- upperRight ➔ upper_right
- mergePage ➔ merge_page
- rotateClockwise / rotateCounterClockwise ➔ rotate_clockwise
- _mergeResources ➔ _merge_resources
- _contentStreamRename ➔ _content_stream_rename
- _pushPopGS ➔ _push_pop_gs
- _addTransformationMatrix ➔ _add_transformation_matrix
- _mergePage ➔ _merge_page

XmpInformation class:
- getElement(..., aboutUri, ...) ➔ get_element(..., about_uri, ...)
- getNodesInNamespace(..., aboutUri, ...) ➔ get_nodes_in_namespace(..., aboutUri, ...)
- _getText ➔ _get_text

utils.py:
- matrixMultiply ➔ matrix_multiply
- RC4_encrypt is moved to the security module
```

### 1.28.4
```
Bug Fixes (BUG):
- XmpInformation._converter_date was unusable (921)

Full Changelog: https://github.com/py-pdf/PyPDF2/compare/1.28.3...1.28.4
```

### 1.28.3
```
Deprecations (DEP):
- PEP8 renaming (905)

Bug Fixes (BUG):
- XmpInformation missing method _getText (917)
- Fix PendingDeprecationWarning on _merge_page (904)

Full Changelog: https://github.com/py-pdf/PyPDF2/compare/1.28.2...1.28.3
```

### 1.28.2
```
Bug Fixes (BUG):
- PendingDeprecationWarning for getContents (893)
- PendingDeprecationWarning on using PdfMerger (891)

Full Changelog: https://github.com/py-pdf/PyPDF2/compare/1.28.1...1.28.2
```

### 1.28.1
```
Bug Fixes (BUG):
- Incorrectly show deprecation warnings on internal usage (887)

Maintenance (MAINT):
- Add stacklevel=2 to deprecation warnings (889)
- Remove duplicate warnings imports (888)

Full Changelog: https://github.com/py-pdf/PyPDF2/compare/1.28.0...1.28.1
```

### 1.28.0
```
This release adds a lot of deprecation warnings in preparation of the
```

### 1.27.12
```
Bug Fixes (BUG):
- _rebuild_xref_table expects trailer to be a dict (857)

Documentation (DOC):
- Security Policy

Full Changelog: https://github.com/py-pdf/PyPDF2/compare/1.27.11...1.27.12
```

### 1.27.11
```
Bug Fixes (BUG):
- Incorrectly issued xref warning/exception (855)

Full Changelog: https://github.com/py-pdf/PyPDF2/compare/1.27.10...1.27.11
```

### 1.27.10
```
Robustness (ROB):
- Handle missing destinations in reader (840)
- warn-only in readStringFromStream (837)
- Fix corruption in startxref or xref table (788 and 830)

Documentation (DOC):
- Project Governance (799)
- History of PyPDF2
- PDF feature/version support (816)
- More details on text parsing issues (815)

Developer Experience (DEV):
- Add benchmark command to Makefile
- Ignore IronPython parts for code coverage (826)

Maintenance (MAINT):
- Split pdf module (836)
- Separated CCITTFax param parsing/decoding (841)
- Update requirements files

Testing (TST):
- Use external repository for larger/more PDFs for testing (820)
- Swap incorrect test names (838)
- Add test for PdfFileReader and page properties (835)
- Add tests for PyPDF2.generic (831)
- Add tests for utils, form fields, PageRange (827)
- Add test for ASCII85Decode (825)
- Add test for FlateDecode (823)
- Add test for filters.ASCIIHexDecode (822)

Code Style (STY):
- Apply pre-commit (black, isort) + use snake_case variables (832)
- Remove debug code (828)
- Documentation, Variable names (839)

Full Changelog: https://github.com/py-pdf/PyPDF2/compare/1.27.9...1.27.10
```

### 1.27.9
```
A change I would like to highlight is the performance improvement for large PDF files (808) 🎉

New Features (ENH):
- Add papersizes (800)
- Allow setting permission flags when encrypting (803)
- Allow setting form field flags (802)

Bug Fixes (BUG):
- TypeError in xmp._converter_date (813)
- Improve spacing for text extraction (806)
- Fix PDFDocEncoding Character Set (809)

Robustness (ROB):
- Use null ID when encrypted but no ID given (812)
- Handle recursion error (804)

Documentation (DOC):
- CMaps (811)
- The PDF Format + commit prefixes (810)
- Add compression example (792)

Developer Experience (DEV):
- Add Benchmark for Performance Testing (781)

Maintenance (MAINT):
- Validate PDF magic byte in strict mode (814)
- Make PdfFileMerger.addBookmark() behave like PdfFileWriter's (339)
- Quadratic runtime while parsing reduced to linear (808)

Testing (TST):
- Newlines in text extraction (807)

Full Changelog: https://github.com/py-pdf/PyPDF2/compare/1.27.8...1.27.9
```

### 1.27.8
```
Bug Fixes (BUG):
- Use 1MB as offset for readNextEndLine (321)
- 'PdfFileWriter' object has no attribute 'stream' (787)

Robustness (ROB):
- Invalid float object; use 0 as fallback (782)

Documentation (DOC):
- Robustness (785)

Full Changelog: https://github.com/py-pdf/PyPDF2/compare/1.27.7...1.27.8
```

### 1.27.7
```
Bug Fixes (BUG):
- Import exceptions from PyPDF2.errors in PyPDF2.utils (780)

Code Style (STY):
- Naming in 'make_changelog.py'
```

### 1.27.6
```
Deprecations (DEP):
- Remove support for Python 2.6 and older (776)

New Features (ENH):
- Extract document permissions (320)

Bug Fixes (BUG):
- Clip by trimBox when merging pages, which would otherwise be ignored (240)
- Add overwriteWarnings parameter PdfFileMerger (243)
- IndexError for getPage() of decrypted file (359)
- Handle cases where decodeParms is an ArrayObject (405)
- Updated PDF fields don't show up when page is written (412)
- Set Linked Form Value (414)
- Fix zlib -5 error for corrupt files (603)
- Fix reading more than last1K for EOF (642)
- Accidental import

Robustness (ROB):
- Allow extra whitespace before "obj" in readObjectHeader (567)

Documentation (DOC):
- Link to pdftoc in Sample_Code (628)
- Working with annotations (764)
- Structure history

Developer Experience (DEV):
- Add issue templates (765)
- Add tool to generate changelog

Maintenance (MAINT):
- Use grouped constants instead of string literals (745)
- Add error module (768)
- Use decorators for staticmethod (775)
- Split long functions (777)

Testing (TST):
- Run tests in CI once with -OO Flags (770)
- Filling out forms (771)
- Add tests for Writer (772)
- Error cases (773)
- Check Error messages (769)
- Regression test for issue 88
- Regression test for issue 327

Code Style (STY):
- Make variable naming more consistent in tests

All changes: https://github.com/py-pdf/PyPDF2/compare/1.27.5...1.27.6
```

### 1.27.5
```
Security (SEC):
- ContentStream_readInlineImage had potential infinite loop (740)

Bug fixes (BUG):
- Fix merging encrypted files (757)
- CCITTFaxDecode decodeParms can be an ArrayObject (756)

Robustness improvements (ROBUST):
- title sometimes None (744)

Documentation (DOC):
- Adjust short description of the package

Tests and Test setup (TST):
- Rewrite JS tests from unittest to pytest (746)
- Increase Test coverage, mainly with filters (756)
- Add test for inline images (758)

Developer Experience Improvements (DEV):
- Remove unused Travis-CI configuration (747)
- Show code coverage (754, 755)
- Add mutmut (760)

Miscellaneous:
- STY: Closing file handles, explicit exports, ... (743)

All changes: https://github.com/py-pdf/PyPDF2/compare/1.27.4...1.27.5
```

### 1.27.4
```
Bug fixes (BUG):
- Guard formatting of __init__.__doc__ string (738)

Packaging (PKG):
- Add more precise license field to setup (733)

Testing (TST):
- Add test for issue 297

Miscellaneous:
- DOC: Miscallenious ➔ Miscellaneous (Typo)
- TST: Fix CI triggering (master ➔ main) (739)
- STY: Fix various style issues (742)

All changes: https://github.com/py-pdf/PyPDF2/compare/1.27.3...1.27.4
```

### 1.27.3
```
- PKG: Make Tests not a subpackage (728)
- BUG: Fix ASCII85Decode.decode assertion (729)
- BUG: Error in Chinese character encoding (463)
- BUG: Code duplication in Scripts/2-up.py
- ROBUST: Guard 'obj.writeToStream' with 'if obj is not None'
- ROBUST: Ignore a /Prev entry with the value 0 in the trailer
- MAINT: Remove Sample_Code (726)
- TST: Close file handle in test_writer (722)
- TST: Fix test_get_images (730)
- DEV: Make tox use pytest and add more Python versions (721)
- DOC: Many (720, 723-725, 469)

All changes: https://github.com/py-pdf/PyPDF2/compare/1.27.2...1.27.3
```

### 1.27.2
```
- Add Scripts (including `pdfcat`), Resources, Tests, and Sample_Code back to PyPDF2.
  It was removed by accident in 1.27.0, but might get removed with 2.0.0
  See https://github.com/py-pdf/PyPDF2/discussions/718 for discussion

All changes: https://github.com/py-pdf/PyPDF2/compare/1.27.1...1.27.2
```

### 1.27.1
```
- Fixed project links on PyPI page after migration from mstamy2 to MartinThoma to the py-pdf organization on GitHub
- Documentation is now at https://pypdf2.readthedocs.io/en/latest/

All changes: https://github.com/py-pdf/PyPDF2/compare/1.27.0...1.27.1
```

### 1.27.0
```
Features:
- Add alpha channel support for png files in Script (614)

Bug fixes (BUG):
- Fix formatWarning for filename without slash (612)
- Add whitespace between words for extractText() (569, 334)
- "invalid escape sequence" SyntaxError (522)
- Avoid error when printing warning in pythonw (486)
- Stream operations can be List or Dict (665)

Documentation (DOC):
- Added Scripts/pdf-image-extractor.py
- Documentation improvements (550, 538, 324, 426, 394)

Tests and Test setup (TST):
- Add Github Action which automatically run unit tests via pytest and static code analysis with Flake8 (660)
- Add several unit tests (661, 663)
- Add .coveragerc to create coverage reports

Developer Experience Improvements (DEV):
- Pre commit: Developers can now `pre-commit install` to avoid tiny issues like trailing whitespaces

Miscellaneous:
- Add the LICENSE file to the distributed packages (288)
- Use setuptools instead of distutils (599)
- Improvements for the PyPI page (644)
- Python 3 changes (504, 366)

You can see the full changelog at: https://github.com/py-pdf/PyPDF2/compare/1.26.0...1.27.0
```
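Because 2.0.0 renames many public names, it may help to check where a codebase still uses the 1.x spellings before merging an update like this one. The following is a minimal sketch, not part of PyPDF2 or bormeparser: `RENAMES` copies only a small subset of the old ➔ new pairs from the 2.0.0 changelog, and `find_deprecated_names` is a hypothetical helper name. It only reports matches and does not rewrite code, since some renames (for example `getDocumentInfo` ➔ `metadata`) turn a method call into a property access.

```python
import re

# Old ➔ new names copied from the 2.0.0 changelog (subset, for illustration).
RENAMES = {
    "PdfFileReader": "PdfReader",
    "PdfFileWriter": "PdfWriter",
    "PdfFileMerger": "PdfMerger",
    "getDocumentInfo": "metadata",
    "addPage": "add_page",
    "addBlankPage": "add_blank_page",
    "addMetadata": "add_metadata",
    "mergePage": "merge_page",
    "getObject": "get_object",
}

# Match whole identifiers only, so "addPage" is not found inside "_addPage".
_PATTERN = re.compile(
    r"(?<![A-Za-z0-9_])("
    + "|".join(re.escape(name) for name in RENAMES)
    + r")(?![A-Za-z0-9_])"
)


def find_deprecated_names(source):
    """Return (line_number, old_name, new_name) for each 1.x name in source."""
    hits = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        for match in _PATTERN.finditer(line):
            old_name = match.group(1)
            hits.append((line_number, old_name, RENAMES[old_name]))
    return hits


if __name__ == "__main__":
    sample = "reader = PdfFileReader(fp)\ninfo = reader.getDocumentInfo()\n"
    for line_number, old_name, new_name in find_deprecated_names(sample):
        print(f"line {line_number}: {old_name} -> {new_name}")
```

A plain text scan like this can also flag names inside comments or strings, so each hit still needs a manual look.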
Links
- PyPI: https://pypi.org/project/pypdf2
- Changelog: https://pyup.io/changelogs/pypdf2/
- Docs: https://pypdf2.readthedocs.io/en/latest/
pyup-bot commented 2 years ago

Closing this in favor of #78