alephdata / ingest-file

Ingestors extract the contents of mixed unstructured documents into structured (followthemoney) data.
GNU Affero General Public License v3.0

Bump pymupdf from 1.21.1 to 1.23.7 #554

Closed. dependabot[bot] closed this 8 months ago.

dependabot[bot] commented 8 months ago

Bumps pymupdf from 1.21.1 to 1.23.7.

Release notes

Sourced from pymupdf's releases.

PyMuPDF-1.23.7 released

PyMuPDF-1.23.7 has been released.

Wheels for Windows, Linux and MacOS, and the sdist, are available on pypi.org and can be installed in the usual way, for example:

python -m pip install --upgrade pymupdf

[Linux-aarch64 wheels are not available yet; they will be built and uploaded later.]

Changes in version 1.23.7 (2023-11-30)

  • Bug fixes in rebased implementation, not fixed in classic implementation:

  • Bug fixes (rebased and classic implementations):

  • Use MuPDF-1.23.7.

  • Other:

    • Rebased implementation:

      • Added flake8 code checking to test suite, and made various fixes.
      • Disable diagnostics during Document constructor to match classic implementation.
    • Additional fix to #2553

    • Fixed MuPDF Bug 707324 (https://bugs.ghostscript.com/show_bug.cgi?id=707324): Story: HTML table row background color repeated incorrectly

    • Added scripts/test.py, for simple build+test of PyMuPDF git checkout.

    • Added fitz.pymupdf_version_tuple, e.g. (1, 23, 6).

    • Restored mistakenly-reverted fix for #2345.

    • Include any trailing ... repeated <N> times... text in warnings returned by mupdf_warnings() (rebased only).

PyMuPDF-1.23.6 released

PyMuPDF-1.23.6 has been released.

Wheels for Windows, Linux and MacOS, and the sdist, are available on pypi.org and can be installed in the usual way, for example:


... (truncated)

Changelog

Sourced from pymupdf's changelog.

Change Log

Changes in version 1.23.7 (2023-11-30)

  • Bug fixes in rebased implementation, not fixed in classic implementation:

    • Fixed #2232 (https://github.com/pymupdf/PyMuPDF/issues/2232): Geometry helper classes should support keyword arguments
    • Fixed #2788 (https://github.com/pymupdf/PyMuPDF/issues/2788): Problem with get_toc in pymupdf 1.23.6
    • Fixed #2791 (https://github.com/pymupdf/PyMuPDF/issues/2791): Experiencing small memory leak in save()
  • Bug fixes (rebased and classic implementations):

    • Fixed #2736 (https://github.com/pymupdf/PyMuPDF/issues/2736): Failure when set cropbox with mediabox negative value
    • Fixed #2749 (https://github.com/pymupdf/PyMuPDF/issues/2749): RuntimeError: cycle in structure tree
    • Fixed #2753 (https://github.com/pymupdf/PyMuPDF/issues/2753): Story.write_with_links will ignore everything after the first "page break" in the HTML.
    • Fixed #2812 (https://github.com/pymupdf/PyMuPDF/issues/2812): find_tables on landscape page generates reversed text
    • Fixed #2829 (https://github.com/pymupdf/PyMuPDF/issues/2829): [cannot create /Annot for kind] is still printed despite #2345 is closed.
    • Fixed #2841 (https://github.com/pymupdf/PyMuPDF/issues/2841): Unexpected KeyError when using scrub with fitz_new
  • Use MuPDF-1.23.7.

  • Other:

    • Rebased implementation:

      • Added flake8 code checking to test suite, and made various fixes.
      • Disable diagnostics during Document constructor to match classic implementation.
    • Additional fix to #2553 (https://github.com/pymupdf/PyMuPDF/issues/2553): Invalid characters in versions >= 1.22

    • Fixed MuPDF Bug 707324 (https://bugs.ghostscript.com/show_bug.cgi?id=707324): Story: HTML table row background color repeated incorrectly

    • Added scripts/test.py, for simple build+test of PyMuPDF git checkout.

    • Added fitz.pymupdf_version_tuple, e.g. (1, 23, 6).

    • Restored mistakenly-reverted fix for #2345 (https://github.com/pymupdf/PyMuPDF/issues/2345): Turn off print statements in utils.py

    • Include any trailing ... repeated <N> times... text in warnings returned by mupdf_warnings() (rebased only).
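The `fitz.pymupdf_version_tuple` attribute mentioned above makes it possible to gate code on the installed PyMuPDF version. The following is a minimal sketch, not part of ingest-file or PyMuPDF: the helper names `parse_version`, `installed_pymupdf_version` and `meets_minimum` are hypothetical, and the fallback through `importlib.metadata` assumes the distribution is installed under the name `PyMuPDF`.

```python
import re
from importlib import metadata


def parse_version(text):
    """Turn a string such as '1.23.7' into (1, 23, 7).

    Only the leading three numeric components are read, so suffixes
    like 'rc1' are ignored.
    """
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    return tuple(int(group) for group in match.groups())


def installed_pymupdf_version():
    """Return the installed PyMuPDF version as a tuple, or None if absent."""
    try:
        import fitz
    except ImportError:
        return None
    # Releases that predate pymupdf_version_tuple do not have the attribute.
    version = getattr(fitz, "pymupdf_version_tuple", None)
    if version is not None:
        return tuple(version)
    try:
        # Assumption: the distribution name is "PyMuPDF".
        return parse_version(metadata.version("PyMuPDF"))
    except (metadata.PackageNotFoundError, ValueError):
        return None


def meets_minimum(installed, minimum=(1, 23, 7)):
    """Tuples compare element by element, so (1, 23, 7) > (1, 21, 1)."""
    return tuple(installed) >= tuple(minimum)


if __name__ == "__main__":
    found = installed_pymupdf_version()
    if found is None:
        print("PyMuPDF is not installed")
    else:
        print(found, "ok" if meets_minimum(found) else "too old")
```

Comparing tuples rather than strings avoids the pitfall where "1.9.0" sorts after "1.23.7" lexicographically.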

Changes in version 1.23.6 (2023-11-06)

  • Bug fixes:

    • Fixed #2553 (https://github.com/pymupdf/PyMuPDF/issues/2553): Invalid characters in versions >= 1.22
    • Fixed #2608 (https://github.com/pymupdf/PyMuPDF/issues/2608): Incorrect utf32 text extraction (high & low surrogates are split)
    • Fixed #2710 (https://github.com/pymupdf/PyMuPDF/issues/2710): page.rect and text location wrong / differing from older version
    • Fixed #2774 (https://github.com/pymupdf/PyMuPDF/issues/2774): wrong encoding for "?" character when sort=True
    • Fixed #2775 (https://github.com/pymupdf/PyMuPDF/issues/2775): fitz_new does not work with python3.10 or earlier
    • Fixed #2777 (https://github.com/pymupdf/PyMuPDF/issues/2777): With fitz_new, wrong type for Page.mediabox
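For readers unfamiliar with the symptom named in #2608, the sketch below illustrates what split high and low surrogates look like and how a pair maps back to a single code point. It uses only the Python standard library; it is an illustration of the encoding arithmetic, not PyMuPDF's actual fix, and the function names are hypothetical.

```python
def to_surrogates(code_point):
    """Split a code point above U+FFFF into its UTF-16 surrogate pair."""
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits
    return high, low


def join_surrogates(high, low):
    """Inverse of to_surrogates: rebuild the original code point."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


def merge_split_surrogates(text):
    """Recombine surrogate halves that ended up as separate characters.

    'surrogatepass' lets the lone halves through the encoder; decoding
    the resulting UTF-16 bytes then reads each pair as one character.
    """
    return text.encode("utf-16", "surrogatepass").decode("utf-16")


if __name__ == "__main__":
    high, low = to_surrogates(0x1F600)
    print(hex(high), hex(low))
    broken = chr(high) + chr(low)      # two characters instead of one
    fixed = merge_split_surrogates(broken)
    print(len(broken), len(fixed))
```

Text that contains such split pairs has two invalid characters where one was intended, which is why downstream consumers of extracted text can fail on it.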

... (truncated)

Commits
  • bde223b Updates for release of PyMuPDF-1.23.7.
  • e1bad0b @airgalss has signed the CLA from Pull Request #2845
  • 6c72243 changes.txt: updates for recent code changes.
  • f7183f5 tests/test_2548.py: test_2753(): fix to match diagnostics in mupdf-1.23.7.
  • 0da7a2a src/__init__.py: mupdf_warnings(): also get any trailing `... repeated <N> ti...
  • 4dc570d src/__init__.py:Page.delete_link() address #2841 - remove exception diagnostic.
  • 3c6c808 docs/installation.rst: added info about Windows MSVCP140.dll bug.
  • eed73d7 changes.txt: updates.
  • 342cc68 tests/test_general.py:test_2553_2(): expect bug not fixed on mupdf < 1.23.7.
  • 8f75647 Update test_general.py
  • Additional commits viewable in compare view


Dependabot compatibility score

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options
You can trigger Dependabot actions by commenting on this PR:

  • `@dependabot rebase` will rebase this PR
  • `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
  • `@dependabot merge` will merge this PR after your CI passes on it
  • `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
  • `@dependabot cancel merge` will cancel a previously requested merge and block automerging
  • `@dependabot reopen` will reopen this PR if it is closed
  • `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • `@dependabot show ignore conditions` will show all of the ignore conditions of the specified dependency
  • `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

dependabot[bot] commented 8 months ago

Superseded by #566.