pymupdf / PyMuPDF

PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
https://pymupdf.readthedocs.io
GNU Affero General Public License v3.0

draft: Add codespell CI job and fix some typos #3501

Closed. cbm755 closed this 2 months ago

cbm755 commented 4 months ago

The basic idea here is to run codespell on all merge requests. There are two files to place exceptions in, but in my experience exceptions are rare and codespell generally does a good job: the fixes it finds outweigh the hassle of the occasional exception.

I have excluded the tests/ dir b/c there are lots of false positives there (mostly b/c of non-English text). It should be fairly self-explanatory how to exclude other directories in the future if you need to.

cbm755 commented 4 months ago

@JorjMcKie Do you know what "parm angle" means here? Does it mean "reduce angle parameter by"?

            betar -= w90  # reduce parm angle by 90 deg
            alfa += w90  # advance start angle by 90 deg

edit: I went ahead and removed that comment, seems a little WET anyway.

cbm755 commented 4 months ago

Only two more errors where I'm uncertain what change to make (see above).

After we make those, this is ready to go!

If you like, feel free to move the job to some other .yaml file.

cbm755 commented 4 months ago

Did I break this test somehow?

Traceback (most recent call last):
  File "/home/runner/work/PyMuPDF/PyMuPDF/scripts/gh_release.py", line 714, in <module>
    main()
  File "/home/runner/work/PyMuPDF/PyMuPDF/scripts/gh_release.py", line 115, in main
    build(valgrind=valgrind)
  File "/home/runner/work/PyMuPDF/PyMuPDF/scripts/gh_release.py", line 378, in build
    run( f'cibuildwheel{platform_arg}', env_extra=env_extra)
  File "/home/runner/work/PyMuPDF/PyMuPDF/scripts/gh_release.py", line 685, in run
    return subprocess.run(command, check=check, shell=1, env=env, timeout=timeout)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/hostedtoolcache/Python/3.12.3/x64/lib/python3.12/subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command 'cibuildwheel' returned non-zero exit status 1.

julian-smith-artifex-com commented 4 months ago

Thanks for this PR. I hadn't come across codespell before, and it looks to be doing some very useful checking here.

I think there's a trivial indentation error in the patch that's causing the test failure.

I think we should run codespell as part of the existing pytest test suite, alongside pylint and flake8, rather than only as a GitHub action. I can make that change, but alternatively feel free to add a new file tests/test_codespell.py in your PR, similar to tests/test_pylint.py and tests/test_flake8.py. We'll also need to add codespell to scripts/gh_release.py:test_packages.
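For readers following along, here is a minimal sketch of what such a test file could look like. It is not the file that was eventually committed: the function names, the checked path and the skip pattern are assumptions for illustration, and the exact --skip pattern may need adjusting for a real checkout.

```python
# tests/test_codespell.py (hypothetical sketch, not the file from the PR)
import shutil
import subprocess

# Assumption: check the repository root but skip tests/, which holds
# non-English text that triggers false positives.
CHECK_PATHS = ["."]
SKIP = "tests"


def codespell_command(executable="codespell"):
    """Build the codespell argument list; --skip takes comma-separated globs."""
    return [executable, "--skip", SKIP, *CHECK_PATHS]


def test_codespell():
    executable = shutil.which("codespell")
    if executable is None:
        # Nothing to run; a real suite would more likely call pytest.skip().
        print("codespell is not installed, skipping")
        return
    result = subprocess.run(
        codespell_command(executable), capture_output=True, text=True
    )
    # codespell exits with a non-zero status when it reports misspellings.
    assert result.returncode == 0, result.stdout + result.stderr
```

Keeping the command construction in its own small function makes it easy to inspect or tweak the arguments without actually running codespell.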

cbm755 commented 4 months ago

Thanks @julian-smith-artifex-com, I think I've done those changes but the test is still failing :(

> I think there's a trivial indentation error in the patch that's causing the test failure.

Is it clear to you what that change was? Thanks!

cbm755 commented 4 months ago

Would you like this rebased and some of my small testing commits squashed into cleaner commits?

julian-smith-artifex-com commented 3 months ago

> Thanks @julian-smith-artifex-com I think I've done those changes but the test is still failing :(
>
> > I think there's a trivial indentation error in the patch that's causing the test failure.
>
> Is it clear to you what that change was? thanks!

There's an error message from test_quick:

2024-05-22T17:54:04.4106750Z E     File "/tmp/tmp.9IvmIxA8FB/venv/lib/python3.12/site-packages/pymupdf/__init__.py", line 11304
2024-05-22T17:54:04.4107721Z E       raise ValueError("radius must be positive")
2024-05-22T17:54:04.4108315Z E   IndentationError: unexpected indent

So __init__.py line 11304 is indented incorrectly in your branch, I think?
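As a side note, this kind of error can be caught locally before pushing, because Python reports it at compile time. The sketch below uses the built-in compile() on an invented snippet (not the actual code from the branch); the helper name find_syntax_error is made up for this example.

```python
def find_syntax_error(source, filename="<string>"):
    """Return a short description of the first syntax error, or None."""
    try:
        compile(source, filename, "exec")
    except SyntaxError as exc:  # IndentationError is a subclass of SyntaxError
        return f"{type(exc).__name__}: {exc.msg} (line {exc.lineno})"
    return None


# Invented example: the raise statement is indented deeper than the
# block it belongs to, with no statement opening a new block.
bad = (
    "if radius <= 0:\n"
    "    pass\n"
    "        raise ValueError('radius must be positive')\n"
)

print(find_syntax_error(bad))
print(find_syntax_error("x = 1\n"))
```

For a real file, reading its text and passing it to the same helper (or running python -m py_compile on the file) gives the same kind of report without importing the package.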

julian-smith-artifex-com commented 3 months ago

Apologies for taking a while to look at this again.

Things that need updating:

  1. Fix the indentation error at src/utils.py:11304. Hopefully test_quick will then pass.
  2. Remove running of codespell from test_quick.yml - not needed now it's a standard test.
  3. Could we remove the small .codespell* files and pass the excludes on the command line from the test fn?
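To illustrate item 3, here is a rough sketch of how the contents of the small .codespell* files could be turned into command-line arguments inside the test function. The helper name and the example values are placeholders, not the project's real exclusions; --skip and --ignore-words-list are codespell options that each take a comma-separated list.

```python
def codespell_args(skip=(), ignore_words=()):
    """Translate exclusion lists into codespell command-line arguments."""
    args = []
    if skip:
        # Comma-separated file or directory globs to leave unchecked.
        args += ["--skip", ",".join(skip)]
    if ignore_words:
        # Comma-separated words that codespell should not flag.
        args += ["--ignore-words-list", ",".join(ignore_words)]
    return args


# Placeholder values for illustration only.
args = codespell_args(skip=["tests", "*.pdf"], ignore_words=["parm"])
print(["codespell", *args, "."])
```

With this approach the exclusions live next to the test that uses them, so no separate configuration files are needed.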

I'd be happy to do some/all of these (preserving your name in the commits) if that's easier, but equally happy for you to do things.

Thanks.

julian-smith-artifex-com commented 2 months ago

I've taken the liberty of extracting your commit with the new codespell test into a new PR #3656.

This has a smaller set of fixes, with a view to getting a PR for codespell (https://github.com/codespell-project/codespell/pull/3476) accepted which will allow us to mark multiline regions of input files to be excluded from codespell checking.

cbm755 commented 2 months ago

Thanks, and sorry for not getting back to you with revisions. Happy to see it merged.