google-research / arxiv-latex-cleaner

arXiv LaTeX Cleaner: Easily clean the LaTeX code of your paper to submit to arXiv
Apache License 2.0

Regex-escape filename components #84

Closed jaywonchung closed 9 months ago

jaywonchung commented 10 months ago

This PR regex-escapes filename components so that file and directory names containing special regex characters do not affect regex matching. The existing logic in _search_reference for the strict=False codepath was unnecessarily complicated, so along with this change I refactored it using the built-in pathlib.Path.
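
For readers unfamiliar with the idea, here is a minimal sketch of escaping each path component before building a pattern. It is not the actual _search_reference code: the helper name escaped_path_pattern and the sample path are hypothetical, and PurePosixPath is used only to keep the example platform-independent.

```python
import re
from pathlib import PurePosixPath


def escaped_path_pattern(path: str) -> str:
    """Return a regex that matches `path` literally (hypothetical helper)."""
    # Split the path into components, escape each one so characters such as
    # '+', '(' and '.' lose their regex meaning, then rejoin with '/'.
    parts = PurePosixPath(path).parts
    return "/".join(re.escape(part) for part in parts)


# Hypothetical figure path containing several regex metacharacters.
pattern = escaped_path_pattern("figures/acc+loss (v2).pdf")
tex_line = r"\includegraphics{figures/acc+loss (v2).pdf}"
found = re.search(pattern, tex_line) is not None
```

Escaping per component rather than escaping the whole string at once leaves room to treat the separator or the extension differently, which is one reason a pathlib-based split can be convenient.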

I ran into this issue because my figure PDF files had many + signs in their names; even though they were included with \includegraphics, they were not being copied. This change fixed that. I ran python arxiv_latex_cleaner/tests/arxiv_latex_cleaner_test.py and all 81 tests passed. I believe this PR should also address #79.
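
The failure mode can be reproduced in isolation. The snippet below is an illustrative sketch with a made-up filename and LaTeX line, not code from the repository: used as a raw pattern, the + is read as a quantifier, so the literal filename no longer matches itself.

```python
import re

# Hypothetical figure name and the LaTeX line that references it.
filename = "loss+accuracy.pdf"
tex_line = r"\includegraphics[width=\linewidth]{figs/loss+accuracy.pdf}"

# Unescaped: "s+" means "one or more s", so the pattern expects "accuracy"
# right after the run of s characters and never matches the literal '+'.
unescaped_hit = re.search(filename, tex_line)

# Escaped: every metacharacter is matched literally.
escaped_hit = re.search(re.escape(filename), tex_line)
```

Here unescaped_hit is None while escaped_hit is a match object, which is consistent with referenced figures being treated as unused and skipped during copying.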

google-cla[bot] commented 10 months ago

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

jponttuset commented 9 months ago

Thanks @jaywonchung!