beeware / briefcase

Tools to support converting a Python project into a standalone native application.
https://briefcase.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

Fix minor holes in PEP-508 name validation #1762

Closed rmartin16 closed 2 months ago

rmartin16 commented 2 months ago

Changes

`^([A-Z0-9]|[A-Z0-9][A-Z0-9._-]*[A-Z0-9])$`

- With `re.IGNORECASE`, the character class `[A-Z]` also matches a few non-ASCII characters, because `re` performs case-insensitive comparison using Unicode case mappings rather than ASCII-only ones.
  - This matters for, at least, İ ([0x130](https://everythingfonts.com/unicode/0x0130)) and K ([0x212a](https://everythingfonts.com/unicode/0x212A)), which lowercase to `i` and `k` respectively.
```pycon
>>> import re
>>> PEP508_NAME_RE = re.compile(r"^([A-Z0-9]|[A-Z0-9][A-Z0-9._-]*[A-Z0-9])$", re.IGNORECASE)
>>> 
>>> bool(PEP508_NAME_RE.match("helloworld"))
True
>>> bool(PEP508_NAME_RE.match("İstanbul"))
True
>>> bool(PEP508_NAME_RE.match("\u212aelvin"))  # starts with the Kelvin sign (U+212A), not ASCII "K"
True
>>> bool(PEP508_NAME_RE.match("Æolia"))
False
>>> bool(PEP508_NAME_RE.match("jalapeño"))
False
>>> bool(PEP508_NAME_RE.match("Beyoncé"))
False
>>> bool(PEP508_NAME_RE.match("naïve"))
False
```
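The non-ASCII matches shown above can be reproduced and narrowed down with a short sketch. It assumes that adding `re.ASCII` alongside `re.IGNORECASE` is an acceptable way to restrict matching to ASCII letters; the names `unicode_re`, `ascii_re`, and `extra` are illustrative and are not taken from the Briefcase code base.

```python
import re
import sys

PATTERN = r"^([A-Z0-9]|[A-Z0-9][A-Z0-9._-]*[A-Z0-9])$"

# Same pattern, compiled with and without the ASCII flag (illustrative names).
unicode_re = re.compile(PATTERN, re.IGNORECASE)
ascii_re = re.compile(PATTERN, re.IGNORECASE | re.ASCII)

# Collect every non-ASCII code point that the Unicode-aware,
# case-insensitive regex accepts as a one-character name.
extra = [
    f"U+{cp:04X}"
    for cp in range(0x80, sys.maxunicode + 1)
    if unicode_re.match(chr(cp))
]
print(extra)

# With re.ASCII, the same names are rejected while plain ASCII still passes.
for name in ("helloworld", "\u0130stanbul", "\u212aelvin"):
    print(ascii(name), bool(unicode_re.match(name)), bool(ascii_re.match(name)))
```

The Python `re` documentation notes that `[A-Z]` combined with `IGNORECASE` matches a small number of non-ASCII letters, including U+0130 and U+212A, and that the `ASCII` flag limits matching to `a`-`z` and `A`-`Z`.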

rmartin16 commented 2 months ago

PEP 508 says PyPI uses this regex, and indeed it appears that it does.

However, `packaging.requirements.Requirement` doesn't use `re.IGNORECASE`, so even if you get such a package onto PyPI, it seems likely that pip or another tool will reject it at some point.
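To illustrate the mismatch described above, here is a rough comparison between the case-insensitive regex and a pattern that spells out both cases instead of relying on `re.IGNORECASE`. `EXPLICIT_RE` and `disagreements` are hypothetical names for this sketch, and `EXPLICIT_RE` only approximates a validator that avoids `re.IGNORECASE`; it is not the actual pattern used by `packaging`.

```python
import re

# Pattern as quoted in this PR, compiled case-insensitively.
IGNORECASE_RE = re.compile(
    r"^([A-Z0-9]|[A-Z0-9][A-Z0-9._-]*[A-Z0-9])$", re.IGNORECASE
)

# Hypothetical stricter variant: both cases are listed explicitly,
# so no Unicode case mapping is involved.
EXPLICIT_RE = re.compile(
    r"^([A-Za-z0-9]|[A-Za-z0-9][A-Za-z0-9._-]*[A-Za-z0-9])$"
)


def disagreements(names):
    """Return the names that one pattern accepts and the other rejects."""
    return [
        name
        for name in names
        if bool(IGNORECASE_RE.match(name)) != bool(EXPLICIT_RE.match(name))
    ]


candidates = [
    "helloworld",
    "Hello-World_1",
    "\u0130stanbul",  # starts with U+0130
    "\u212aelvin",    # starts with U+212A (Kelvin sign)
    "jalape\u00f1o",  # rejected by both patterns
]
print([ascii(name) for name in disagreements(candidates)])
```

Only the two names that begin with U+0130 and U+212A are reported, which is the kind of name that could pass one tool's check and fail another's.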