Closed · jsirois closed this 2 weeks ago
Ok, 94b1e2c was a big addition:
```
:; git diff --stat 94b1e2c03a7fd9c49f479b2885a294e7ac00b519^!
 pex/pep_723.py        | 237 +++++++++++++++++++++++++++++++++++++++++++------
 tests/test_pep_723.py | 240 +++++++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 447 insertions(+), 30 deletions(-)
```
That's the price you pay when it's not normal users but programmers (who presumably know how to debug) who lose their debugging skills when the code is not theirs. I'm obviously grumpy about this state of the world, but I can play along.
I am working my way through this while reading the PEP. Will respond hopefully later today, europe time.
Is that the case you noticed?
Yes. I had written the test for the over-abundant script block case and it was failing to see >1 block.
The regex did already work for the greedy case; i.e. the case where the spec leads into an embedded c-sharp comment example with "Precedence for an ending line # /// is given when the next line is not a valid embedded content line as described above.". It just didn't work for multiple blocks.
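To make that precedence rule concrete, here is a small illustration using the spec's regex (the example text is my own toy input, not from the spec): the first `# ///` below is followed by another valid embedded content line, so it does not end the block; the match runs through to the last `# ///`.

```python
import re

# The PEP 723 block regex.
REGEX = r'(?m)^# /// (?P<type>[a-zA-Z0-9-]+)$\s(?P<content>(^#(| .*)$\s)+)^# ///$'

# Toy input: the first `# ///` is followed by a valid content line
# (`# y = 2`), so greedy matching extends the block to the last `# ///`.
text = (
    "# /// script\n"
    "# x = 1\n"
    "# ///\n"
    "# y = 2\n"
    "# ///\n"
)

match = re.search(REGEX, text)
print(match.group("type"))     # script
print(match.group("content"))  # the embedded `# ///` line is part of the content
```

Here `match.group("content")` comes back as `'# x = 1\n# ///\n# y = 2\n'`; the inner `# ///` is swallowed as content, which is exactly the "greedy case" behavior the spec calls for.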
Thanks for reading through all this @benjyw.
Anyway, I share your grumpiness.
Multiple levels here.
Hey there, thank you for taking the time to implement this! I have two questions:
For 1, given this `example.py`:
```python
import re
from textwrap import dedent

REGEX = r'(?m)^# /// (?P<type>[a-zA-Z0-9-]+)$\s(?P<content>(^#(| .*)$\s)+)^# ///$'


def collect_metadata_blocks(text):
    return list(re.finditer(REGEX, text))


EXAMPLE1 = dedent(
    """\
    # /// script
    # ///
    # /// future
    # ///
    """
)

EXAMPLE2 = dedent(
    """\
    # /// script
    # dependencies = ["ansicolors"]
    # ///
    #
    # /// future
    # who_knows = 42
    # ///
    """
)

example1 = collect_metadata_blocks(EXAMPLE1)
print(example1)
for index, match in enumerate(example1):
    print(index, match.group("type"), match.group("content"))

example2 = collect_metadata_blocks(EXAMPLE2)
print(example2)
for index, match in enumerate(example2):
    print(index, match.group("type"), match.group("content"))
```
You find:
```
:; python example.py
[<re.Match object; span=(0, 38), match='# /// script\n# ///\n# /// future \n# ///'>]
0 script # ///
# /// future

[<re.Match object; span=(0, 88), match='# /// script\n# dependencies = ["ansicolors"]\n# >]
0 script # dependencies = ["ansicolors"]
# ///
#
# /// future
# who_knows = 42
```
So the re is too greedy for practical use, afaict. It forces you to write 2 script blocks in separate comments (a non-`#`, non-`# ...` line is required between them). The parser I hand-rolled always closes a block that has seen a `# ///` when a new block starts, since having a block start inside another block is called out as invalid by the spec. Note that the re happily accepts those here by its definition of a script block.
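For concreteness, here's a rough sketch of that hand-rolled approach (my own reconstruction, not the actual `pex/pep_723.py` code): gather comment lines greedily, but whenever a new `# /// <type>` line appears, close any open block at the last `# ///` already seen.

```python
import re

START_RE = re.compile(r"^# /// (?P<type>[a-zA-Z0-9-]+)$")
CONTENT_RE = re.compile(r"^#(| .*)$")


def collect_blocks(text):
    """Collect (type, content) pairs from PEP 723-style comment blocks.

    Unlike the spec regex, a new `# /// <type>` line always closes any open
    block at the last `# ///` already seen, since a block start inside
    another block is invalid per the spec.
    """
    blocks = []
    block_type = None
    body = []  # raw comment lines gathered since the current block started

    def close():
        nonlocal block_type, body
        if block_type is not None and "# ///" in body:
            # The block's content ends just before the last `# ///` line seen.
            end = len(body) - 1 - body[::-1].index("# ///")
            content = [
                line[2:] if line.startswith("# ") else "" for line in body[:end]
            ]
            blocks.append((block_type, "\n".join(content)))
        block_type = None
        body = []

    for line in text.splitlines():
        start = START_RE.match(line)
        if start:
            close()  # a new block start terminates any open block
            block_type = start.group("type")
        elif block_type is not None:
            if CONTENT_RE.match(line):
                body.append(line)
            else:
                close()  # a non-comment line ends the block
    close()
    return blocks
```

On the two examples above, this finds both blocks in each: `[("script", ""), ("future", "")]` for the adjacent-blocks case, and the `script`/`future` pair with their respective contents for the second.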
For 2, I don't think you were the problem here.
(`>=3.8,<3.11`) code outside either's control for non-deprecated Python versions seems ridiculous and dangerous to me on its face. Perhaps PyPA tried and failed to adopt `toml` or `tomli`/`tomli-w`, and deemed splitting the community further by writing their own under https://github.com/PyPA/ undesirable, or that it was simply too much effort.

Belatedly, I just realized this is an almost useful thing now:
```
#!/usr/bin/env -S pex --exe
# /// script
# ...
# ///
...
```
So, if you know your target environment has a recent enough Pex installed on the path, instead of `scp`'ing a PEX file, you `scp` just the script. Kind of a party trick though, since it's very slow to start up that way.
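As an aside, the shebang trick depends on `env -S` (split-string, available in GNU coreutils 8.30+ and BSD `env`): the kernel passes everything after the interpreter path as a single argument, so a plain `#!/usr/bin/env pex --exe` would make `env` look for a program literally named `pex --exe`. A quick demo of the splitting, with `echo` standing in for `pex`:

```shell
# `env -S` splits its single string argument into separate words before
# exec'ing; here `echo` stands in for `pex` so we can see the resulting argv.
env -S 'echo pex --exe' my_script.py
# prints: pex --exe my_script.py
```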
Pex now reads script metadata present in `--exe`s by default to augment requirements and constrain allowed target Pythons. Pex can be instructed to ignore this script metadata with `--no-pep723`/`--no-script-metadata`.

This apparently makes good on a promise no active Pex maintainer ever made, as immortalized in PEP-723: https://peps.python.org/pep-0723/#tooling-buy-in