proycon / codemetapy

A Python package for generating and working with codemeta
https://codemeta.github.io/
GNU General Public License v3.0

Pyproject-based parser fails on some valid input files #40

Closed apirogov closed 1 year ago

apirogov commented 1 year ago

Here is a valid pyproject.toml on which codemetapy fails, even though it should not:

[tool.poetry]
name = "dummy-project"
version = "0.1.0"
description = ""
authors = ["John Doe <j.doe@example.com>"]
readme = "README.md"
packages = [{include = "dummy_project"}]

include = [
  # having both a string and an object here seems to trigger the problem:
  "CHANGELOG.md",
  { path = "tests", format = "sdist" },
  # ----
]

[tool.poetry.dependencies]
python = "^3.8"

[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"

Stacktrace for codemetapy pyproject.toml:

Passed 1 files/sources but specified 0 input types! Automatically guessing types...
Detected input types: [('pyproject.toml', 'python')]
Note: You did not specify a --baseuri so we will not provide identifiers (IRIs) for your SoftwareSourceCode resources (and others)
Initial URI automatically generated, may be overriden later: file:///pyproject.toml
Processing source #1 of 1
Obtaining python package metadata for: pyproject.toml
Loading metadata from pyproject.toml via pyproject-parser
Failed to process pyproject.toml via pyproject-parser: list index out of range
Fallback: Loading metadata from pyproject.toml via PEP517
Traceback (most recent call last):
  File "/home/a.pirogov/.local/bin/codemetapy", line 8, in <module>
    sys.exit(main())
  File "/local/home/a.pirogov/.local/pipx/venvs/codemetapy/lib/python3.8/site-packages/codemeta/codemeta.py", line 148, in main
    g, res, args, contextgraph = build(**args.__dict__)
  File "/local/home/a.pirogov/.local/pipx/venvs/codemetapy/lib/python3.8/site-packages/codemeta/codemeta.py", line 390, in build
    codemeta.parsers.python.parse_python(newgraph, res, source, crosswalk, args)
  File "/local/home/a.pirogov/.local/pipx/venvs/codemetapy/lib/python3.8/site-packages/codemeta/parsers/python.py", line 152, in parse_python
    packagename = pkg.name
AttributeError: 'PathDistribution' object has no attribute 'name'

proycon commented 1 year ago

Hmm, that's unfortunate indeed. As you already stated, it seems the bug is in the pyproject-parser project rather than in codemetapy itself. I investigated a bit further:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/proycon/work/codemetapy/env/lib/python3.10/site-packages/pyproject_parser/__init__.py", line 176, in load
    config = dom_toml.load(filename)
  File "/home/proycon/work/codemetapy/env/lib/python3.10/site-packages/dom_toml/__init__.py", line 217, in load
    return loads(
  File "/home/proycon/work/codemetapy/env/lib/python3.10/site-packages/dom_toml/__init__.py", line 171, in loads
    return toml.loads(  # type: ignore[return-value]
  File "/home/proycon/work/codemetapy/env/lib/python3.10/site-packages/toml/decoder.py", line 511, in loads
    ret = decoder.load_line(line, currentlevel, multikey,
  File "/home/proycon/work/codemetapy/env/lib/python3.10/site-packages/toml/decoder.py", line 778, in load_line
    value, vtype = self.load_value(pair[1], strictly_valid)
  File "/home/proycon/work/codemetapy/env/lib/python3.10/site-packages/toml/decoder.py", line 880, in load_value
    return (self.load_array(v), "array")
  File "/home/proycon/work/codemetapy/env/lib/python3.10/site-packages/toml/decoder.py", line 1002, in load_array
    a[b] = a[b] + ',' + a[b + 1]
IndexError: list index out of range

It's probably best to submit this issue to them rather than seeking a workaround in codemetapy.

proycon commented 1 year ago

On a closer look, it's even the toml module in the standard library that seems to be the culprit. Is such mixed content really valid TOML? I'd think so too, but I'd be surprised if a module in the standard library is that bugged:

>>> toml.load("pyproject.toml")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/proycon/work/codemetapy/env/lib/python3.10/site-packages/toml/decoder.py", line 134, in load
    return loads(ffile.read(), _dict, decoder)
  File "/home/proycon/work/codemetapy/env/lib/python3.10/site-packages/toml/decoder.py", line 511, in loads
    ret = decoder.load_line(line, currentlevel, multikey,
  File "/home/proycon/work/codemetapy/env/lib/python3.10/site-packages/toml/decoder.py", line 778, in load_line
    value, vtype = self.load_value(pair[1], strictly_valid)
  File "/home/proycon/work/codemetapy/env/lib/python3.10/site-packages/toml/decoder.py", line 880, in load_value
    return (self.load_array(v), "array")
  File "/home/proycon/work/codemetapy/env/lib/python3.10/site-packages/toml/decoder.py", line 1002, in load_array
    a[b] = a[b] + ',' + a[b + 1]
IndexError: list index out of range

proycon commented 1 year ago

I guess the TOML wasn't valid after all. I tried https://www.toml-lint.com/ and it complains as well:

Error on line 11:

{"path":"tests","format":"sdist"} should be of type "String".

Closing this issue.

apirogov commented 1 year ago

Oh, you're apparently right... Interestingly, it works just fine with poetry, so I just assumed the TOML to be fine, but poetry probably uses a parser that is more liberal than the official TOML spec.

apirogov commented 1 year ago

Oh, reading further in the thread: in TOML 1.0 they decided to allow it, and the website says that mixing types is allowed!

https://toml.io/en/v1.0.0#array

I'll open an issue at the parser library.

EDIT: here it is: https://github.com/repo-helper/pyproject-parser/issues/47

proycon commented 1 year ago

Hmm, interesting! That would mean that the implementations are still a bit behind. Hopefully this issue will resolve itself once the TOML implementation in Python catches up. I assume they'll aim to implement all of TOML 1.0.
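
For anyone who wants to check the mixed-type array against a TOML 1.0 parser, here is a minimal sketch. It assumes Python 3.11 or newer, where `tomllib` ships in the standard library, or the third-party `tomli` backport on older versions. Note that the tracebacks above point at the `toml` package under site-packages, which is a separate third-party library and not `tomllib`. The TOML below is a shortened copy of the file from the original report.

```python
try:
    import tomllib  # standard library from Python 3.11 onwards
except ModuleNotFoundError:
    import tomli as tomllib  # third-party backport with the same API

# Shortened version of the pyproject.toml from the report: the include
# array mixes a plain string with an inline table.
PYPROJECT = """
[tool.poetry]
name = "dummy-project"
version = "0.1.0"
include = [
  "CHANGELOG.md",
  { path = "tests", format = "sdist" },
]
"""

data = tomllib.loads(PYPROJECT)
include = data["tool"]["poetry"]["include"]
print(include)
```

If this parses without an exception, the array is accepted by a TOML 1.0 parser, and the remaining failure sits in whichever older parser the toolchain still uses.
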