tweag / FawltyDeps

Python dependency checker

More comprehensive error handling when parsing pyproject.toml #62

Open jherland opened 1 year ago

jherland commented 1 year ago

(I'm singling out pyproject.toml here, not because it is special, but because there is a gap between what we get from the parser (tomllib only ensures valid TOML) and what we can consider a properly validated data structure. With the other formats, the validated data structures are either very simple (e.g. requirements.txt) or the parser has already validated most of the structure for us (ast for setup.py, and for when we're parsing imports). Otherwise, this issue mostly summarizes discussions we've had in and around PR #34.)

We should probably split the code between data validation and data extraction:

In other words:

It is worth researching whether pydantic is a good solution for the validation phase. I suspect it could be, since the data structures we get back from tomllib are in many ways similar to the JSON structures that pydantic is so good at handling.
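To make the proposed split concrete, here is a minimal stdlib-only sketch, not FawltyDeps' actual code. It assumes we only care about `[project].dependencies`. The names `PyprojectValidationError`, `validate_dependencies` and `extract_dependency_names` are hypothetical, and the name parsing is deliberately naive. The validation phase checks only the fields we intend to read, so anything else in the file is ignored whether it is valid or not.

```python
import re
from typing import Any, Dict, List


class PyprojectValidationError(ValueError):
    """Raised when a field we rely on has an unexpected shape (hypothetical name)."""


def validate_dependencies(parsed: Dict[str, Any]) -> List[str]:
    """Validation phase: check only the parts of the document we intend to read.

    `parsed` is the plain dict that a TOML parser such as tomllib returns.
    Only [project].dependencies is checked; everything else is ignored.
    """
    project = parsed.get("project", {})
    if not isinstance(project, dict):
        raise PyprojectValidationError("[project] must be a table")
    deps = project.get("dependencies", [])
    if not isinstance(deps, list):
        raise PyprojectValidationError("project.dependencies must be an array")
    for i, dep in enumerate(deps):
        if not isinstance(dep, str):
            raise PyprojectValidationError(
                f"project.dependencies[{i}] must be a string, "
                f"got {type(dep).__name__}"
            )
    return deps


def extract_dependency_names(deps: List[str]) -> List[str]:
    """Extraction phase: runs on validated data, so it needs no type checks.

    Naive on purpose: cuts each requirement string at the first character
    that cannot be part of a name. Real code would use a proper parser.
    """
    return [
        re.split(r"[^A-Za-z0-9._-]", dep.strip(), maxsplit=1)[0]
        for dep in deps
    ]
```

In real use, `parsed` would come from `tomllib.load(f)` with the file opened in binary mode (Python 3.11+). A pydantic model could replace `validate_dependencies` without touching the extraction function, which is the main point of keeping the two phases apart.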

pawamoy commented 1 year ago

Or you could use pyproject-validate, which validates pyproject.toml against a JSON schema! However, validation would fail if fields that are irrelevant to FawltyDeps are invalid, which might not be what you want.

jherland commented 1 year ago

Exactly, as you say, we're not really interested in validating the entire pyproject.toml (as that is outside our project scope), only the parts that are relevant to us. Looking at what pyproject-validate does is definitely interesting when we get around to addressing this issue, though. Thanks for the link!