pdm-project / pdm

A modern Python package and dependency manager supporting the latest PEP standards
https://pdm-project.org
MIT License

Ignore non-semantic differences (some whitespace etc.) with lock file hash calculation #3184

Open sanmai-NL opened 1 month ago

sanmai-NL commented 1 month ago

Is your feature/enhancement proposal related to a problem? Please describe.

Tools that format TOML files may disagree with PDM on layout. When they do (e.g., after a PDM update), it's annoying that the lock file hash goes out of sync. The same applies to comments.

Describe the solution you'd like

Do a TOML serialization/deserialization round trip before calculating the hash value.

frostming commented 1 month ago

It's not clear what you mean. We already serialize the data to JSON before calculating the hash, so what non-semantic content is there?

https://github.com/pdm-project/pdm/blob/af4267b5d495bffcdbe4c1b5285a865728ac72fe/src/pdm/project/project_file.py#L86

sanmai-NL commented 1 month ago

Ah, so the problem wasn't whitespace or key sorting, sorry!

$ pdm lock -G :all
⏳ Started on 2024-09-25T10:32:42Z
Changes are written to pdm.lock.
  0:00:39 🔒 Lock successful.  
INFO: PDM 2.19.0 is installed, while 2.19.1 is available.
Please run `brew upgrade pdm` to upgrade.
Run `pdm config check_update false` to disable the check.
⌛ Finished on 2024-09-25T10:33:22Z
$ pdm lock -G :all --check
⏳ Started on 2024-09-25T10:34:19Z
⌛ Finished on 2024-09-25T10:34:20Z
$ pdm run toml-sort --in-place *.toml
⏳ Started on 2024-09-25T10:34:51Z
⌛ Finished on 2024-09-25T10:34:52Z
$ pdm lock -G :all --check
⏳ Started on 2024-09-25T10:35:13Z
WARNING: Lockfile hash doesn't match pyproject.toml, packages may be outdated
⌛ Finished on 2024-09-25T10:35:13Z

The diff below shows the kinds of changes toml-sort makes, although it did not make them in this case. I can't reliably reproduce the result with the steps above ...

$ diff pyproject.toml pyproject.toml.bak
⏳ Started on 2024-09-25T10:42:55Z
13c13
<   "mypackage>=2024.9.24.14.17.11",
---
>     "mypackage>=2024.9.24.14.17.11",
77c77
<   "pdm-backend>=2.4.1",
---
>     "pdm-backend>=2.4.1",
80,99c80,99
<   "mypackage>=2024.9.24.14.17.11",
<   "black[jupyter]>=24.8.0",
<   "coverage>=7.6.1",
<   "creosote>=3.0.2",
<   "debugpy>=1.8.6",
<   "isort>=5.13.2",
<   "mdformat-black>=0.1.1",
<   "mdformat-footnote>=0.1.1",
<   "mdformat-frontmatter>=2.0.8",
<   "mdformat-gfm>=0.3.6",
<   "mdformat-tables>=1.0.0",
<   "mdformat>=0.7.17",
<   "mypy>=1.11.2",
<   "pycodestyle>=2.12.1",
<   "pylint>=3.3.1",
<   "python-dotenv>=1.0.1",
<   "rope>=1.13.0",
<   "ruff>=0.6.7",
<   "toml-sort>=0.23.1",
<   "vulture>=2.12",
---
>     "mypackage>=2024.9.24.14.17.11",
>     "black[jupyter]>=24.8.0",
>     "coverage>=7.6.1",
>     "creosote>=3.0.2",
>     "debugpy>=1.8.6",
>     "isort>=5.13.2",
>     "mdformat-black>=0.1.1",
>     "mdformat-footnote>=0.1.1",
>     "mdformat-frontmatter>=2.0.8",
>     "mdformat-gfm>=0.3.6",
>     "mdformat-tables>=1.0.0",
>     "mdformat>=0.7.17",
>     "mypy>=1.11.2",
>     "pycodestyle>=2.12.1",
>     "pylint>=3.3.1",
>     "python-dotenv>=1.0.1",
>     "rope>=1.13.0",
>     "ruff>=0.6.7",
>     "toml-sort>=0.23.1",
>     "vulture>=2.12",
102c102
<   "pytest>=8.3.3",
---
>     "pytest>=8.3.3",

While we're at it, I think the json.dumps call should take the following extra arguments to make the output more compact and more robust, including against non-semantic changes:

allow_nan=False,
ensure_ascii=False,
separators=(',', ':'),
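
A small sketch of what these arguments do, using made-up sample data (the `data` dict and its values are assumptions for illustration, not PDM's real hash input):

```python
import json

# Hypothetical sample data; the non-ASCII character shows the ensure_ascii effect.
data = {"dependencies": ["pytest>=8.3.3"], "name": "café"}

# Default formatting: ", " and ": " separators, non-ASCII escaped as \uXXXX.
default = json.dumps(data, sort_keys=True)

# Proposed formatting: no optional whitespace, non-ASCII kept as-is.
compact = json.dumps(
    data,
    sort_keys=True,
    allow_nan=False,
    ensure_ascii=False,
    separators=(",", ":"),
)

print(default)
print(compact)

# allow_nan=False rejects NaN and infinity, which are not valid JSON.
try:
    json.dumps({"x": float("nan")}, allow_nan=False)
except ValueError as exc:
    print("rejected:", exc)
```

Note that if the hash is computed over the serialized string, changing these arguments would presumably change the hash for existing lock files as well, so it would need some migration or compatibility handling.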