Closed danielballan closed 11 months ago
Both warnings are coming from setuptools_scm
which should not be involved at all here.
$ git grep setuptools
$ git grep scm
wayback/tests/cassettes/test_get_memento_with_mode.yaml: 7w5WeXtQfkKRkGAp0pGfR1RcdwJ/Icvl2yF224gXjAIxvhHbqp1Jscm4q3p14g1BsPXNsf5ULihF
wayback/tests/cassettes/test_get_memento_with_redirect_in_view_mode.yaml: sveDbKr6kPX2ckJ3W5SDscmkXJxys6ze1fbXSulgQJfqptwDsjuv8fX3JDbOZ+dBDk99KXZJwqxG
wayback/tests/cassettes/test_get_memento_with_redirect_in_view_mode.yaml: UgTAFn1wQkiYh3FhdMbW6UFdbsFxaFJscmV5MTeT+cTJdSW5KCVN10boqBiDcX4r5+yqJQqyADkw
I’m new to build
, but is it correct that it’s making a second isolated environment and building a wheel even though it already built an sdist and a wheel?
It looks like the warnings are coming from the second build, and seems strange (without knowing any better) for that second build to even be happening. I noticed that if I just run python -m build --wheel .
or python -m build --sdist
there are no warnings.
Well, this seems to happen for me on a more-or-less blank repo (e.g. what you have if you follow the PyPA tutorial), so I think it may be an intrinsic issue with the current combination of build, hatchling, and setuptools-scm. I’m definitely not familiar enough with the details to diagnose the problem well, but based on skimming the source:
.git_archival.txt
. If one doesn’t work, it keeps going.git_archival.txt
that has real values substituted in, i.e. the version that would be in an actual archive produce with git archive <commit-ish>
.describe
directive, it warns "git archive did not support describe output"
. Then if it looks like the whole file hasn’t been processed at all (i.e. it’s what’s in the git repo rather than in an archive) it warns "unprocessed git archival found (no export subst applied)"
, making the first warning a little redundant and confusing. This appears to be what we’re seeing..git
database or by calling git
or whatever. Possible other evidence that this is normal is that they set pytest to ignore this warning: https://github.com/pypa/setuptools_scm/blob/83d5e1479970a3ea559fe11553e9350ae694e91c/pyproject.toml#L120-L124It seems like the above is how setuptools-scm has behaved since v7. Pinning setuptools-scm==6.4.2
(latest v6.x) in [build-system]
seems to work and not show the error, but if this is expected, it's probably better to live with the error?
Alternatively, maybe build or hatchling is intending to work against an actual git archive instead of the working directory, but is somehow winding up on the actual working directory, or a copy of it instead?
I’m also noticing setuptools_scm
excludes the file in their own project’s MANIFEST.in
: https://github.com/pypa/setuptools_scm/blob/83d5e1479970a3ea559fe11553e9350ae694e91c/MANIFEST.in#L4
Also, they put one that does get included in src/setuptools_scm
. If we do the same, there’s no warning. (Of course, it might not find it where it’s looking in an actual archive then, either? Seems weird.)
Anyway, I’m done for the day. Hopefully some of the above will be useful for you.
I’m new to build, but is it correct that it’s making a second isolated environment and building a wheel even though it already built an sdist and a wheel?
Yes. From python -m build --help
:
By default, a source distribution (sdist) is built from {srcdir}
and a binary distribution (wheel) is built from the sdist.
This is recommended as it will ensure the sdist can be used
to build wheels.
Pass -s/--sdist and/or -w/--wheel to build a specific distribution.
If you do this, the default behavior will be disabled, and all
artifacts will be built from {srcdir} (even if you combine
-w/--wheel with -s/--sdist, the wheel will be built from {srcdir}).
If you are looking at a live git repo, not an archive, it seems like this is probably fine, and is maybe expected operation.
This possibility occurred to me. Maybe this is fine....
CI failures are related to cache key:
error computing cache key: template: cacheKey:1:35: executing "cacheKey" at <checksum "requirements.txt">: error calling checksum: open /home/circleci/wayback/requirements.txt: no such file or directory
and version not being set:
wayback/__init__.py:1: in <module>
from ._version import __version__, __version_tuple__ # noqa: F401
E ImportError: cannot import name '__version__'
=========================== short test summary info ============================
ERROR - ImportError: cannot import name '__version__'
Locally, the version is being set as expected:
>>> import wayback
>>> wayback.__version__
'0.4.5.dev4+ga150311'
I updated the cache key and build script in .circleci/config.yml
, which mostly fixed all the CI failures. The one remaining thing is that check-wheel-contents
is failing on some VCR cassette files that are duplicates:
Checking dist/wayback-0.4.5.dev7+g289355e.d20231203-py3-none-any.whl: PASSED
Checking dist/wayback-0.4.5.dev7+g289355e.d20231203.tar.gz: PASSED
dist/wayback-0.4.5.dev7+g289355e.d20231203-py3-none-any.whl: W002: Wheel contains duplicate files:
wayback/tests/cassettes/test_get_memento_handles_non_utc_datetime.yaml
wayback/tests/cassettes/test_get_memento_with_string_datetime.yaml
I think we didn’t have that error before because we excluded test files with:
[tool.setuptools.packages.find]
where = ["."]
exclude = ["docs", "tests"]
So this either needs the equivalent config for Hatchling or config for check-wheel-contents
to skip the cassette files.
I think it’s probably better to continue not including test files, but I don’t have especially strong feelings about it. Either approach is probably OK if you have strong feelings the other way.
I’m new to build, but is it correct that it’s making a second isolated environment and building a wheel even though it already built an sdist and a wheel?
Yes. From
python -m build --help
:
Yeah, I had read that, and was reacting to the text:
By default, a source distribution (sdist) is built from {srcdir} and a binary distribution (wheel) is built from the sdist.
While the log output made it look like it was building once from the sdist and then again from source:
* Building wheel from sdist
...
* Building wheel...
But I just started digging into the warning on a simplified repo using setuptools instead of hatchling, and setuptools does a lot more logging — I can see now that the * Building wheel...
log line is part of the * Building wheel from sdist
code path, not a second, separate build, which is what I had thought it was before.
That also makes it clearer what’s causing the warning, and why we only see it when doing build .
and not during build --sdist
or build --wheel
:
Build builds an sdist from the working directory. setuptools-scm finds the version from git — no problem, no warnings.
To build the wheel, build unpacks the sdist into a temp directory, and then builds a wheel. When it tiggers setuptools-scm on the unpacked sdist, it’s not in a git checkout anymore (so can’t get info from git) so it looks at .git_archival.txt
. But since the sdist was not packed up by doing git archive
, that file doesn’t have any git metadata in it, and so setuptools-scm logs the warning.
Running python -m build --wheel
doesn’t have this problem, because that builds the wheel from the working git directory instead of from the sdist.
Anyway, this makes sense now, although it is kind of a crappy situation. Do you know if there’s a way to have build (or hatchling?) run the right git machinery to get the right values substituted into .git_archival.txt
when building the sdist? It seems like that’s the right fix here, since you’d theoretically want that file correctly populated in the sdist as if it were an actual git archive.
Or alternatively: maybe the right way to do this is to build with:
# From the root of the repo:
git archive --output ../build_dir/source_archive.tar <tag-to-build>
cd ../build_dir
tar -xf source_archive.tar
rm source_archive.tar
python -m build .
That way you are building the sdist with a properly archived/archive-ready file tree.
Re: “exclude the file from builds,” I tripped over this pyproject.toml
snippet today while reading about something else, and I think it’s what we’d want to use here:
[tool.hatch.build.targets.sdist]
exclude = [".git_archival.txt"]
(The same article also suggests removing test files from wheels but not sdists, which I hadn’t thought of before. Interesting.)
Let me a little digging around some other projects and the behavior of their builds. I wonder if (i) is better, though I agree you have found the right implementation of (ii) if we go that way.
someday I’d like to script the whole release process in CI, but I think that’s out of scope here
Happy to tackle this next.
I usually include the tests. The “web deployment” side of Python, interested in small containers and fast CD, pushes to keep the tests out, while the “science” side of Python, where end users also sometimes dip into light development, almost always bundles them.
The sdist/wheel breakdown is a reasonable compromise.
I am on international travel for work; I left a note to resume this Monday when I am back at my keyboard.
That all makes sense to me!
I am on international travel for work; I left a note to resume this Monday when I am back at my keyboard.
Sounds good. I’m hoping to include this in a v0.5.0 release before the end of the year, but otherwise there’s no rush. (Besides this I want to do some final rate limiting cleanup for #137 and to finally get thread safety by dropping requests, but we’ll see; I may be being to hopeful there.)
someday I’d like to script the whole release process in CI, but I think that’s out of scope here
Happy to tackle this next.
👍
Reading a bit, I agree that excluding .git_archival.txt
from the bdist is the best way to resolve these warnings. Commits added.
Googling the current error in CircleCI, I found a post from one @Mr0grog: https://discuss.circleci.com/t/upgrading-to-the-convenience-image-breaks-everything/42746
Do we need to further adjust our cache key here?
Googling the current error in CircleCI, I found a post from one @Mr0grog: https://discuss.circleci.com/t/upgrading-to-the-convenience-image-breaks-everything/42746
Do we need to further adjust our cache key here?
🤣 Yep, I think that’s probably all you need to do.
Last issue:
dist/wayback-0.4.5.dev12+gcc638d8-py3-none-any.whl: W002: Wheel contains duplicate files:
wayback/tests/cassettes/test_get_memento_handles_non_utc_datetime.yaml
wayback/tests/cassettes/test_get_memento_with_string_datetime.yaml
Indeed these files have the same contents, but I think we should not assume that they always will forever, so the right thing here to make an exemption in the check:
$ md5sum wayback/tests/cassettes/test_get_memento_handles_non_utc_datetime.yaml wayback/tests/cassettes/test_get_memento_with_string_datetime.yaml
1ede1c2e263c5b56731d1c29156f9784 wayback/tests/cassettes/test_get_memento_handles_non_utc_datetime.yaml
1ede1c2e263c5b56731d1c29156f9784 wayback/tests/cassettes/test_get_memento_with_string_datetime.yaml
Huh, I would have thought that would be solved by https://github.com/edgi-govdata-archiving/wayback/commit/7be22d4b6c768195f2b9913267d917bcf1082119. Is that only addressing Python files, or does it need to be wayback/tests/**/*
?
Oh, no, I think it needs to be the wheel target:
[tool.hatch.build.targets.wheel]
Ah, indeed. Pushed.
Oh, I also wanted to ask: what pushed you towards excluding .git_archival.txt
rather than building the sdist from an archive?
I agree that excluding
.git_archival.txt
from the bdist is the best way to resolve these warnings.
Your argument made sense, I just look around a bit at some projects and confirmed that I could not find any examples that could convince me otherwise.
@danielballan would a good quick followup here be moving /wayback
to /src/wayback
?
This is part of the way there. Some warnings needs to be resolved. I haven't seen these before.