Open alexanderkjall opened 2 years ago
We didn't plan to include binary files in crate release to keep them light for the typical consumption—compilation as a dependency. Unfortunately, this conflicts with the dev-integration as a testable crate based on the archives alone. This is the experience across all image crates, and, admitedly, part of the motivation to keep this choice even for rather small test units was to make this problem more apparent.
Based on the png
problems, I did some prototyping on a dev-dependency that would provide the proper semantics–as far as I'm concerned. Can you provide some feedback for this approach? I have little experience with Debian packaging, and didn't get enough.
png
: https://github.com/image-rs/image-png/pull/304I'm glad to provide any additional guidance/dialogue (here, Matrix, Discord or e-mail me).
I like the problem statement of xtest-data, but the debian build system doesn't allow network access during builds.
The base premise is that debian should be buildable from the debian sources, in order to fulfill a number of guarantees
It's maybe possible to solve this in some other way with some trick with an optional dev-dependency that contains the test files, but I don't really know if that is realistic.
I might be able to patch the binary files back in as debian specific patches, it would maybe be a bit more fragile and more work when the package needs to be updated.
[…] but the debian build system doesn't allow network access during builds.
Is it possible to have a shallow git checkout of the repo data? Because the repository URL is merely a first guess, you can customize it to a local repository with CARGO_XTEST_DATA_REPOSITORY_ORIGIN
.
I think the dev-dependency approach could work. Have a second lzw-testcases
crate as a subdirectory of this repository that is included either via a path dependency (for git checkouts) and also via crates.io. I'd expect that Debian already has the infrastructure to handle this sort of setup?
My problem with lzw-testcases
as a crate is that even uploading to crates.io
is rather expensive (to their infrastructure) and feels wrong. Not to mention that they have arbitrary limits that will not scale to the test set we want for image crates:
crates.io currently has a 10MB size limit on the .crate file. (https://doc.rust-lang.org/cargo/reference/publishing.html#packaging-a-crate)
In any case I don't see a major difference packing a commit and packing the archive, however the former can re-use git's architecture for the checksumming that I would prefer to have for superior reproducibility.
The main reason that we base the packaging on the artifacts in crates.io is that those are the published versions, and that gives us more of an guarantee that the code that is packaged is the same code that other people are also using.
Other rust tooling, like the security advisory database (https://rustsec.org/) is also basing it's work on the crates that are distributed from crates.io. and it's nice to be able to reuse that work.
It's possible to package rust code in debian without using all our rust packaging infrastructure, and then it would be possible to package it based on a git repo i guess. But that would leave the package as it's own special snowflake that needs special handling for everything and personally I'm not really willing to increase the workload for the team just to get a couple of more tests working.
I think a better solution might be to add those files as patches to the original package in debian.
@alexanderkjall As a specific suggestion, let's say we export test data as a git-pack-objects
archive. Would such a second file, auxiliary to the .crate
, be acceptable for the Debian package sources?
It's trivial to automatically upload it to Github, as well as associate it with specific version tags in CI. A small bit of work would be to integrate it with the infrastructure of xtest-data
. That is, automatically create the minimal pack-archive, and allow importing from a local pack archive file instead cloning from a git remote/shallow clone. But then you just have one more source file/archive that one can handle fully independent from .crate
data.
We don't really have any support tooling around adding that kind of extra files, other than the patch system that adds patches, so I feel that it might be a bit premature for me to have any opinion on what system would be the easiest one :)
Sorry for not having a clearer answer, but i could come back to this topic once we have some more support infrastructure in place.
I've prototyped that system. Basically, just supplement the usual git fetch
by a precalculated object-pack. Then tests can import that object pack from disk. This mode no longer requires explicit confirmation because it does not access the network in the first place (it's an alternative). See https://github.com/HeroicKatora/xtest-data/pull/2
If you want to test integration in your tooling, grab:
tar xzf $src/xtest-data-1.0.0-beta.2.crate
, cd xtest-data-1.0.0-beta.2
, tar xzf $src/xtest-data.tar.gz
TMPDIR=/tmp CARGO_XTEST_DATA_PACK_OBJECTS="$(pwd)/target/xtest-data" cargo test
This is more of a 'for your information' rather than a bug report about something that is wrong in the project. But I thought it wouldn't hurt to inform upstream about it :)
The unit tests depend on a file named
/benches/binary-8-msb.lzw
that isn't included in the crate uploaded to crates.io.test output:
This means that we have to disable to tests when packaging this crate for debian. Would it be possible to include the
/benches/binary-8-msb.lzw
file in the next release?