james-callahan opened this issue 2 years ago
Thanks for your report.
This sounds reasonable; we’d need to invent a file format.

Looking at the output in https://github.com/containers/skopeo/pull/1459#pullrequestreview-765110575 , a possible alternative might be to add that as a field to the semi-structured log output (probably still opt-in, so as not to pollute the ordinary CLI too much) — and to allow using a truly structured log destination, maybe in the JSON format supported by logrus.

I’m not at all sure tying this with logging is worthwhile; it is an approach with a file format that clearly scales to any new information. OTOH, having a single-purpose `--digestfile` that just contains `repo:tag@digest` values, one per line, would not really be a problem; we can always add one more CLI option and one more file format…

I guess that’s up to the person working on this. Care to prepare a PR?
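To make the single-purpose format concrete, here is a sketch of a consumer for such a newline-delimited digest file, one `repo:tag@digest` reference per line. The file contents and digests below are invented for illustration; nothing here is an existing skopeo output.

```shell
# Hypothetical consumer of the proposed newline-delimited digest file:
# one "repo:tag@digest" reference per line. File contents are invented.
digestfile=$(mktemp)
cat > "$digestfile" <<'EOF'
quay.io/example/app:v1@sha256:1111111111111111111111111111111111111111111111111111111111111111
quay.io/example/app:v2@sha256:2222222222222222222222222222222222222222222222222222222222222222
EOF
while IFS= read -r line; do
    ref=${line%%@*}    # repo:tag part (everything before the first @)
    digest=${line#*@}  # sha256:... part (everything after the first @)
    echo "$ref -> $digest"
done < "$digestfile"
rm -f "$digestfile"
```

A format this simple stays trivially parseable with shell parameter expansion, which is part of its appeal over a structured log destination.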
> This sounds reasonable; we’d need to invent a file format.

I think newline-delimited would be sufficient. In particular, this feature should work with `--dry-run` (#1459).

> I’m not at all sure tying this with logging is worthwhile

Agreed; I don't really see it as a logging thing: `--digestfile` of course exists, and not just as an output log line.
> In particular this feature should work with `--dry-run` (#1459)

Hum, that’s interesting but more borderline. Note that the digest at the source and the destination can be different (e.g. if the destination registry doesn’t support a format, and a conversion is required, or if the user explicitly requires a format conversion). It would be very surprising for `--dry-run` and `--dry-run=false` to both support `--digestfile` but to output different values.

The plausibly possible output of `--digestfile --dry-run`, listing just the source data, can probably be scripted with a few lines of shell along the lines of

```shell
for tag in $(skopeo list-tags $sourceRepo | convert-tags-to-a-shell-array); do
    echo "$tag $(skopeo inspect --format '{{.Digest}}' $sourceRepo:$tag)"
done
```

(maybe not exactly this, and this one is a bit less efficient because it fetches the config as well), so I’m not even quite sure that this operation directly relates to the “sync” concept.

What do you plan to use the digests for?
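One way to fill in the `convert-tags-to-a-shell-array` placeholder above is jq, since `skopeo list-tags` prints JSON with a `Tags` array. In this sketch the registry call is replaced with a canned response so the loop runs offline; the sample JSON shape and the availability of jq are assumptions.

```shell
# Canned stand-in for what `skopeo list-tags docker://docker.io/library/alpine`
# would print: a JSON object with Repository and Tags fields.
list_tags_output='{"Repository": "docker.io/library/alpine", "Tags": ["3.18", "3.19", "latest"]}'
for tag in $(printf '%s\n' "$list_tags_output" | jq -r '.Tags[]'); do
    echo "tag: $tag"
done
```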
> Note that the digest at the source and the destination can be different (e.g. if the destination registry doesn’t support a format, and a conversion is required, or if the user explicitly requires a format conversion). It would be very surprising for `--dry-run` and `--dry-run=false` to both support `--digestfile` but to output different values.

Would that be fixed with `--dest-precompute-digests`?
> The plausibly possible output of `--digestfile --dry-run`, listing just the source data, can probably be scripted with a few lines of shell along the lines of

When using the yaml source type, it's at least a little more advanced than this.
> What do you plan to use the digests for?

We have a list of public images that we mirror to our private repo. Our engineers put in a request to add a specific image `docker.io/someorg/somerepo:sometag@sha256:somedigest`. They then need to know what the new image reference (with checksum) will be: `images.company.com/mirror/docker.io/someorg/somerepo@sha256:somenewchecksum`.

Currently we iterate over the list of images requested by engineers and pass them to `skopeo copy --digestfile`; I'm hoping we can move to the YAML format accepted by `skopeo sync`.
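For reference, the `skopeo sync` YAML source that this comment refers to has roughly this shape (per the skopeo-sync(1) documentation; the registry, repo, and tag here are the placeholder names from this thread):

```yaml
docker.io:
  images:
    someorg/somerepo:
      - sometag
  tls-verify: true
```

An invocation along the lines of `skopeo sync --src yaml --dest docker sync.yml images.company.com/mirror` would then mirror every listed reference, which is exactly where a combined digest output would be wanted.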
> > Note that the digest at the source and the destination can be different (e.g. if the destination registry doesn’t support a format, and a conversion is required, or if the user explicitly requires a format conversion). It would be very surprising for `--dry-run` and `--dry-run=false` to both support `--digestfile` but to output different values.
>
> Would that be fixed with `--dest-precompute-digests`?

That would really involve making a copy and throwing away the result (where “throwing away the result” is a new ~feature), in the worst case, because it needs to read and decompress all data (if the input is schema1), and possibly compress all data (if the input is uncompressed). And the compression algorithms are not, AFAIK, guaranteed to generate the same output on repeated invocations, so it’s just not viable — in the fully general case. It might be possible in some specific situations, but it’s not trivial.
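The point about compression output not being stable can be demonstrated locally: even with identical input, different gzip settings produce different compressed bytes, and therefore different digests. This is a standalone demo, nothing skopeo-specific.

```shell
# Same content, two gzip invocations with different levels: the compressed
# bytes differ (at minimum in the gzip header's XFL byte, per RFC 1952),
# so the sha256 digest of the "same" layer differs too.
workdir=$(mktemp -d)
printf 'identical layer content\n' > "$workdir/layer"
d1=$(gzip -c -1 "$workdir/layer" | sha256sum | cut -d' ' -f1)
d9=$(gzip -c -9 "$workdir/layer" | sha256sum | cut -d' ' -f1)
echo "level 1: sha256:$d1"
echo "level 9: sha256:$d9"
rm -rf "$workdir"
```

This is why a destination digest cannot in general be predicted without actually producing the destination bytes.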
> > The plausibly possible output of `--digestfile --dry-run`, listing just the source data, can probably be scripted with a few lines of shell along the lines of
>
> When using the yaml source type, it's at least a little more advanced than this.

That’s very fair.
> > What do you plan to use the digests for?
>
> We have a list of public images that we mirror to our private repo. Our engineers put in a request to add a specific image `docker.io/someorg/somerepo:sometag@sha256:somedigest`. They then need to know what the new image reference (with checksum) will be: `images.company.com/mirror/docker.io/someorg/somerepo@sha256:somenewchecksum`.

Hum, so this does actually run into the hard case of image conversion (and if so, from what format to what format)? Or is that some other format change, like compression? In particular, would #1378 be interesting?
> Hum, so this does actually run into the hard case of image conversion (and if so, from what format to what format)?

I don't think so? Or at least, I've never thought about it. Originally we did `docker pull && docker tag && docker push`, then moved to `buildah pull && buildah tag && buildah push`, and now `skopeo copy`.
A friendly reminder that this issue had no activity for 30 days.
When using `skopeo sync`, I'd love a way to easily get the digests of the images in their new location, similar to if `skopeo copy` had been repeatedly invoked with `--digestfile` (and collecting them into one file).
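The "collecting them into one file" workflow described here can be sketched as a loop over `skopeo copy --digestfile`. So that the sketch runs offline, `skopeo` is stubbed below with a shell function that writes a fake digest; the stub, file names, and digest value are all invented — in practice the real binary would be used.

```shell
# Stub standing in for the real skopeo binary, purely so this demo runs
# offline: pretend `skopeo copy --digestfile FILE src dst` succeeded and
# wrote the resulting manifest digest to FILE.
skopeo() {
    printf 'sha256:%064d' 1 > "$3"
}

# One requested image per line (placeholder name from this thread).
printf '%s\n' docker.io/someorg/somerepo:sometag > images.txt

: > all-digests.txt
while IFS= read -r image; do
    skopeo copy --digestfile /tmp/digest \
        "docker://$image" "docker://images.company.com/mirror/$image"
    printf '%s@%s\n' "$image" "$(cat /tmp/digest)" >> all-digests.txt
done < images.txt
cat all-digests.txt
```

A built-in equivalent in `skopeo sync` would replace this per-image loop with a single invocation producing one combined digest file.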