host-based image names: 'host' vs. 'authority'

wking commented 6 years ago

@cyphar's distribution-uri ABNF used authority, but I used host in #2 because:

Where they're needed, I expect users to be providing userinfo and port information via other channels, especially since docker etc. currently use : to delimit tags.

In #5 (here) and #7 (here), @xiekeyang has wanted support for localhost:8080/…. I understand that is useful for testing, but don't think it should be part of the host-based image name spec.

The .well-known URL lookup functions over both HTTP and HTTPS (e.g. see here). But if you are extracting the port from the image name, there's no way to know which protocol to use except by attempting both on that port. Trying to connect to a server using the wrong TLS protocol and finding out that you picked the wrong protocol (which is what would happen if we tried HTTPS and HTTP on localhost:8080) seems like a bigger issue than trying to connect to a server on the usual port for the protocol and finding out that the server is not listening on that port at all (which is what happens in our current protocol polling).
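The protocol polling described above can be sketched roughly as follows. This is only an illustration, not this repo's implementation: the function names are hypothetical, the well-known path is a placeholder assumption, and `fetch` is a caller-supplied callable (assumed to raise `OSError` on connection failure) rather than a real HTTP client.

```python
def discovery_urls(host, well_known_path="/.well-known/oci-host-ref-engines"):
    """Return the URLs to try, in order, for a port-free host.

    The default well_known_path is a placeholder assumption.
    """
    if ":" in host:
        # With an explicit port there is no "usual port" to tell HTTPS from HTTP.
        raise ValueError("explicit port leaves the protocol ambiguous: {!r}".format(host))
    return ["{}://{}{}".format(scheme, host, well_known_path)
            for scheme in ("https", "http")]


def discover(host, fetch):
    """Poll HTTPS first, then HTTP, each on its usual port.

    `fetch` is a hypothetical callable that returns a parsed document
    or raises OSError (e.g. ConnectionRefusedError).
    """
    for url in discovery_urls(host):
        try:
            return url, fetch(url)
        except OSError:
            continue  # nothing listening on that port; try the next protocol
    raise LookupError("no ref-engine discovery document for {!r}".format(host))
```

Because each protocol is only tried on its own default port, a failure here means "nothing is listening", never "something is listening but speaks the other protocol".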

And I see no value to being able to specify explicit ports in production. Do we really expect production users to enter example.com:8080/… image names? Why would the example.com admins not be serving those images from 443 or 80?
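To make the 'host' vs. 'authority' distinction concrete, here is a minimal sketch of a parser that accepts a bare host but rejects userinfo and ports. The grammar (host, then "/", then path, then an optional "#fragment"), the regular expression, and the returned field names are simplifying assumptions for illustration, not the spec's ABNF.

```python
import re

# Assumed, simplified host pattern: letters, digits, dots and hyphens only.
# A ":" (port) or "@" (userinfo) can never match, which is the point here.
_HOST = re.compile(r"^[A-Za-z0-9]([A-Za-z0-9.-]*[A-Za-z0-9])?$")


def parse_host_based_image_name(name):
    """Split an image name into host, path, and fragment fields."""
    name, _, fragment = name.partition("#")
    host, slash, path = name.partition("/")
    if not slash or not path:
        raise ValueError("expected HOST/PATH, got {!r}".format(name))
    if not _HOST.match(host):
        raise ValueError(
            "host must not carry userinfo or a port: {!r}".format(host))
    return {"host": host, "path": path, "fragment": fragment}
```

Under these assumptions `example.com/app#1.0` parses cleanly, while `localhost:8080/app` and `example.com:8080/app` are rejected before any network lookup happens.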

There is a benefit for testing, because you don't need root on your host box to test if you can point the ref-engine discovery client at a high port. But I think we should find a way to work around that in testing (some ideas here, although we don't have anything trivial yet), instead of relaxing the production host-based image name spec.

xiekeyang commented 6 years ago

With #11 addressed, the {HOST}:{PORT} template can be supported, which I think is valuable, especially on private systems. E.g. some teams have their own HUB, possibly with a different image format, under one company's cloud platform. All distribution or discovery system hosts just use {IP}:{PORT}. The discovery implementation should allow any person to discover images on these hosts. So the name rules should be less restrictive.

wking commented 6 years ago

On Tue, Sep 12, 2017 at 06:26:14AM +0000, xiekeyang wrote:

E.g. some teams have their own HUB, possibly with a different image format, under one company's cloud platform. All distribution or discovery system hosts just use {IP}:{PORT}.

The OCI Ref-engine Discovery protocol is just one of many possible approaches 1. And folks who use the protocol to discover ref-engines for other hosts SHOULD be using a different path for their ref-engines resource 2. So I'd rather address this problem by having the user use a different ref-engine discovery protocol, although they can certainly choose a related one if they want.

I see two flavors of the private-team-hub usecase:

a. All images should be resolved via the private hub.
b. Some subset of images should be resolved via the private hub, but others should be resolved via the global X.509/DNS delegation 3.

In case (a), the easiest approach would be to skip ref-engine discovery completely, and just distribute the ref-engine(s) directly. For example, instead of saying “configure your resolver to use OCI Ref-Engine Discovery at http://192.168.1.2:8080” in your onboarding email, say “configure your resolver to use OCI Index Template at http://192.168.1.2:8080/{host}/{path}”.

In case (b), either pick an image-name prefix that you're ok clobbering or use a non-host prefix. For example, clobbering the empty-string or $TEAM_NAME hosts are both unlikely to severely restrict your access to the global X.509/DNS namespace. Then configure your resolver to skip straight to your internal ref-engine(s) directly, with an onboarding email like “configure your resolver to use OCI Index Template at http://192.168.1.2:8080/{host}/{path} for images with the ‘cool-widget’ host”.
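A resolver configured this way might look something like the sketch below. The `OVERRIDES` mapping and `index_url` function are hypothetical names, and plain `str.format` substitution stands in for real URI Template expansion (which would also handle percent-encoding).

```python
# Hypothetical per-host configuration taken from the onboarding email:
# images under these hosts skip ref-engine discovery entirely.
OVERRIDES = {
    "cool-widget": "http://192.168.1.2:8080/{host}/{path}",
}


def index_url(host, path, overrides=OVERRIDES):
    """Return the index URL for an overridden host.

    Returns None when the host is not overridden, so the caller can fall
    back to ordinary ref-engine discovery for the global namespace.
    """
    template = overrides.get(host)
    if template is None:
        return None
    # Simplified expansion: substitutes the fields verbatim, no escaping.
    return template.format(host=host, path=path)
```

With this mapping, `index_url("cool-widget", "app")` goes straight to the internal ref-engine, and any other host returns `None` and is left to global discovery.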

The only thing you lose by pushing those straight through to ref engines is that, if you decide to change your ref-engine, you'll need the team to reconfigure their resolvers. But I don't expect ref-engine changes to be frequent. If you are concerned, you can always refer them to a ref-engine discovery service instead: “configure your resolver to request application/vnd.oci.ref-engines.v1+json from http://192.168.1.2:8080/ref-engine-discovery [for images with the ‘cool-widget’ host]”.

The discovery implementation should allow any person to discover images on these hosts. So the name rules should be less restrictive.

Are you saying that you want folks outside the team to be able to resolve images via the team's private hub? I don't see a reason to support that globally; global images should be resolved via a global ref-engine discovery service (OCI Ref-engine Discovery or otherwise). But maybe you want a way to share images internally between two departments (dev, testing, and ops?). I'd support that via the same approach I recommend above, where you tell team members how to configure their tools directly, e.g. telling a dev member to use OCI Index Template at http://192.168.1.2:8080/{host}/{path} for resolution but to publish to the OCI Index Template at http://192.168.10.2:8000/{host}/{path} when they want to push it to testing (although these specs don't cover publishing, you'd need to give them more specific information like 4).

All of those cases are easy to support with the extra information distributed to team members (during the on-boarding process, and then again if you make incompatible changes to your setup). Do we have a need to support those use cases directly with host-based image names 3? Host-based images are useful because the X.509/DNS infrastructure delegates ownership for them 5. debian.org, docker.com, etc. can all be traced back to real-world owners who are using OCI Ref-engine Discovery and other protocols to declare the ref-engines they prefer for resolving images that the host-based image name spec maps to their domains. Supporting explicit ports in the host-based image name spec gives domain controllers a chance to have multiple, simultaneous opinions, and there's no clear semantic distinction about the meaning of different opinions between debian.org:443 and debian.org:8080, etc. Of course, the current spec also supports nonsense subdomains, so you could have debian.org/… and 8080.debian.org/… image namespaces with the same ambiguity, even without explicit ports. The host-based image name spec doesn't jump through hoops to try and block that sort of thing (and I'm not sure how it could if it wanted to), but I don't see a point to relaxing the spec to support explicit ports or other namespacing that cannot express semantic differences. I don't see a use for that sort of thing in the global namespace.

cyphar commented 6 years ago

Just to be clear, the reason why I have stopped commenting on this project is because it's getting quite tiring and it looks like there's no interest to just use parcel (with some updates which I've mentioned to you both before). But I will comment on this point:

Where they're needed, I expect users to be providing userinfo and port information via other channels, especially since docker etc. currently use : to delimit tags.

: based tags were a complete mistake. Let's not repeat the mistakes of Docker. We should be using sane URI tagging (namely #fragments). This is why I argued for allowing fragment usage in the OCI spec.

wking commented 6 years ago

... it looks like there's no interest to just use parcel (with some updates which I've mentioned to you both before)...

I'm fine with using parcel. Once you get the discussed updates in, I think it will look pretty close to this.

We should be using sane URI tagging (namely #fragments). This is why I argued for allowing fragment usage in the OCI spec.

Absolutely agree, see here. But lots of folks are used to them, so low-colon image names may help them adjust.

And I still don't like explicit ports in host-based image names for the other reasons given in this issue.

cyphar commented 6 years ago

I'm fine with using parcel. Once you get the discussed updates in, I think it will look pretty close to this.

Cool, that's not the impression I was getting :smile_cat:. I am going to work on those updates for the next two weeks, with the eventual plan of merging parcel into the umoci project (with a separate binary for fetching of course, but it'll be using the umoci libraries).

wking commented 6 years ago

...with the eventual plan of merging parcel into the umoci project (with a separate binary for fetching of course, but it'll be using the umoci libraries).

I think we want a discovery spec like the ref-engine discovery spec, the OCI index template protocol, and the ref engine registry belong in either image-spec or a new discovery-focused OCI spec project. I think the OCI CAS template protocol and CAS engine registry belong in image-spec alongside its current urls. I don't see anything except alternative ref/CAS protocols and implementations ending up outside the OCI in umoci and such. But I'm not a maintainer for any of this, and I've been wrong before ;).

cyphar commented 6 years ago

I think that the spec should be in the OCI, but as I've said previously, the process to follow before we propose it as an OCI spec is:

1. Write a draft spec document.
2. Implement said draft.
3. Use the implementation to make sure that it's sane.
4. Accept feedback from the community.
5. Propose it to the OCI.

I have started on 1 and 2. I will be doing 3 very soon once I've incorporated some of the updates we've discussed. 4 is going to happen after that quite naturally (I've talked to folks from quite a few interested parties that want to contribute and I've told them to hold off for a few weeks).

wking commented 6 years ago

I have started on 1 and 2...

And this repo is already through them for Python, with WIP on Go. Any differences between it and parcel's adjusted spec will show up in the implementations, which will help inform community feedback in both repos, which will in turn lead to repo changes. Eventually the TOB will like a discovery spec enough to put it under the OCI umbrella.

cyphar commented 6 years ago

My point is that "just push to OCI" (which is what it sounded like you were suggesting) doesn't make sense as a plan. We cannot skip (3) and (4). I will be doing (3) on both the openSUSE and SUSE side (as well as some other folks that have said they're interested in using parcel once I've finished reworking the spec). If the draft spec and implementation live in umoci I don't see why that is an issue for pushing to OCI (in fact it's a good thing because it would give more publicity and be easier for people to use the implementation).

wking commented 6 years ago

My point is that "just push to OCI" (which is what it sounded like you were suggesting) doesn't make sense as a plan. We cannot skip (3) and (4).

runtime-spec had two years of review and feedback under the OCI before cutting 1.0.0. I don't see why discovery needs to be fully baked before becoming an OCI Project (a new one, or part of image-spec).

If the draft spec and implementation live in umoci I don't see why that is an issue for pushing to OCI (in fact it's a good thing because it would give more publicity and be easier for people to use the implementation).

It's nothing insurmountable, but I'd like to see multiple implementations and a stand-alone spec, as we have for other OCI specs. Mixing it in with everything else that umoci does makes those separations less obvious.

cyphar commented 6 years ago

runtime-spec had two years of review and feedback under the OCI before cutting 1.0.0.

And it was based on other implementations and specifications that had already been proved to be working. The same applies for image-spec. The biggest issue in my mind with just pushing it to OCI is that you have to deal with a lot of disagreements and so on, it's much simpler to write a specification that works and show it works before you submit it for further improvements.

That's how most specs actually end up being developed. Nobody sits down and says "let's design a protocol by first asking everyone to start talking at once", because that way nothing gets done. (Actually some specs are like that, and I think their general (lack of) quality and sanity justifies my skepticism in trying to get an unfinished and untested spec into a spec body).

It's nothing insurmountable, but I'd like to see multiple implementations and a stand-alone spec, as we have for other OCI specs.

And then someone who wants to use the spec has to download at least 3 different projects just to even begin playing with it. I prefer the rkt model of spec development, you write a spec document as part of an umbrella project and then you can spin it out into a self-contained project after it's incubated for a while.

In any case, I'm probably going to end up doing that anyway and we'll see where it takes us. I'll probably just link from cyphar/parcel to openSUSE/umoci (or maybe host the spec document in that project).

wking commented 6 years ago

The biggest issue in my mind with just pushing it to OCI is that you have to deal with a lot of disagreements and so on, it's much simpler to write a specification that works and show it works before you submit it for further improvements.

This is "consensus building". I agree that it's difficult, but feel that it's worth doing, and is easier the earlier you start, because there's not so much mass to block improvement.

And then someone who wants to use the spec has to download at least 3 different projects just to even begin playing with it.

If they want to alter the whole stack. But this repo uses several very-weakly-coupled micro specs to separate concerns. If you want to use a different host-based image name approach, you can edit host-based-image-names.md and optionally oci_discover/host_based_image_names, and as long as your new approach produces fields with the same names, leave the other specs and implementations in this repo alone.

In any case, I'm probably going to end up doing that anyway and we'll see where it takes us.

Ok. Let me know if/when you're ready for issues and PRs.

cyphar commented 6 years ago

The problem isn't that it's difficult, it's that in my experience it usually results in far lower-quality specifications. Also I was talking about a user, not someone trying to actually change the specification. I want users to be able to use this from very early on (the whole point of a fediverse is that users can actually understand how it works and how to use it).

But yes, give me a few weeks then I'll ping all of the relevant folks for comments.

wking commented 6 years ago

Also I was talking about a user, not someone trying to actually change the specification.

This should be a non-issue, with Go/Python/... package managers automatically pulling in any ancestor dependencies.

cyphar commented 6 years ago

That's not really how packaging works. You wouldn't make umoci require parcel, skopeo, runc, etc. It's already confusing enough to explain to people that you need umoci, skopeo, and runc in order to do anything useful with an OCI image.

wking commented 6 years ago

On Mon, Sep 18, 2017 at 01:28:50PM -0700, Aleksa Sarai wrote:

That's not really how packaging works.

That's absolutely how packaging works. For example, see umoci pulling in go-digest 1 and urfave/cli 2. In Python, you can automatically pull in dependencies at install-time by declaring install_requires 3.

It's already confusing enough to explain to people that you need umoci, skopeo, and runc in order to do anything useful with an OCI image.

These are related command-line tools; they're not library dependencies. You could provide some assistance in discovering them using a generic package manager (e.g. “Suggests” in a Debian control.tar 4). But the point of the modular OCI ecosystem is that it's pluggable. Want to use a runtime other than runc? Swap it in and continue to use your original image-manipulation / distribution stack. Or vice versa, swap out your image-manipulation stack, and keep using your runtime and distribution stacks. Etc., etc.

Folks who want one-stop shopping can use a wrapper that combines several of these modular components into a single tool.

But that's not a library-level dependency issue. We already have lots of library-level dependencies that are not causing problems, because Go, Python, and many other languages provide language-specific package management tools to automatically pull in any third party dependencies as part of package installation when the user asks for it. And obviously language-agnostic package managers can handle this sort of thing for you as well.

cyphar commented 6 years ago

None of that message appears related to what I was saying.

wking commented 6 years ago

On Tue, Sep 19, 2017 at 12:56:26AM +0000, Aleksa Sarai wrote:

None of that message appears related to what I was saying.

Let me attempt to summarize how I got there.

In 1, you seem to suggest moving the current cyphar/parcel content (“draft spec and implementation”) into umoci.

In 2, I pushed back, suggesting a stand-alone spec as a way to get clearer separation between the spec and implementations.

In 3, you reiterated your preference for a combined spec/implementation repo, citing rkt. And you claimed that someone would have to download at least three different projects to begin playing with it.

In 4, I pushed back against the three-project-download example using an “edit the host-based image names” example.

In 5, you explained you were talking about end users, not spec-editors.

In 6, I claimed language-specific package managers would handle dependency installation without trouble.

In 7, you pushed back, suggesting that having umoci require parcel and other projects would be confusing.

In 8, I pushed back again, pointing out that umoci already has library dependencies on go-digest and urfave/cli which work out fine, and suggesting that an additional library dependency on parcel would also work out fine.

So that's why I believe 8 is related to what you were saying.

And while we're being careful about wording, I do agree that if you make parcel a library dependency of umoci, umoci users will have to download parcel. I'm just saying that ‘go get …’, ‘godep …’, etc. will do that for them automatically, without you having to explicitly say something like “umoci depends on parcel, so take these extra steps…” in your umoci docs (just as you currently say nothing about the go-digest and urfave/cli dependencies).

And you're making other points for a single spec-and-implementation repo:

Higher visibility by riding on umoci's coattails 9.
Poor previous experience with design-by-committee 10.

Both of which are independent of the presence (you seem to feel) or absence (I feel) of technical difficulties in developing orthogonal libraries in separate repos. Visibility is good, although I don't think it's worth fuzzing separation of concerns. And I'm comfortable with design-by-committee, although obviously the success will depend on the committee ;). But I'm not going to tell you what kind of experience you've had. So if you're comfortable dumping it all into the umoci repo, then more power to you. As I said, it's nothing insurmountable 2.

cyphar commented 6 years ago

I think you missed this, which explains why you're talking about fairly unrelated topics.

(with a separate binary for fetching of course, but it'll be using the umoci libraries).

wking commented 6 years ago

I think you missed this, which explains why you're talking about fairly unrelated topics.

(with a separate binary for fetching of course, but it'll be using the umoci libraries).

So maybe I got the direction wrong, and was expecting the library code currently in umoci would consume the library code currently in parcel, while you're planning on having the library code currently in parcel consume the library code currently in umoci?

Or you may be saying that the library code currently in umoci and parcel is going to be so intertwined that it will be only worth thinking about as one library?

Either way, I can just wait and see what you do.

xiekeyang / oci-discovery

host-based image names: 'host' vs. 'authority' #9