oras-project / oras-go

ORAS Go library
https://oras.land
Apache License 2.0
179 stars 97 forks source link

Proper naming for UpEdges/DownEdges methods #137

Closed Wwwsylvia closed 2 years ago

Wwwsylvia commented 2 years ago

Currently, we have an UpEdges method which returns the nodes directly pointing to the current node, and a DownEdges method which returns the nodes directly pointed by the current node. However, the UpEdges / DownEdges names may not be appropriate, and we are considering using other names.

The current candidates are:

Let's brainstorm for more names and do a vote!

Wwwsylvia commented 2 years ago

As we all know, naming is hard! 😋

sajayantony commented 2 years ago

Maybe @jdolitsky might have some nice ideas here. Things that get attached to an image, refer to the image manifest. So these are referrers of the manifest and the API in ORAS/OCI working group are all aligning with that so that sounds reasonable.

Looking from the attachment's point of view which is the source node, the image or blob is a sink node. I hope I am not wrong here - https://cseweb.ucsd.edu//~dasgupta/papers/poly.pdf

Given that these are polytrees it is ok to maybe call these are children since the Artifact points to its own blobs and a manifest which are all children of the source node here.

I'm actually thinking that we replace DownEdges ->

I considered descriptors but that makes things a bit harder when you write something like manifest.Descriptors since the manifest has a descriptor and just having a plural should not be the only differentiator.

Things get a bit harder for upEdges - If we extend this thinking a bit more, we could consider querying a blob to find all its referrers and if it's a common layer then all manifests that points to this blob (sink nodes) is a valid scenario. So with that is mind I'm leaning towards calling upEdges as Referrers since asking a descriptor for all elements that refers to it is what we are doing. Hope that helps with this. Either way renaming this before cutting a tag is what I am hoping for.

Wwwsylvia commented 2 years ago

Thanks @sajayantony for the insights. I've created a poll fort this - See #144 .

Wwwsylvia commented 2 years ago

I'm now actually prefer Parents / Children because it is at least easy to understand for users. 😄

sajayantony commented 2 years ago

The issue I have with the term parent is that the referrers are not parents of the manifest. It is like saying the manifest is the parent of the blob, which it isn't. Also the signature is not a parent of the image. The image needs to exist before the signature and might seem confusing. Again my vote would for something like

func Referrers(ref string) ([] descriptors) // query all manifests that are linked to the passed in reference. func Children(ref string) return some array of descriptors with type information. // Query all blobs is what I am understanding.

Currently does DownEdges return all blobs including the config blob. How do you differentiate this? Is it by ordinality?

shizhMSFT commented 2 years ago

Currently does DownEdges return all blobs including the config blob. How do you differentiate this? Is it by ordinality?

Yes, the returned blobs are ordered. The first one is always config if config is not optional for that particular manifest type. Also, the layers remain the original order.

Wwwsylvia commented 2 years ago

Another option is Predecessors/Successors or DirectPredecessors/DirectSuccessors, which are the academic terms for "up edges" and "down edges" of a node in a directed graph. It's mentioned in many books such as Understand Mathematics, Understand Computing. image

Wwwsylvia commented 2 years ago

It is like saying the manifest is the parent of the blob, which it isn't. Also the signature is not a parent of the image.

Agreed. But is the blob a child of the manifest, when the manifest is not the parent of the blob?

I considered descriptors but that makes things a bit harder when you write something like manifest.Descriptors since the manifest has a descriptor and just having a plural should not be the only differentiator.

Combining References and Descriptors, how about ReferencedDescriptors or DescriptorReferences ? So that we have Referrers/DescriptorReferences. @sajayantony

sajayantony commented 2 years ago

For the last exercise of API design on another SDK we prototyped how to calls would look in a small snippet and evaluated what aligns Best with the usage of the API. I think if we did a similar exercise it might help us come to a decision faster.

Would you be able to share how you would call These APIs with some contextual code around the calling method?

Wwwsylvia commented 2 years ago

Actually oras-go v2 provides two sets of interfaces based on different perspectives.

Note that the remote repository target provides both UpEdges method and Referrers method. (See https://github.com/oras-project/oras-go/blob/main/registry/remote/repository.go#L323-L356)

The UpEdges and Referrers methods both can be used for implementing oras discover. For DownEdges, it is used in the Copy function, but it may not be directly used by clients.

Wwwsylvia commented 2 years ago

Here are some example snippets of how to use the Referrers and the Predecessors (previously UpEdges) methods for oras discover:

Discover artifact referrers against a remote repository

func main() {
    // remote repository discover
    ref := "myregistry.azurecr.io/foobar:latest"
    artifactType := "application/vnd.oci.image.manifest.v1+json"
    repo, err := remote.NewRepository(ref)
    checkError(err)

    ctx := context.Background()
    manifestDesc, referrers, err := discoverArtifacts(ctx, repo, ref, artifactType)
    fmt.Println("manifest digest:", manifestDesc.Digest)
    for _, r := range referrers {
        fmt.Println("referrer digest:", r.Digest.String())
    }
}

func discoverArtifacts(ctx context.Context, repo *remote.Repository, ref, artifactType string) (ocispec.Descriptor, []artifactspec.Descriptor, error) {
    desc, err := repo.Resolve(ctx, ref)
    if err != nil {
        return ocispec.Descriptor{}, nil, err
    }

    var res []artifactspec.Descriptor
    if err := repo.Referrers(ctx, desc, func(referrers []artifactspec.Descriptor) error {
        for _, referrer := range referrers {
            if artifactType == "" || referrer.ArtifactType == artifactType {
                res = append(res, referrer)
            }
        }
        return nil
    }); err != nil {
        return ocispec.Descriptor{}, nil, err
    }
    return desc, res, nil
}

Discover referrers against an OCI store

func main() {
    // cached directory discover
    ref := "latest"
    rootDir := "foobar"
    cache, err := oci.New(rootDir)
    checkError(err)

    ctx := context.Background()
    manifestDesc, referrers, err := discover(ctx, cache, ref)
    fmt.Println("manifest digest:", manifestDesc.Digest)
    for _, r := range referrers {
        fmt.Println("referrer digest:", r.Digest.String())
    }
}

func discover(ctx context.Context, target oras.GraphTarget, ref string) (ocispec.Descriptor, []ocispec.Descriptor, error) {
    desc, err := target.Resolve(ctx, ref)
    if err != nil {
        return ocispec.Descriptor{}, nil, err
    }

    predecessors, err := target.Predecessors(ctx, desc)
    if err != nil {
        return ocispec.Descriptor{}, nil, err
    }

    return desc, predecessors, nil
}

The discover function in the second example is a general one, it's not limited to manifests. Given a layer blob descriptor, it will return the manifests referencing it.

sajayantony commented 2 years ago

Just distilling it down to compare the two, I vote for Referrers.

A blob can have the manifest as its referrer similar to a signature or artifact can referrer another manifest or blob.

Secondly - I also think a manifest can have a multi-arch manifest be its referrer.

Discover Referrers against a remote repository

ref := "myregistry.azurecr.io/foobar:latest"
repo, err := remote.NewRepository(ref) //checkerror     
desc, err := repo.Resolve(ctx, ref) //checkerror 

if err := repo.Referrers(ctx, desc, func(referrers []artifactspec.Descriptor) error {
    for _, referrer := range referrers {
        fmt.Println(referrer.Digest)
    } //checkerror 
});

Find predecessors against an OCI store


target, err := oci.New("/tmp/dataroot") //checkerror
desc, err := target.Resolve(ctx, "latest") //checkerror 
if err != nil {
    return err
}

predecessors, err := target.Predecessors(ctx, desc)
Wwwsylvia commented 2 years ago

@sajayantony Are you OK with renaming UpEdges to Predecessors?

sajayantony commented 2 years ago

Looking at that sample - nope. I honestly prefer referrers.

sajayantony commented 2 years ago

Had a coversation with @shizhMSFT on this. And he feels strongly that we are implementing a graph here and so Successors means all nodes pointed to from the current node and predecessors means all nodes that are pointing to the current node.

If we use that then we should give an explanation in the functions and interfaces of what an example successor and example predecessor would be.

sajayantony commented 2 years ago

I'm hoping this issue makes it in before releasing a 2.0 alpha of this package.

Wwwsylvia commented 2 years ago

I'm hoping this issue makes it in before releasing a 2.0 alpha of this package.

@sajayantony Sure, I will do this once #150 is merged.