oras-project / oras-go

ORAS Go library
https://oras.land
Apache License 2.0
172 stars 94 forks source link

Manifest should not be added to `index.json` #664

Closed eiffel-fl closed 7 months ago

eiffel-fl commented 8 months ago

Hi!

Manifests are added to index.json:

{
  "schemaVersion": 2,
  "manifests": [
    {
      "mediaType": "application/vnd.oci.image.index.v1+json",
      "digest": "sha256:b0eff7f0ec2a27b8d71fe904cf373fa2002c6d669db54c25a9542cf2cb5f46d1",
      "size": 491,
      "annotations": {
        "org.opencontainers.image.ref.name": "docker.io/library/execs:latest"
      }
    },
    {
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "digest": "sha256:662ca58536d5a03864781eb42870f6712927981383f1dab1449d9adf80067313",
      "size": 581,
      "platform": {
        "architecture": "amd64",
        "os": "linux"
      }
    },
    {
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "digest": "sha256:f61eef5ff800b2e4366c6d2e7311e95ce8dc3cff5c020a3a810429656fd93efb",
      "size": 581,
      "platform": {
        "architecture": "arm64",
        "os": "linux"
      }
    }
  ]
}

This behavior can create troubles, as when the manifests in index.json will not be automatically deleted when the surrounding index is deleted:

francis@pwmachine:~/Codes/kinvolk/inspektor-gadget$ sudo ls /var/lib/ig/oci-store/blobs/sha256                                 main % u=
225a8e697e260684d9451ccf8a622cd5548b0ae5fac6e13c416cd1d85247b7b7  b4be31f07444fb9be6dee8cd1349aab3c780d333b63406be10c5c14c1b39dd37
662ca58536d5a03864781eb42870f6712927981383f1dab1449d9adf80067313  c06a86dad7e582686b53bec9c729b92fc56c838b59cc11d0105c2c20f3158dcd
b0eff7f0ec2a27b8d71fe904cf373fa2002c6d669db54c25a9542cf2cb5f46d1  f61eef5ff800b2e4366c6d2e7311e95ce8dc3cff5c020a3a810429656fd93efb
francis@pwmachine:~/Codes/kinvolk/inspektor-gadget$ sudo -E ./ig image remove execs                                            main % u=
INFO[0000] Experimental features enabled                
Successfully removed execs
francis@pwmachine:~/Codes/kinvolk/inspektor-gadget$ sudo ls /var/lib/ig/oci-store/blobs/sha256                                 main % u=
225a8e697e260684d9451ccf8a622cd5548b0ae5fac6e13c416cd1d85247b7b7  c06a86dad7e582686b53bec9c729b92fc56c838b59cc11d0105c2c20f3158dcd
662ca58536d5a03864781eb42870f6712927981383f1dab1449d9adf80067313  f61eef5ff800b2e4366c6d2e7311e95ce8dc3cff5c020a3a810429656fd93efb
b4be31f07444fb9be6dee8cd1349aab3c780d333b63406be10c5c14c1b39dd37
francis@pwmachine:~/Codes/kinvolk/inspektor-gadget$ sudo cat /var/lib/ig/oci-store/index.json | jq                             main % u=
{
  "schemaVersion": 2,
  "manifests": [
    {
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "digest": "sha256:f61eef5ff800b2e4366c6d2e7311e95ce8dc3cff5c020a3a810429656fd93efb",
      "size": 581,
      "platform": {
        "architecture": "arm64",
        "os": "linux"
      }
    },
    {
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "digest": "sha256:662ca58536d5a03864781eb42870f6712927981383f1dab1449d9adf80067313",
      "size": 581,
      "platform": {
        "architecture": "amd64",
        "os": "linux"
      }
    }
  ]
}

Sadly, this behavior is not straightforward to modify. So, I open this issue to discuss a proper design.

Best regards.

eiffel-fl commented 8 months ago

Hi!

I took a look at this and was able to avoid storing the manifests in index.json with the following patch:

diff --git a/content/oci/oci.go b/content/oci/oci.go
index 5c584f8..997543c 100644
--- a/content/oci/oci.go
+++ b/content/oci/oci.go
@@ -143,10 +143,6 @@ func (s *Store) Push(ctx context.Context, expected ocispec.Descriptor, reader io
        if err := s.graph.Index(ctx, s.storage, expected); err != nil {
                return err
        }
-       if descriptor.IsManifest(expected) {
-               // tag by digest
-               return s.tag(ctx, expected, expected.Digest.String())
-       }
        return nil
 }
$ sudo -E ./cmd/ig/ig image build -t execs -f gadgets/trace_exec/gadget.yaml gadgets/trace_exec
INFO[0000] Experimental features enabled                
Successfully built docker.io/library/execs:latest@sha256:b0eff7f0ec2a27b8d71fe904cf373fa2002c6d669db54c25a9542cf2cb5f46d1
$ sudo cat /var/lib/ig/oci-store/index.json | jq                            main *% u=
{
  "schemaVersion": 2,
  "manifests": [
    {
      "mediaType": "application/vnd.oci.image.index.v1+json",
      "digest": "sha256:b0eff7f0ec2a27b8d71fe904cf373fa2002c6d669db54c25a9542cf2cb5f46d1",
      "size": 491,
      "annotations": {
        "org.opencontainers.image.ref.name": "docker.io/library/execs:latest"
      }
    }
  ]
}

The code I removed with the above patch was added in 4e58192a513ea4cbda458dd696cf804b5d043dce which description is the following:

refactor: refactor OCI store to fully support Predecessors() and Resolve() (#385)

  1. Support discovering predecessors that are not tagged
  2. Support resolving a manifest by its digest
  3. Support resolving a blob by its digest

    Resolves: #34

So, I guess by removing the above code it breaks resolving a manifest by its digest? Meanwhile everything is OK with the above patch regarding tests:

root@8db5c3d638c6:/mnt/oras-go# make test
...
coverage: 78.5% of statements
ok      oras.land/oras-go/v2/registry/remote/retry      9.762s  coverage: 78.5% of statements

Moreover, applying the above patch on top of faaa1ddd3c0d2b47cb5594d4f352f948bb4901e3 and modifying few stuff on Inspektor Gadget side, I can have everything removed while removing an image:

$ sudo -E ./cmd/ig/ig image build -t execs -f gadgets/trace_exec/gadget.yaml gadgets/trace_exec
INFO[0000] Experimental features enabled                
Successfully built docker.io/library/execs:latest@sha256:b0eff7f0ec2a27b8d71fe904cf373fa2002c6d669db54c25a9542cf2cb5f46d1
$ sudo ls /var/lib/ig/oci-store/blobs/sha256                  francis/rmi-upstream *+%
225a8e697e260684d9451ccf8a622cd5548b0ae5fac6e13c416cd1d85247b7b7  b4be31f07444fb9be6dee8cd1349aab3c780d333b63406be10c5c14c1b39dd37
662ca58536d5a03864781eb42870f6712927981383f1dab1449d9adf80067313  c06a86dad7e582686b53bec9c729b92fc56c838b59cc11d0105c2c20f3158dcd
b0eff7f0ec2a27b8d71fe904cf373fa2002c6d669db54c25a9542cf2cb5f46d1  f61eef5ff800b2e4366c6d2e7311e95ce8dc3cff5c020a3a810429656fd93efb
$ sudo cat /var/lib/ig/oci-store/index.json | jq              francis/rmi-upstream *+%
{
  "schemaVersion": 2,
  "manifests": [
    {
      "mediaType": "application/vnd.oci.image.index.v1+json",
      "digest": "sha256:b0eff7f0ec2a27b8d71fe904cf373fa2002c6d669db54c25a9542cf2cb5f46d1",
      "size": 491,
      "annotations": {
        "org.opencontainers.image.ref.name": "docker.io/library/execs:latest"
      }
    }
  ]
}
$ sudo -E ./cmd/ig/ig run execs                               francis/rmi-upstream *+%
INFO[0000] Experimental features enabled                
RUNTIME.CONTAINERNAME                PID                 PPID                UID                 GID                 RE… COMM            
builder                              48390               28015               0                   0                   0   ls              
^C%                                                                                                                                      $ sudo -E ./cmd/ig/ig image remove execs                      francis/rmi-upstream *+%
INFO[0000] Experimental features enabled                
Successfully removed execs
$ sudo ls /var/lib/ig/oci-store/blobs/sha256                  francis/rmi-upstream *+%
$ sudo cat /var/lib/ig/oci-store/index.json | jq              francis/rmi-upstream *+%
{
  "schemaVersion": 2,
  "manifests": null
}

Am I missing something, or the above patch can really change this behavior without breaking everything? Your insights on this would be appreciated.

Best regards.

Wwwsylvia commented 8 months ago

So, I guess by removing the above code it breaks resolving a manifest by its digest?

Yes, this is true. Since #385, we record every manifest in index.json in order to support:

  1. Resolving a manifest by its digest
  2. Tracking untagged referrers manifests (these referrers manifests would be unreachable if they are not recorded in index.json)

So if you remove the code like that the above scenarios would be broken.

Meanwhile everything is OK with the above patch regarding tests:

Mmm that shouldn't happen🤔 Can you try again? The diff you made will fail several cases in oci_test.go.

go test ./content/oci
--- FAIL: TestStore_Success (0.00s)
    oci_test.go:144: resolver.Map() = 0, want 1
    oci_test.go:162: Store.Resolve() = {application/octet-stream sha256:c824a9aa7d2e3471306648c6d4baa1abbcb97ff0276181ab4722ca27127cdba0 13 [] map[] [] <nil> }, want {application/vnd.oci.image.manifest.v1+json sha256:c824a9aa7d2e3471306648c6d4baa1abbcb97ff0276181ab4722ca27127cdba0 13 [] map[] [] <nil> }
--- FAIL: TestStore_RelativeRoot_Success (0.01s)
    oci_test.go:273: resolver.Map() = 0, want 1
    oci_test.go:291: Store.Resolve() = {application/octet-stream sha256:c824a9aa7d2e3471306648c6d4baa1abbcb97ff0276181ab4722ca27127cdba0 13 [] map[] [] <nil> }, want {application/vnd.oci.image.manifest.v1+json sha256:c824a9aa7d2e3471306648c6d4baa1abbcb97ff0276181ab4722ca27127cdba0 13 [] map[] [] <nil> }
--- FAIL: TestStore_DisableAutoSaveIndex (0.00s)
    oci_test.go:614: resolver.Map() = 0, want 1
    oci_test.go:623: Store.Resolve() = {application/octet-stream sha256:c824a9aa7d2e3471306648c6d4baa1abbcb97ff0276181ab4722ca27127cdba0 13 [] map[] [] <nil> }, want {application/vnd.oci.image.manifest.v1+json sha256:c824a9aa7d2e3471306648c6d4baa1abbcb97ff0276181ab4722ca27127cdba0 13 [] map[] [] <nil> }
--- FAIL: TestStore_RepeatTag (0.00s)
    oci_test.go:703: len(resolver.Map()) = 0, want 1
    oci_test.go:706: len(index.Manifests) = 0, want 1
    oci_test.go:744: resolver.Map() = 2, want 3
    oci_test.go:747: len(index.Manifests) = 1, want 2
--- FAIL: TestStore_TagByDigest (0.00s)
    oci_test.go:881: len(resolver.Map()) = 0, want 1
    oci_test.go:884: len(index.Manifests) = 0, want 1
    oci_test.go:891: Store.Resolve() = {application/octet-stream sha256:c824a9aa7d2e3471306648c6d4baa1abbcb97ff0276181ab4722ca27127cdba0 13 [] map[] [] <nil> }, want {application/vnd.oci.image.manifest.v1+json sha256:c824a9aa7d2e3471306648c6d4baa1abbcb97ff0276181ab4722ca27127cdba0 13 [] map[] [] <nil> }
--- FAIL: TestStore_ExistingStore (0.01s)
    oci_test.go:1217: Store.Resolve() = {application/octet-stream sha256:063351176d525cd595e4741dbf2cbed7eb449534c7afc6ad9ae926da4ef24660 354 [] map[] [] <nil> }, want {application/vnd.oci.artifact.manifest.v1+json sha256:063351176d525cd595e4741dbf2cbed7eb449534c7afc6ad9ae926da4ef24660 354 [] map[] [] <nil> }
    oci_test.go:1289: Store.Predecessors(6) = [{application/vnd.oci.image.index.v1+json sha256:87a9ec2f8d3a785cdd39d52df3190bc989be8c526050a283636f3c5304b2ddc0 186 [] map[] [] <nil> }], want [{application/vnd.oci.image.index.v1+json sha256:87a9ec2f8d3a785cdd39d52df3190bc989be8c526050a283636f3c5304b2ddc0 186 [] map[] [] <nil> } {application/vnd.oci.artifact.manifest.v1+json sha256:063351176d525cd595e4741dbf2cbed7eb449534c7afc6ad9ae926da4ef24660 354 [] map[] [] <nil> }]
    oci_test.go:1289: Store.Predecessors(10) = [], want [{application/vnd.oci.artifact.manifest.v1+json sha256:232a29e9b955c6cb5b3424f50d2d84bb4f5cabffffefb4e82a491a03747c4c98 351 [] map[] [] <nil> }]
    oci_test.go:1289: Store.Predecessors(11) = [], want [{application/vnd.oci.artifact.manifest.v1+json sha256:063351176d525cd595e4741dbf2cbed7eb449534c7afc6ad9ae926da4ef24660 354 [] map[] [] <nil> }]
    oci_test.go:1289: Store.Predecessors(13) = [], want [{application/vnd.oci.artifact.manifest.v1+json sha256:232a29e9b955c6cb5b3424f50d2d84bb4f5cabffffefb4e82a491a03747c4c98 351 [] map[] [] <nil> }]
--- FAIL: TestReadOnlyStore_DirFS (0.01s)
    readonlyoci_test.go:350: ReadOnlyStore.Resolve() = {application/octet-stream sha256:063351176d525cd595e4741dbf2cbed7eb449534c7afc6ad9ae926da4ef24660 354 [] map[] [] <nil> }, want {application/vnd.oci.artifact.manifest.v1+json sha256:063351176d525cd595e4741dbf2cbed7eb449534c7afc6ad9ae926da4ef24660 354 [] map[] [] <nil> }
    readonlyoci_test.go:413: ReadOnlyStore.Predecessors(6) = [{application/vnd.oci.image.index.v1+json sha256:87a9ec2f8d3a785cdd39d52df3190bc989be8c526050a283636f3c5304b2ddc0 186 [] map[] [] <nil> }], want [{application/vnd.oci.image.index.v1+json sha256:87a9ec2f8d3a785cdd39d52df3190bc989be8c526050a283636f3c5304b2ddc0 186 [] map[] [] <nil> } {application/vnd.oci.artifact.manifest.v1+json sha256:063351176d525cd595e4741dbf2cbed7eb449534c7afc6ad9ae926da4ef24660 354 [] map[] [] <nil> }]
    readonlyoci_test.go:413: ReadOnlyStore.Predecessors(10) = [], want [{application/vnd.oci.artifact.manifest.v1+json sha256:232a29e9b955c6cb5b3424f50d2d84bb4f5cabffffefb4e82a491a03747c4c98 351 [] map[] [] <nil> }]
    readonlyoci_test.go:413: ReadOnlyStore.Predecessors(11) = [], want [{application/vnd.oci.artifact.manifest.v1+json sha256:063351176d525cd595e4741dbf2cbed7eb449534c7afc6ad9ae926da4ef24660 354 [] map[] [] <nil> }]
    readonlyoci_test.go:413: ReadOnlyStore.Predecessors(13) = [], want [{application/vnd.oci.artifact.manifest.v1+json sha256:232a29e9b955c6cb5b3424f50d2d84bb4f5cabffffefb4e82a491a03747c4c98 351 [] map[] [] <nil> }]
FAIL
FAIL    oras.land/oras-go/v2/content/oci        0.249s
FAIL
eiffel-fl commented 8 months ago

Hi!

So, I guess by removing the above code it breaks resolving a manifest by its digest?

Yes, this is true. Since #385, we record every manifest in index.json in order to support:

1. Resolving a manifest by its digest

2. Tracking untagged referrers manifests (these referrers manifests would be unreachable if they are not recorded in `index.json`)

This makes sense! For 1., I did not take a look at the code deeply, but could we replicate what you have done to resolve layers by their digests but for manifests? Regarding 2., do we need it now what we have garbage collection?

So if you remove the code like that the above scenarios would be broken.

Meanwhile everything is OK with the above patch regarding tests:

Mmm that shouldn't happen🤔 Can you try again? The diff you made will fail several cases in oci_test.go.

go test ./content/oci
--- FAIL: TestStore_Success (0.00s)
    oci_test.go:144: resolver.Map() = 0, want 1
    oci_test.go:162: Store.Resolve() = {application/octet-stream sha256:c824a9aa7d2e3471306648c6d4baa1abbcb97ff0276181ab4722ca27127cdba0 13 [] map[] [] <nil> }, want {application/vnd.oci.image.manifest.v1+json sha256:c824a9aa7d2e3471306648c6d4baa1abbcb97ff0276181ab4722ca27127cdba0 13 [] map[] [] <nil> }
--- FAIL: TestStore_RelativeRoot_Success (0.01s)
    oci_test.go:273: resolver.Map() = 0, want 1
    oci_test.go:291: Store.Resolve() = {application/octet-stream sha256:c824a9aa7d2e3471306648c6d4baa1abbcb97ff0276181ab4722ca27127cdba0 13 [] map[] [] <nil> }, want {application/vnd.oci.image.manifest.v1+json sha256:c824a9aa7d2e3471306648c6d4baa1abbcb97ff0276181ab4722ca27127cdba0 13 [] map[] [] <nil> }
--- FAIL: TestStore_DisableAutoSaveIndex (0.00s)
    oci_test.go:614: resolver.Map() = 0, want 1
    oci_test.go:623: Store.Resolve() = {application/octet-stream sha256:c824a9aa7d2e3471306648c6d4baa1abbcb97ff0276181ab4722ca27127cdba0 13 [] map[] [] <nil> }, want {application/vnd.oci.image.manifest.v1+json sha256:c824a9aa7d2e3471306648c6d4baa1abbcb97ff0276181ab4722ca27127cdba0 13 [] map[] [] <nil> }
--- FAIL: TestStore_RepeatTag (0.00s)
    oci_test.go:703: len(resolver.Map()) = 0, want 1
    oci_test.go:706: len(index.Manifests) = 0, want 1
    oci_test.go:744: resolver.Map() = 2, want 3
    oci_test.go:747: len(index.Manifests) = 1, want 2
--- FAIL: TestStore_TagByDigest (0.00s)
    oci_test.go:881: len(resolver.Map()) = 0, want 1
    oci_test.go:884: len(index.Manifests) = 0, want 1
    oci_test.go:891: Store.Resolve() = {application/octet-stream sha256:c824a9aa7d2e3471306648c6d4baa1abbcb97ff0276181ab4722ca27127cdba0 13 [] map[] [] <nil> }, want {application/vnd.oci.image.manifest.v1+json sha256:c824a9aa7d2e3471306648c6d4baa1abbcb97ff0276181ab4722ca27127cdba0 13 [] map[] [] <nil> }
--- FAIL: TestStore_ExistingStore (0.01s)
    oci_test.go:1217: Store.Resolve() = {application/octet-stream sha256:063351176d525cd595e4741dbf2cbed7eb449534c7afc6ad9ae926da4ef24660 354 [] map[] [] <nil> }, want {application/vnd.oci.artifact.manifest.v1+json sha256:063351176d525cd595e4741dbf2cbed7eb449534c7afc6ad9ae926da4ef24660 354 [] map[] [] <nil> }
    oci_test.go:1289: Store.Predecessors(6) = [{application/vnd.oci.image.index.v1+json sha256:87a9ec2f8d3a785cdd39d52df3190bc989be8c526050a283636f3c5304b2ddc0 186 [] map[] [] <nil> }], want [{application/vnd.oci.image.index.v1+json sha256:87a9ec2f8d3a785cdd39d52df3190bc989be8c526050a283636f3c5304b2ddc0 186 [] map[] [] <nil> } {application/vnd.oci.artifact.manifest.v1+json sha256:063351176d525cd595e4741dbf2cbed7eb449534c7afc6ad9ae926da4ef24660 354 [] map[] [] <nil> }]
    oci_test.go:1289: Store.Predecessors(10) = [], want [{application/vnd.oci.artifact.manifest.v1+json sha256:232a29e9b955c6cb5b3424f50d2d84bb4f5cabffffefb4e82a491a03747c4c98 351 [] map[] [] <nil> }]
    oci_test.go:1289: Store.Predecessors(11) = [], want [{application/vnd.oci.artifact.manifest.v1+json sha256:063351176d525cd595e4741dbf2cbed7eb449534c7afc6ad9ae926da4ef24660 354 [] map[] [] <nil> }]
    oci_test.go:1289: Store.Predecessors(13) = [], want [{application/vnd.oci.artifact.manifest.v1+json sha256:232a29e9b955c6cb5b3424f50d2d84bb4f5cabffffefb4e82a491a03747c4c98 351 [] map[] [] <nil> }]
--- FAIL: TestReadOnlyStore_DirFS (0.01s)
    readonlyoci_test.go:350: ReadOnlyStore.Resolve() = {application/octet-stream sha256:063351176d525cd595e4741dbf2cbed7eb449534c7afc6ad9ae926da4ef24660 354 [] map[] [] <nil> }, want {application/vnd.oci.artifact.manifest.v1+json sha256:063351176d525cd595e4741dbf2cbed7eb449534c7afc6ad9ae926da4ef24660 354 [] map[] [] <nil> }
    readonlyoci_test.go:413: ReadOnlyStore.Predecessors(6) = [{application/vnd.oci.image.index.v1+json sha256:87a9ec2f8d3a785cdd39d52df3190bc989be8c526050a283636f3c5304b2ddc0 186 [] map[] [] <nil> }], want [{application/vnd.oci.image.index.v1+json sha256:87a9ec2f8d3a785cdd39d52df3190bc989be8c526050a283636f3c5304b2ddc0 186 [] map[] [] <nil> } {application/vnd.oci.artifact.manifest.v1+json sha256:063351176d525cd595e4741dbf2cbed7eb449534c7afc6ad9ae926da4ef24660 354 [] map[] [] <nil> }]
    readonlyoci_test.go:413: ReadOnlyStore.Predecessors(10) = [], want [{application/vnd.oci.artifact.manifest.v1+json sha256:232a29e9b955c6cb5b3424f50d2d84bb4f5cabffffefb4e82a491a03747c4c98 351 [] map[] [] <nil> }]
    readonlyoci_test.go:413: ReadOnlyStore.Predecessors(11) = [], want [{application/vnd.oci.artifact.manifest.v1+json sha256:063351176d525cd595e4741dbf2cbed7eb449534c7afc6ad9ae926da4ef24660 354 [] map[] [] <nil> }]
    readonlyoci_test.go:413: ReadOnlyStore.Predecessors(13) = [], want [{application/vnd.oci.artifact.manifest.v1+json sha256:232a29e9b955c6cb5b3424f50d2d84bb4f5cabffffefb4e82a491a03747c4c98 351 [] map[] [] <nil> }]
FAIL
FAIL    oras.land/oras-go/v2/content/oci        0.249s
FAIL

Indeed, I got mixed up and I guess I tested without the patch applied :sweat_smile::

root@c808884f4989:/mnt# make test
...
ok      oras.land/oras-go/v2/registry/remote/retry      9.786s  coverage: 78.5% of statements
FAIL
make: *** [Makefile:16: test] Error 1

Best regards.

eiffel-fl commented 8 months ago

I tweaked a bit the code base to be able to reference manifest by their digest, the code can be found here: https://github.com/eiffel-fl/oras-go/commits/francis/no-manifests-index/ I am sadly still getting troubles with tests on Predecessors() as, for the moment, I do not understand the link between resolver.Tag() and graph.Predecessors(). I will continue to dig and follow-up with my findings here.

eiffel-fl commented 8 months ago

Hi!

I continued my investigation and now I understand why you need to store the manifests in index.json.

I will take the example of TestReadOnlyStore_DirFS. At first, everything lives in memory and you creates the graph as the same time as you Push(), since Push() calls Index() which creates the graph. If you tweak the tag(), which is called by Push() for manifest, to remove writing manifests to index.json, with, for example, the following patch:

diff --git a/content/oci/oci.go b/content/oci/oci.go
index 997543c..c737796 100644
--- a/content/oci/oci.go
+++ b/content/oci/oci.go
@@ -429,13 +432,6 @@ func (s *Store) saveIndex() error {
                        tagged.Add(desc.Digest)
                }
        }
-       // 2. Add descriptors that are not associated with any tag
-       for ref, desc := range refMap {
-               if ref == desc.Digest.String() && !tagged.Contains(desc.Digest) {
-                       // skip tagged ones since they have been added in step 1
-                       manifests = append(manifests, deleteAnnotationRefName(desc))
-               }
-       }

        s.index.Manifests = manifests
        return s.writeIndexFile()

You do not have troubles, as long as everything resides in memory. But, in this test, we later call NewFromFS() which reads index.json through loadIndexFile() to create the graph. Sadly, some links are missing as manifests are not written in index.json, so we cannot Fetch() them later in the code.

So, out of the blue, I think we need to explore the filesystem to create all the links. Indeed, if we have this index.json:

$ sudo cat /var/lib/ig/oci-store/index.json | jq            francis/no-manifests-index *% u+2-2
{
  "schemaVersion": 2,
  "manifests": [
    {
      "mediaType": "application/vnd.oci.image.index.v1+json",
      "digest": "sha256:b0eff7f0ec2a27b8d71fe904cf373fa2002c6d669db54c25a9542cf2cb5f46d1",
      "size": 491,
      "annotations": {
        "org.opencontainers.image.ref.name": "docker.io/library/execs:latest"
      }
    }
  ]
}

We can find its associated manifests with:

$ sudo cat /var/lib/ig/oci-store/blobs/sha256/b0eff7f0ec2a27b8d71fe904cf373fa2002c6d669db54c25a9542cf2cb5f46d1 | jq
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.index.v1+json",
  "manifests": [
    {
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "digest": "sha256:662ca58536d5a03864781eb42870f6712927981383f1dab1449d9adf80067313",
      "size": 581,
      "platform": {
        "architecture": "amd64",
        "os": "linux"
      }
    },
    {
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "digest": "sha256:f61eef5ff800b2e4366c6d2e7311e95ce8dc3cff5c020a3a810429656fd93efb",
      "size": 581,
      "platform": {
        "architecture": "arm64",
        "os": "linux"
      }
    }
  ]
}

And so on, we can get the layers with:

$ sudo cat /var/lib/ig/oci-store/blobs/sha256/f61eef5ff800b2e4366c6d2e7311e95ce8dc3cff5c020a3a810429656fd93efb | jq
{
  "schemaVersion": 2,
  "config": {
    "mediaType": "application/vnd.gadget.config.v1+yaml",
    "digest": "sha256:b4be31f07444fb9be6dee8cd1349aab3c780d333b63406be10c5c14c1b39dd37",
    "size": 1853,
    "annotations": {
      "org.opencontainers.image.title": "config.yaml"
    }
  },
  "layers": [
    {
      "mediaType": "application/vnd.gadget.ebpf.program.v1+binary",
      "digest": "sha256:c06a86dad7e582686b53bec9c729b92fc56c838b59cc11d0105c2c20f3158dcd",
      "size": 125232,
      "annotations": {
        "org.opencontainers.image.authors": "TODO: authors",
        "org.opencontainers.image.description": "TODO: description",
        "org.opencontainers.image.title": "program.o"
      }
    }
  ]
}

First of all, is my understanding correct? Second, what do you think of the proposal to loop through the filesystem? I sadly do not see other solutions but I am a bit afraid of the I/O cost. Anyway, we are already going through the filesystem when we use NewFromFS(), so maybe we can bear with it.

Best regards.

eiffel-fl commented 8 months ago

OK, I took another look and my conclusion were not really accurate. We really need to have manifests written in index.json, if they do not have predecessors. Indeed, without predecessors, if they are not present in index.json, the current code cannot find them. So, we can go two ways:

  1. We can only write manifests in index.json if they do not have a tagged predecessor. This is what I do in this commit https://github.com/oras-project/oras-go/commit/732d87d8d1a6f5a7f385699e511c97636d901222:
$ sudo cat /var/lib/ig/oci-store/index.json | jq         francis/fsslower-statfs *% u=
{
  "schemaVersion": 2,
  "manifests": [
    {
      "mediaType": "application/vnd.oci.image.index.v1+json",
      "digest": "sha256:b0eff7f0ec2a27b8d71fe904cf373fa2002c6d669db54c25a9542cf2cb5f46d1",
      "size": 491,
      "annotations": {
        "org.opencontainers.image.ref.name": "docker.io/library/execs:latest"
      }
    }
  ]
}

On the other hand, manifests which do not have predecessors, like https://github.com/oras-project/oras-go/blob/d1becd506c82062a04c184a88a64cb5ac26c0e9b/content/oci/readonlyoci_test.go#L291, are still written, so everything is still correct.

  1. We can go even further and remove the writing of all manifests to index.json. To achieve this, we would need to go through all files under /blobs checking if they are manifests. If there is a manifest which is currently not in the graph, because it has no predecessors, we would need to add it to the graph as well as all its successors.

What do you think? Does it sound reasonable for you to first go with 1. in a dedicated PR and then go further with 2.?

Wwwsylvia commented 8 months ago

Hi @eiffel-fl ! The way we handle index.json has a fundamental impact on the entire OCI store, so we need more time to carefully consider the design. We can start with some brainstorming here.

Basically, we need to ensure that:

Additionally, it is not necessary to record nodes in index.json that:

eiffel-fl commented 7 months ago

Hi!

Hi @eiffel-fl ! The way we handle index.json has a fundamental impact on the entire OCI store, so we need more time to carefully consider the design. We can start with some brainstorming here.

Indeed, more discussion are needed as this problem is not trivial to solve.

Basically, we need to ensure that:

* It is possible to discover untagged referrers (since people usually won't tag referrers)

I do not have a patch for this at the moment and I will need to think more about the consequences.

* Manifests can be resolved by either their tags or digests

To resolve manifests with their digest, I wrote this patch: https://github.com/oras-project/oras-go/commit/1c016778cd7a3ad6c3d20965c5eda373c0e15db9 At the moment, I do not use this patch in another one and I did not add test, so I cannot guarantee it works.

Additionally, it is not necessary to record nodes in index.json that:

* Can be traversed from the roots (i.e., nodes in `index.json`)

* Are not tagged

Indeed, if there is a path to a manifest, we do not need to store it in index.json, this is achieved by this patch: https://github.com/oras-project/oras-go/commit/732d87d8d1a6f5a7f385699e511c97636d901222

Best regards.

Wwwsylvia commented 7 months ago

Hi @eiffel-fl , after an offline discussion, we realized that we still want all manifests to be recorded in index.json. The roles of index.json are:

The idea is that every manifest is a root of an artifact, so it should be in the list of roots (index.json), no matter it is tagged or not.

Since there is no change in the design of index.json, we can still ensure:

To resolve your original concern:

This behavior can create troubles, as when the manifests in index.json will not be automatically deleted when the surrounding index is deleted:

We re-consider the definition of garbage produced by Delete(): The untagged successor manifests are also garbage. @wangxiaoxuan273 has sent a PR #680 for this change , can you check if that works for you?

eiffel-fl commented 7 months ago

Hi!

Hi @eiffel-fl , after an offline discussion, we realized that we still want all manifests to be recorded in index.json. The roles of index.json are:

* To serve as the entry point of the graph, as defined in the [spec](https://github.com/opencontainers/image-spec/blob/v1.1.0-rc5/image-layout.md#indexjson-file)

* To function as a tagging service,as defined in the [spec](https://github.com/opencontainers/image-spec/blob/v1.1.0-rc5/image-layout.md#indexjson-file)

* To maintain a list of manifests, where the manifests may or may not have tags

The idea is that every manifest is a root of an artifact, so it should be in the list of roots (index.json), no matter it is tagged or not.

With regard to the spec, this definitely makes sense!

Since there is no change in the design of index.json, we can still ensure:

* It is possible to discover untagged referrers (since people usually won't tag referrers)

* Manifests can be resolved by either their tags or digests

To resolve your original concern:

This behavior can create troubles, as when the manifests in index.json will not be automatically deleted when the surrounding index is deleted:

We re-consider the definition of garbage produced by Delete(): The untagged successor manifests are also garbage. @wangxiaoxuan273 has sent a PR #680 for this change , can you check if that works for you?

I will for sure take a look at it :eyes:!

Best regards.

eiffel-fl commented 7 months ago

Closing this issue as:

  1. Manifests have to be present in index.json as they are the root of an artifact.
  2. On deletion, untagged manifests will be removed by #680.

With regard to this:

Manifests can be resolved by either their tags or digests

I wrote this commit: https://github.com/eiffel-fl/oras-go/commit/1c016778cd7a3ad6c3d20965c5eda373c0e15db9 Can you please share your insights on it? If it is useful for you, I will just open a PR.

Wwwsylvia commented 7 months ago

Hi @eiffel-fl , I'm wondering why do you need this change? Using the current implantation, you should be able to resolve a manifest with the correct media type by its digest.

eiffel-fl commented 7 months ago

Hi!

Hi @eiffel-fl , I'm wondering why do you need this change? Using the current implantation, you should be able to resolve a manifest with the correct media type by its digest.

I personally do not need this code, all I needed was already perfectly merged in #680. I will take a deeper look at the code to see which part already resolve manifest by digest, I should have missed it.

Best regards.

Wwwsylvia commented 7 months ago

@eiffel-fl See https://github.com/oras-project/oras-go/blob/66ea3dfe71ad48ea573f337af5c3f1251cacbe49/content/oci/oci.go#L282-L294

eiffel-fl commented 7 months ago

@eiffel-fl See

https://github.com/oras-project/oras-go/blob/66ea3dfe71ad48ea573f337af5c3f1251cacbe49/content/oci/oci.go#L282-L294

Indeed! Thank you for sharing :D!