vsoch / oci-python

Python implementation of Open Containers Initiative (OCI) specifications
https://vsoch.github.io/oci-python/
Mozilla Public License 2.0

experiment(MediaType): Validate MediaType for Image Spec #3

Closed semoac closed 4 years ago

semoac commented 4 years ago

PLEASE DO NOT MERGE THIS.

Hi 😃!

As defined in the image spec, the config and layers need to have their mediaType validated on the Descriptor.

Something like this could also be true for annotations, index, etc.

I know the repo is still WIP but I would like to point out this situation with the current suggestion.

Implementing something like this could easily allow the inclusion of more artifacts (specs) like helm, Singularity or cnab in a future contribs module. This basic validation could also help any future projects working on implementing a distribution registry.
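
To make the idea concrete, here is a minimal sketch of a per-role table of allowed mediaTypes that a contrib module could extend. It is not code from this PR or from oci-python: `ALLOWED_MEDIA_TYPES`, `register_media_type` and `is_valid_media_type` are hypothetical names, descriptors are plain dicts, and the `vnd.example` type is made up for illustration.

```python
# Sketch only: ALLOWED_MEDIA_TYPES, register_media_type and is_valid_media_type
# are hypothetical names, not part of oci-python.
ALLOWED_MEDIA_TYPES = {
    "config": {"application/vnd.oci.image.config.v1+json"},
    "layer": {
        "application/vnd.oci.image.layer.v1.tar",
        "application/vnd.oci.image.layer.v1.tar+gzip",
        "application/vnd.oci.image.layer.v1.tar+zstd",
        "application/vnd.oci.image.layer.nondistributable.v1.tar",
        "application/vnd.oci.image.layer.nondistributable.v1.tar+gzip",
        "application/vnd.oci.image.layer.nondistributable.v1.tar+zstd",
    },
}


def register_media_type(role, media_type):
    """Allow an extra mediaType for a role, e.g. from a contrib module."""
    ALLOWED_MEDIA_TYPES.setdefault(role, set()).add(media_type)


def is_valid_media_type(role, descriptor):
    """Return True if the descriptor's mediaType is allowed for the role."""
    return descriptor.get("mediaType") in ALLOWED_MEDIA_TYPES.get(role, set())


# A made-up artifact config type, registered the way a contrib module might.
example_type = "application/vnd.example.artifact.config.v1+json"
descriptor = {"mediaType": example_type, "size": 10}
print(is_valid_media_type("config", descriptor))  # False before registering
register_media_type("config", example_type)
print(is_valid_media_type("config", descriptor))  # True after registering
```

Keeping the table keyed by role (config, layer, and later perhaps index or annotations) is one way the same check could be reused by a registry implementation.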

Signed-off-by: Sergio Morales sergio@cornershopapp.com

vsoch commented 4 years ago

hey! So I've barely gotten started with this project - you can expect it to be imperfect for a bit! It's been all of... 24 hours? Please be patient :)

vsoch commented 4 years ago

But yes! Thank you for pointing this out - I’ll read over the spec carefully and circle back with you when I’m further along. Sound good?

semoac commented 4 years ago

Sounds fantastic! Thanks for working on this. I was not trying to hurry you up or anything, just pointing out a possible use case for this implementation :).

Once again, thank you!

vsoch commented 4 years ago

heyo! Just wanted to let you know that I've implemented the basics for this to work. The Manifest class has a custom validation function (_validate()) that runs after the superclass validate. Within it we check that the config has the config mediaType, and that each layer has one of the valid layer mediaTypes. E.g.,

    def _validate(self):
        '''custom validation function to ensure that Config and Layers mediaTypes
           are valid. By the time we get here, we know there is a Config object,
           and there can be one or more layers.
        '''
        # These are valid mediaTypes for layers
        layerMediaTypes = [MediaTypeImageLayer,
                           MediaTypeImageLayerGzip,
                           MediaTypeImageLayerZstd,
                           MediaTypeImageLayerNonDistributable,
                           MediaTypeImageLayerNonDistributableGzip,
                           MediaTypeImageLayerNonDistributableZstd]

        # The media type of the config must be for the config
        mediaType = self.attrs.get("Config").value.attrs.get("MediaType").value
        if mediaType != MediaTypeImageConfig:
            bot.error("config mediaType %s is invalid, should be %s" %(mediaType, MediaTypeImageConfig))
            return False

        # Check against valid mediaType Layers
        for layer in self.attrs.get("Layers").value:
            mediaType = layer.attrs.get('MediaType').value
            if mediaType not in layerMediaTypes: 
                bot.error("layer mediaType %s is invalid" % mediaType)
                return False

        return True

I'm trying to mirror the Go implementation as much as possible, so I didn't add a bunch of new functions. It should just work like this:

    from opencontainers.image.v1 import Manifest
    manifest = Manifest()

    # notice that the mediaType is invalid!
    invalid_mediatype_pattern = {
      "schemaVersion": 2,
      "config": {
        "mediaType": "invalid",
        "size": 1470,
        "digest": "sha256:c86f7763873b6c0aae22d963bab59b4f5debbed6685761b5951584f6efb0633b"
      },
      "layers": [
        {
          "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
          "size": 148,
          "digest": "sha256:c57089565e894899735d458f0fd4bb17a0f1e0df8d72da392b85c9b35ee777cd"
        }
      ]
    }

    # load will validate the basic types (e.g., mediaType is a string)
    # and then validate will do the validation above to determine it's the wrong type
    manifest.load(invalid_mediatype_pattern)
    manifest.validate() # returns False
    # ERROR config mediaType invalid is invalid, should be application/vnd.oci.image.config.v1+json
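
The flow described here (generic checks first, then a class-specific `_validate()` hook) can be sketched in isolation roughly as follows. This is only an illustration under assumptions: `BaseStruct` and `ExampleManifest` are hypothetical stand-ins that work on plain dicts, not the classes in oci-python.

```python
# Sketch only: BaseStruct and ExampleManifest are hypothetical stand-ins,
# not the classes shipped in oci-python.
class BaseStruct:
    required = ()

    def __init__(self):
        self.data = {}

    def load(self, content):
        """Store the content; a real loader would also check basic types."""
        if not isinstance(content, dict):
            raise TypeError("content must be a dict")
        self.data = content
        return self

    def validate(self):
        """Run the generic checks first, then the subclass hook."""
        if not self._validate_required():
            return False
        return self._validate()

    def _validate_required(self):
        return all(key in self.data for key in self.required)

    def _validate(self):
        """Hook for subclasses; the default accepts everything."""
        return True


class ExampleManifest(BaseStruct):
    required = ("config", "layers")

    def _validate(self):
        media_type = self.data["config"].get("mediaType")
        return media_type == "application/vnd.oci.image.config.v1+json"


manifest = ExampleManifest().load(
    {"config": {"mediaType": "invalid"}, "layers": []}
)
print(manifest.validate())  # False: generic checks pass, the hook rejects it
```

Because the hook only runs once the generic checks have passed, it can assume the required keys exist, which is the same assumption the `_validate()` docstring makes about the Config object.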

Note that attributes that are strings can have regular expressions to check against (as I do with the list of Env to ensure each entry is in name=value format), but a Descriptor can describe multiple kinds of things, so we can't use that approach here.
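
For comparison, a regular-expression check on Env-style strings might look like the sketch below. The pattern and the `valid_env` helper are assumptions for illustration, not the ones oci-python uses. A single pattern works for Env because every entry has the same shape, whereas the valid mediaType of a Descriptor depends on where the Descriptor is used.

```python
import re

# Assumed pattern: a non-empty name without "=", then "=", then any value.
ENV_PATTERN = re.compile(r"^[^=]+=.*$")


def valid_env(entries):
    """Return True if every entry looks like NAME=VALUE."""
    return all(isinstance(e, str) and ENV_PATTERN.match(e) for e in entries)


print(valid_env(["PATH=/usr/bin", "EMPTY="]))    # True
print(valid_env(["PATH=/usr/bin", "NOEQUALS"]))  # False
```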

I haven't pushed any of this new code yet, still writing tests for the manifest (and quite a bit for digests too... such is the WIP!)

vsoch commented 4 years ago

I can also split those two validations (config and layers) into separate (private) functions if you think they would be called independently.

vsoch commented 4 years ago

This version handles empty structures too:

    def _validate(self):
        '''custom validation function to ensure that Config and Layers mediaTypes
           are valid. By the time we get here, we know there is a Config object,
           and there can be one or more layers.
        '''
        # Call both helpers; a bare method reference is always truthy
        if not self._validateLayerMediaType() or not self._validateConfigMediaType():
            return False
        return True

    def _validateConfigMediaType(self):
        '''validate the config media type.
        '''
        # The media type of the config must be for the config
        manifestConfig = self.attrs.get("Config").value

        # Missing config is not valid
        if not manifestConfig:
            return False

        # manifestConfig is already the unwrapped value, so read attrs directly
        mediaType = manifestConfig.attrs.get("MediaType").value
        if not mediaType:
            return False

        if mediaType != MediaTypeImageConfig:
            bot.error("config mediaType %s is invalid, should be %s" %(mediaType, MediaTypeImageConfig))
            return False
        return True

    def _validateLayerMediaType(self):
        '''validate the Layer Media Types
        '''
        # These are valid mediaTypes for layers
        layerMediaTypes = [MediaTypeImageLayer,
                           MediaTypeImageLayerGzip,
                           MediaTypeImageLayerZstd,
                           MediaTypeImageLayerNonDistributable,
                           MediaTypeImageLayerNonDistributableGzip,
                           MediaTypeImageLayerNonDistributableZstd]

        # No layers, not valid
        layers = self.attrs.get("Layers").value
        if not layers:
            return False

        # Check against valid mediaType Layers
        for layer in layers:
            mediaType = layer.attrs.get('MediaType').value
            if mediaType not in layerMediaTypes: 
                bot.error("layer mediaType %s is invalid" % mediaType)
                return False

        return True
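
The same empty-structure rules can be tried out without the library using plain dicts. This is a sketch only: the dicts stand in for the Manifest and Descriptor objects, and `validate_config`, `validate_layers` and `validate_manifest` are hypothetical names.

```python
# Sketch only: plain dicts stand in for the Manifest, Descriptor and layer
# objects, and all function names here are hypothetical.
CONFIG_TYPE = "application/vnd.oci.image.config.v1+json"
LAYER_TYPES = {
    "application/vnd.oci.image.layer.v1.tar",
    "application/vnd.oci.image.layer.v1.tar+gzip",
    "application/vnd.oci.image.layer.v1.tar+zstd",
    "application/vnd.oci.image.layer.nondistributable.v1.tar",
    "application/vnd.oci.image.layer.nondistributable.v1.tar+gzip",
    "application/vnd.oci.image.layer.nondistributable.v1.tar+zstd",
}


def validate_config(manifest):
    """A missing or empty config, or a wrong mediaType, is invalid."""
    config = manifest.get("config")
    if not config:
        return False
    return config.get("mediaType") == CONFIG_TYPE


def validate_layers(manifest):
    """A missing or empty layer list, or any unknown mediaType, is invalid."""
    layers = manifest.get("layers")
    if not layers:
        return False
    return all(layer.get("mediaType") in LAYER_TYPES for layer in layers)


def validate_manifest(manifest):
    # Both helpers are called here; a bare function reference is always truthy.
    return validate_layers(manifest) and validate_config(manifest)


good = {
    "config": {"mediaType": CONFIG_TYPE},
    "layers": [{"mediaType": "application/vnd.oci.image.layer.v1.tar+gzip"}],
}
print(validate_manifest(good))                    # True
print(validate_manifest({}))                      # False, nothing to check
print(validate_manifest({**good, "layers": []}))  # False, no layers
print(validate_manifest({**good, "config": {}}))  # False, empty config
```
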

vsoch commented 4 years ago

@semoac I've finished the validation of Layers and Config (just pushed). Let me know what other mediaTypes you'd like to see validated. We can turn this into an issue!

And I still have lots of work to do, but am slowly making progress! I'll update the docs as I go.

Closing up shop for today though, silly time change means it gets dark SO much earlier and I still haven't run :P

semoac commented 4 years ago

Thank you for the great work!

vsoch commented 4 years ago

I'm almost done! I'm adding the CI for tests now, and need to finish writing documentation for each. I'll definitely post to the list (likely later today?) and we can pick up with what would be fun to do next :)