kontena / pharos-cluster

Pharos - The Kubernetes Distribution
https://k8spharos.dev/
Apache License 2.0

Unknown resource kind=Issuer for certmanager.k8s.io/v1alpha1 #939

Closed jakolehm closed 5 years ago

jakolehm commented 5 years ago

What happened:

cert-manager addon fails to install.

==> Enabling addon cert-manager
    got error (K8s::Error::UndefinedResource): Unknown resource kind=Issuer for certmanager.k8s.io/v1alpha1
    retrying after 1 seconds ...
==> Enabling addon cert-manager
    got error (K8s::Error::UndefinedResource): Unknown resource kind=Issuer for certmanager.k8s.io/v1alpha1
    retrying after 2 seconds ...
==> Enabling addon cert-manager
    got error (K8s::Error::UndefinedResource): Unknown resource kind=Issuer for certmanager.k8s.io/v1alpha1
    retrying after 4 seconds ...
==> Enabling addon cert-manager
    got error (K8s::Error::UndefinedResource): Unknown resource kind=Issuer for certmanager.k8s.io/v1alpha1
    retrying after 8 seconds ...
==> Enabling addon cert-manager
    got error (K8s::Error::UndefinedResource): Unknown resource kind=Issuer for certmanager.k8s.io/v1alpha1
    retrying after 16 seconds ...
==> Enabling addon cert-manager
    got error (K8s::Error::UndefinedResource): Unknown resource kind=Issuer for certmanager.k8s.io/v1alpha1
    retrying after 32 seconds ...
==> Enabling addon cert-manager
    got error (K8s::Error::UndefinedResource): Unknown resource kind=Issuer for certmanager.k8s.io/v1alpha1
    retrying after 64 seconds ...
==> Enabling addon cert-manager
    got error (K8s::Error::UndefinedResource): Unknown resource kind=Issuer for certmanager.k8s.io/v1alpha1
    retrying after 128 seconds ...
==> Enabling addon cert-manager
    got error (K8s::Error::UndefinedResource): Unknown resource kind=Issuer for certmanager.k8s.io/v1alpha1
    retrying after 256 seconds ...
==> Enabling addon cert-manager
    got error (K8s::Error::UndefinedResource): Unknown resource kind=Issuer for certmanager.k8s.io/v1alpha1
    retrying after 512 seconds ...

Pharos retries, but retrying does not help. Perhaps the retry should clear the internal API cache?

jakolehm commented 5 years ago

By the way, this happens rarely, so it is most likely some kind of race condition.

kke commented 5 years ago

The k8s-client memoizes the API resources:

#api_client.rb

    # Force-update APIResources
    #
    # @return [Array<K8s::API::MetaV1::APIResource>]
    def api_resources!
      @api_resources = @transport.get(path, response_class: K8s::API::MetaV1::APIResourceList).resources
    end

    # Cached APIResources
    #
    # @return [Array<K8s::API::MetaV1::APIResource>]
    def api_resources
      @api_resources || api_resources!
    end

    # @param resource [K8s::Resource]
    # @param namespace [String, nil] default if resource is missing namespace
    # @raise [K8s::Error::NotFound] API Group does not exist
    # @raise [K8s::Error::UndefinedResource]
    # @return [K8s::ResourceClient]
    def client_for_resource(resource, namespace: nil)
      found_resource = api_resources.find{ |api_resource| api_resource.kind == resource.kind }
      raise K8s::Error::UndefinedResource, "Unknown resource kind=#{resource.kind} for #{@api_version}" unless found_resource

      ResourceClient.new(@transport, self, found_resource, namespace: resource.metadata.namespace || namespace)
    end

Also the api-clients are memoized:

#client.rb

    # @param api_version [String] "group/version" or "version" (core)
    # @return [APIClient]
    def api(api_version = 'v1')
      @api_clients[api_version] ||= APIClient.new(@transport, api_version)
    end

So, any new kinds created during the client life-time will not be accessible.

There's no method to clear the api_client cache.

I think the k8s-client itself should probably clear its caches when it raises K8s::Error::UndefinedResource. Then the user could rescue and retry without having to know the internals.
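The idea of refreshing the cache before raising could look roughly like the sketch below. It does not use the real k8s-client: `FakeTransport`, `SketchAPIClient`, `find_kind` and `find_kind_with_refresh` are hypothetical stand-ins that only mirror the memoization pattern quoted above, and the actual fix in the library may differ.

```ruby
# Stand-in for K8s::Error::UndefinedResource (hypothetical, not the real class).
class UndefinedResource < StandardError; end

# Hypothetical transport: returns the kinds the API server knows right now.
class FakeTransport
  def initialize(kinds)
    @kinds = kinds
  end

  # Simulates a CRD being registered after the client was created.
  def register(kind)
    @kinds << kind
  end

  def fetch_kinds
    @kinds.dup
  end
end

# Simplified client using the same "@api_resources || api_resources!" memoization.
class SketchAPIClient
  def initialize(transport, api_version)
    @transport = transport
    @api_version = api_version
  end

  # Force-update the cached list.
  def api_resources!
    @api_resources = @transport.fetch_kinds
  end

  # Cached list; only fetched when nothing has been cached yet.
  def api_resources
    @api_resources || api_resources!
  end

  # Behaviour described in the thread: only the cache is consulted.
  def find_kind(kind)
    return kind if api_resources.include?(kind)

    raise UndefinedResource, "Unknown resource kind=#{kind} for #{@api_version}"
  end

  # Assumed alternative: on a cache miss, refresh once before giving up.
  def find_kind_with_refresh(kind)
    return kind if api_resources.include?(kind)
    return kind if api_resources!.include?(kind)

    raise UndefinedResource, "Unknown resource kind=#{kind} for #{@api_version}"
  end
end
```

With this shape, a caller that rescues the error and retries would see a kind that was registered after the first lookup, without having to know about the cache.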

jakolehm commented 5 years ago

> So, any new kinds created during the client life-time will not be accessible.

It will fetch new APIs (kinds) if they are not in the cache. I'm not sure what happens when K8s::Error::UndefinedResource is raised.

kke commented 5 years ago

I don't think client.api('v1').client_for_resource('x') can currently ever clear the cached api_resources of api_clients['v1'], so it will keep raising UndefinedResource forever, even if the resource becomes available.

jakolehm commented 5 years ago

client.api('v1').api_resources! should force-update api resources, right?

kke commented 5 years ago

True (but it won't be called as long as @api_resources is not nil)
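To illustrate why the forced refresh is never reached: with the `@x || x!` memoization idiom, the fetch only runs while the cached value is `nil` (or `false`). An empty array is truthy in Ruby, so even an empty cached list blocks the refetch. The class below is a hypothetical stand-in, not part of k8s-client.

```ruby
# Hypothetical memoizing wrapper mirroring "@api_resources || api_resources!".
class MemoizedList
  attr_reader :fetches

  # source is any callable that returns the current list.
  def initialize(source)
    @source = source
    @fetches = 0
  end

  # Force-update: always calls the source and overwrites the cache.
  def items!
    @fetches += 1
    @items = @source.call
  end

  # Cached read: calls the source only while @items is nil.
  def items
    @items || items!
  end
end

# The "server" starts with no kinds and gains one later.
server_kinds = []
list = MemoizedList.new(-> { server_kinds.dup })

list.items             # first read populates the cache with []
server_kinds << 'Issuer'
list.items             # still served from the cache, no second fetch
list.items!            # only the explicit force-update sees 'Issuer'
```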

kke commented 5 years ago

Or client.api('v1').api_resources = nil will make the next attempt repopulate the cache.
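On the caller side, a retry that refreshes the cache between attempts might be sketched as follows. This assumes only that a force-update method like `api_resources!` is available; `UndefinedKind`, `CachedKinds`, `lookup` and `with_cache_refresh` are hypothetical names used for illustration, and the real client is replaced by a stand-in.

```ruby
# Stand-in for K8s::Error::UndefinedResource (hypothetical).
class UndefinedKind < StandardError; end

# Hypothetical stand-in for an API client with a memoized resource list.
class CachedKinds
  # server_kinds is a shared, mutable array standing in for the API server.
  def initialize(server_kinds)
    @server_kinds = server_kinds
  end

  # Force-update the cache from the "server".
  def api_resources!
    @api_resources = @server_kinds.dup
  end

  # Cached read; fetched only while the cache is nil.
  def api_resources
    @api_resources || api_resources!
  end

  def lookup(kind)
    raise UndefinedKind, "Unknown resource kind=#{kind}" unless api_resources.include?(kind)

    kind
  end
end

# Runs the block; on UndefinedKind, force-refreshes the cache and retries,
# re-raising once the given number of attempts is exhausted.
def with_cache_refresh(api, attempts: 3)
  tries = 0
  begin
    tries += 1
    yield
  rescue UndefinedKind
    raise if tries >= attempts

    api.api_resources!
    retry
  end
end
```

A call such as `with_cache_refresh(api) { api.lookup('Issuer') }` would then succeed as soon as the kind exists on the server, instead of failing against a stale cache on every retry.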

jnummelin commented 5 years ago

closed in #943