DataONEorg / api-entrypoint

The DataONE Kubernetes cluster's API ingress controller component that is shared between microservices

Make 'external' services available to k8s #5

Open gothub opened 3 years ago

gothub commented 3 years ago

@csjx relevant to the recent conversation regarding api.dataone.org and k8s:

Services running on k8s may need to access services running outside the k8s cluster.

For example, the quality service needs to access the NSF Awards database (i.e. https://api.nsf.gov/services/v1). Currently this is accessed by hardcoding the URL in the quality server source code.

The k8s infrastructure supports a 'standard' way to make external services available to internal k8s services. The k8s service definition supports the service type 'ExternalName', which essentially makes the cluster DNS return a CNAME record pointing to the DNS name defined in the service. For example, an NSF awards service could be defined as:

apiVersion: v1
kind: Service
metadata:
  name: nsf
  namespace: metadig
spec:
  type: ExternalName
  externalName: api.nsf.gov

Internal services could then access this service using a domain name that is dynamically resolved using the internal k8s DNS server: http://nsf.metadig.svc.cluster.local/services/v1.

The main reason for using this type of service is so that if the external service URL changes, only the k8s service definition needs to change, and not source code that has the URL hard-coded.

The only problem with this mechanism is that the NGINX Inc. Ingress Controller only supports 'ExternalName' services in the NGINX Plus version (https://github.com/nginxinc/kubernetes-ingress/pull/485).

So, if this is something that should be added to the NCEAS k8s, then the community version of the NGINX Ingress Controller must be used (i.e. https://kubernetes.github.io/ingress-nginx/).

mbjones commented 3 years ago

@gothub interesting, thanks for the summary. Wouldn't it also be possible to have a simple configuration for the external URLs, rather than hardcoding them into code? That way, when a k8s service like metadig comes up, it can read its configuration to determine where to look for external services? That gives the flexibility to make changes without changing source code, but keeps things a bit simpler I think too. I suppose the config could be application-specific (e.g., read in a known properties file like Metacat does), or provided through the kubernetes configuration layer (e.g., https://kubernetes.io/docs/tasks/configure-pod-container/configure-pod-configmap/). Thoughts?

gothub commented 3 years ago

@mbjones both the quality engine and bookkeeper read a properties file at startup that contains definitions for services outside of k8s, for example

Using the k8s ExternalName service type defines a known name (the service name) that is the same for all services, so it is an interesting alternative to config files. Config files are working fine for now, but it's worthwhile considering other options as more services are added to k8s.

mbjones commented 3 years ago

Makes sense and I figured it was configurable, but that is at image creation time, right? Or are you putting the config files on a mounted volume and reading them at service start? The reason I proposed a ConfigMap is that it is a first-class citizen in the k8s world, externalizes the config to outside of the image, and is accessible to multiple pods and services in the k8s cluster without a volume mount. So I think it has many of the features of the ExternalName service without the problem of needing NGINX Plus support.
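As a rough illustration of the ConfigMap idea, a minimal sketch might look like the following. The ConfigMap name `external-services`, the key `NSF_AWARDS_URL`, the pod name, and the image are all hypothetical placeholders, not names used by metadig or bookkeeper; only the `metadig` namespace and the NSF URL come from this thread.

```yaml
# Hypothetical ConfigMap holding the locations of services outside the cluster.
apiVersion: v1
kind: ConfigMap
metadata:
  name: external-services        # assumed name, for illustration only
  namespace: metadig
data:
  NSF_AWARDS_URL: "https://api.nsf.gov/services/v1"
---
# Hypothetical pod that receives every key in the ConfigMap as an
# environment variable, so no volume mount is needed.
apiVersion: v1
kind: Pod
metadata:
  name: metadig-example          # placeholder pod name
  namespace: metadig
spec:
  containers:
    - name: app
      image: example/metadig-engine:latest   # placeholder image
      envFrom:
        - configMapRef:
            name: external-services
```

With something like this, a change to the external URL would only require updating the ConfigMap and restarting the pods that consume it. Note that ConfigMaps are namespaced, so pods in another namespace would need their own copy.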

gothub commented 3 years ago

Both metadig-engine and bookkeeper read a config file when the k8s service starts, so both depend on an external mount.

Yep, configmaps would work.
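For services that already expect a properties file at startup, one possible sketch is to store the file itself in a ConfigMap and project it into the container, which avoids depending on external storage. The file name `metadig.properties`, the property key `nsf.awards.url`, the mount path, the pod name, and the image below are assumptions for illustration, not the actual names these services use.

```yaml
# Hypothetical ConfigMap whose single key is a whole properties file.
apiVersion: v1
kind: ConfigMap
metadata:
  name: metadig-properties       # assumed name
  namespace: metadig
data:
  metadig.properties: |
    # hypothetical property key
    nsf.awards.url = https://api.nsf.gov/services/v1
---
# Hypothetical pod that sees the file at /etc/metadig/metadig.properties.
apiVersion: v1
kind: Pod
metadata:
  name: metadig-props-example    # placeholder pod name
  namespace: metadig
spec:
  containers:
    - name: app
      image: example/metadig-engine:latest   # placeholder image
      volumeMounts:
        - name: config
          mountPath: /etc/metadig            # assumed path
          readOnly: true
  volumes:
    - name: config
      configMap:
        name: metadig-properties
```

This still uses a volume in the pod spec, but the content comes from the k8s API rather than from an externally provisioned mount.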

mbjones commented 3 years ago

OK, I guess I shouldn't have taken your earlier statement about hardcoding literally ("Currently this is accessed by hardcoding the URL in the quality server source code."). Given that the URL is configurable, and is not hardcoded into the image, I don't see any need for immediate changes here. I could see a shift to ConfigMaps being good for flexible deploys without volume mounts, but the volume mounts work fine too. In any case, I don't see a need that would cause us to shift to using k8s ExternalName -- ConfigMaps would be easier when it comes to that.