hitachienergy / epiphany

Cloud and on-premises automation for Kubernetes-centered, industrial-grade solutions.
Apache License 2.0

[SPIKE] - Research on Istio installations in different scenarios (modules) #1821

Closed - mkyc closed this 3 years ago

mkyc commented 4 years ago

Describe the solution you'd like:
We want Istio as a separate module for Epiphany. This issue is about doing research on how we can implement it.

Additional context:
Proposed module name: m-istio. We would need to test two scenarios here:

This can be useful: https://istio.io/latest/docs/tasks/traffic-management/ingress/ingress-control/#determining-the-ingress-ip-and-ports

mkyc commented 4 years ago

IMHO we should test at least two scenarios when planning the Istio module:

rpudlowski93 commented 3 years ago

Summary of my spike:

Istio - an open source platform which allows us to run a service mesh for a distributed microservice architecture. It lets us connect, manage and secure connections between microservices and brings a lot of features such as load balancing, monitoring and service-to-service authentication without any changes in service code. Read more about Istio here.

We can install Istio using a few different approaches:

For testing purposes I decided to install Istio using the Istio Operator. An operator is a software extension to Kubernetes which has deep knowledge of how Istio deployments should look and how to react if any problem appears. It also makes it easy to perform upgrades and to automate tasks which a user/admin would normally have to execute manually.
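For reference, a minimal sketch of the operator-based installation (resource name and profile follow the pattern from the Istio docs; adjust to the target environment):

```shell
# Deploy the operator controller (runs in the istio-operator namespace):
istioctl operator init

# Apply a minimal IstioOperator resource; the operator reconciles it into a
# control plane (the "default" profile includes istiod and the ingress gateway):
kubectl create namespace istio-system
kubectl apply -f - <<EOF
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  namespace: istio-system
  name: example-istiocontrolplane
spec:
  profile: default
EOF
```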

As mentioned by @mkyc, two scenarios were tested using the Istio Operator and the booking sample application:

1) Installation on a managed Kubernetes cluster (Azure AKS) using the cloud provider's public IP mechanism. Generally, the Istio operator installs an ingress gateway which works as a LoadBalancer service; in the case of AKS, where Azure creates a load balancer in the worker resource group by default, the ingress gateway obtains an external public IP which we can use to access our application in the service mesh. The ingress gateway works at Layer 7 thanks to the VirtualService and Gateway configuration which we need to bind to it.
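For example, the public IP assigned by Azure can be read from the ingress gateway service (this follows the Istio docs linked above):

```shell
# On AKS the ingress gateway is a LoadBalancer service; the external IP
# assigned by Azure ends up in the service status:
kubectl -n istio-system get service istio-ingressgateway \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
```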

Thanks to VirtualService, Gateway and DestinationRule resources we can route traffic depending on the use case, e.g. by percentage (for example 60% of the traffic to Service-v1 and 40% to Service-v2), and use A/B testing, canary releases, etc.
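A sketch of such weight-based routing (service, subset and weight values are only illustrative, not taken from the tested deployment):

```shell
# 60/40 split between two versions of a hypothetical "reviews" service:
kubectl apply -f - <<EOF
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: reviews
spec:
  host: reviews
  subsets:
  - name: v1
    labels:
      version: v1
  - name: v2
    labels:
      version: v2
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
  - reviews
  http:
  - route:
    - destination:
        host: reviews
        subset: v1
      weight: 60
    - destination:
        host: reviews
        subset: v2
      weight: 40
EOF
```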

What was done?

Result?

(result screenshot attached)

2) Installation behind HAProxy with Istio using node ports - the case for an "unmanaged" cluster. Here the cluster was deployed in our standard Epiphany approach in the following configuration: 1 master, 2 worker nodes, 1 load balancer. Istio was installed using the Istio Operator as well, and the steps were exactly the same as in the previous case. In comparison to AKS, the Istio ingress gateway DID NOT get an external public IP. In this case the ingress gateway service still works, but we access the application in the service mesh using node ports or the public IP of an external load balancer. In Epiphany we use HAProxy, where all node ports should be added to the backend.
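A sketch of how the node-port wiring could look (worker IPs and the HAProxy backend name are placeholders; the port name comes from the default istio-ingressgateway service):

```shell
# Read the NodePort exposed by the ingress gateway service for HTTP traffic
# (the same approach applies to the "https" port):
HTTP_NODEPORT=$(kubectl -n istio-system get service istio-ingressgateway \
  -o jsonpath='{.spec.ports[?(@.name=="http2")].nodePort}')

# Example HAProxy backend pointing at the workers' NodePorts:
cat <<EOF >> /etc/haproxy/haproxy.cfg
backend istio_ingress_http
    balance roundrobin
    server worker1 10.0.0.11:${HTTP_NODEPORT} check
    server worker2 10.0.0.12:${HTTP_NODEPORT} check
EOF
```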

What was done?

Result?

(result screenshot attached)

How to install Istio step by step using the Istio Operator: link

I wouldn't recommend installing Istio using Terraform, since Terraform is a technology intended more for infrastructure. Moreover, Istio has been dropping Helm support in favor of istioctl, and installing via Helm eventually became unsupported. Helm support came back with the Istio 1.8 release, but it is still not a recommended way: Helm doesn't support the canary upgrade model, which is the recommended flow nowadays.

We can deploy Istio in different models, as far as HA is concerned:

Finally, in my opinion, we should create a new m-istio module, which could be used on top of an Epiphany cluster, and deploy Istio using kubectl and istioctl.

It is possible to install Istio using only kubectl, but it is not the recommended way; more here: link

rpudlowski93 commented 3 years ago
mkyc commented 3 years ago

Another topic to check is offline installation, but what I have read looks good enough to start designing requirements for the module.

rpudlowski93 commented 3 years ago

Update: I executed an Istio upgrade using the Istio Operator, from 1.7.5 to 1.8.0:
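A rough sketch of how that in-place upgrade looks with the operator (assuming the documented operator flow; exact steps should be verified against the 1.8 release notes):

```shell
# Download istioctl matching the target version:
curl -L https://istio.io/downloadIstio | ISTIO_VERSION=1.8.0 sh -
export PATH="$PWD/istio-1.8.0/bin:$PATH"

# Re-run operator init with the new istioctl; the upgraded operator then
# reconciles the existing IstioOperator resource to the 1.8.0 control plane:
istioctl operator init

# Verify control plane and data plane versions:
istioctl version
kubectl -n istio-system get pods
```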

Nice to know about performance:

erzetpe commented 3 years ago

Both scenarios look okay in my opinion. We should do what Mateusz mentioned.

seriva commented 3 years ago

Looks good to me. I would agree with: "Finally, in my opinion, we should create a new m-istio module, which could be used on top of an Epiphany cluster, and deploy Istio using kubectl and istioctl." I don't think we should waste time implementing this in old Epiphany.

As for offline installation, we need to check how to achieve this, as it will be important.

to-bar commented 3 years ago

I think we should take into account offline mode.

  1. Identify all needed binaries (requirements).
  2. Decide how to provide them; some options to consider:
     a) build into m-istio module's Docker image - need to check sizes and licenses
     b) add to epirepo created by epicli - by default the service is stopped after epicli completes, but loaded images are available in the Docker registry on the repository host
     c) use epicli prepare to download requirements and mount them into the module's container at runtime
     d) write m-repository module first
     e) implement Istio in classic Epiphany first, then migrate (reusing code) to a module
  3. Decide if we need/want Ansible to deploy Istio. If yes, then maybe we should think about using a Docker image with Ansible already installed (e.g. https://github.com/willhallonline/docker-ansible) as a base image for modules that will use Ansible.
mkyc commented 3 years ago
  3. Decide if we need/want Ansible to deploy Istio. If yes, then maybe we should think about using a Docker image with Ansible already installed (e.g. https://github.com/willhallonline/docker-ansible) as a base image for modules that will use Ansible.

Please remember that @seriva already prepared an Ansible image. My opinion is that it's a bit of an overhead to use Ansible here (we just need a simple retry of a command), but if we want to go with Ansible it's definitely too early to decide what we want to extract into some base image. Also please remember that for this topic - Docker image structure - there is the #1478 task defined.

sk4zuzu commented 3 years ago

d) write m-repository module first

I'm afraid that deciding what should go into the module and how it should be designed is a difficult question to answer. To make a local/private Docker registry usable in Kubernetes we not only need to provide credentials but also a proper certificate/domain, or set an insecure registry in Docker. We most likely shouldn't be using insecure registries (especially in managed k8s).
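For context, the "insecure registry" workaround mentioned here is just a Docker daemon setting like the one below (registry address is a placeholder), which is exactly what we would rather avoid:

```shell
# Marks the registry as insecure (plain HTTP / untrusted certs) on this host:
cat <<EOF > /etc/docker/daemon.json
{
  "insecure-registries": ["repository.internal:5000"]
}
EOF
systemctl restart docker
```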

There are 2 cases I can imagine:

  1. Air-gapped install in managed k8s would be easier to do with a proper cloud Docker registry (like GCR).
  2. Air-gapped install on-prem would require setting an insecure registry (or managing our own PKI infrastructure) in Docker (we don't provide full autoscaling functionality for on-prem clusters).

So in fact we need 2 such modules to cover both cases. :thinking:

rpudlowski93 commented 3 years ago

Looking briefly at the possibilities, offline installation looks possible in my opinion. We can get all artifacts from:

We can use istioctl operator init with the --hub flag, which will point to the local registry.

All possible flags: link
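So an air-gapped install could look roughly like this (registry address and tag are placeholders; the Istio images would have to be mirrored into that registry first):

```shell
# Point the operator (and the control plane images it deploys) at a local
# registry instead of docker.io/istio:
istioctl operator init \
  --hub repository.internal:5000/istio \
  --tag 1.8.0
```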

mkyc commented 3 years ago

I'm afraid that deciding what should go into the module and how it should be designed is a difficult question to answer.

I agree here with @sk4zuzu.

  1. Air-gapped install in managed k8s would be easier to do with a proper cloud Docker registry (like GCR).

I believe we could omit that scenario in the first version. @toszo can you comment?

  2. Air-gapped install on-prem would require setting an insecure registry (or managing our own PKI infrastructure) in Docker (we don't provide full autoscaling functionality for on-prem clusters).

Yeah, and I believe it's the first "offline" scenario we want to hit. Again, not sure if it's required in the first version (@toszo?). If we imagine it as a separate m-istio-registry it becomes:

atsikham commented 3 years ago

Agree with @mkyc that using Ansible for the Istio module might be an overhead. Managing our own PKI infrastructure for the internal registry doesn't look like the most important thing for the first version, but it should be taken into account. @sk4zuzu is it related to #1764 and could it be re-used from that task?

to-bar commented 3 years ago

About using Ansible - IMO it depends on the flow, and the flow seems not to be decided yet. I would almost always prefer "smart" Ansible over "limited" Bash, especially when network operations are needed (more than one host).

Each component can have its own repository/registry module (or embedded logic to create/update it), or we can have a generic m-repository module that is used by other modules to provide them with what they need. For example, the m-istio module "requests" a list of artifacts it needs, and other modules also "request" their deps. The m-repository module then prepares a requirements.txt for all modules that are configured (required) to do something during deployment. Then the m-repository module (or m-downloader) downloads all requirements (Internet access is needed here) and we have all the needed binaries. They need to be copied to the repository host (or m-downloader could be run directly on the repository VM in the online case) and served over an HTTP server (OS packages & binary files) and an image registry (Docker images). Then m-istio could do its job using files from the repository host.
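Purely hypothetical illustration of what such a "request" from m-istio could look like (file name, format and exact artifact list are invented for illustration, nothing is designed yet):

```shell
# Example requirements a module could hand over to m-repository/m-downloader:
cat <<EOF > m-istio/requirements.txt
# docker images
docker.io/istio/operator:1.8.0
docker.io/istio/pilot:1.8.0
docker.io/istio/proxyv2:1.8.0
# binaries
https://github.com/istio/istio/releases/download/1.8.0/istioctl-1.8.0-linux-amd64.tar.gz
EOF
```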

mkyc commented 3 years ago

I believe this task is done correctly and could be moved to the DoD phase.

seriva commented 3 years ago

Agreed.