apache / apisix-ingress-controller

proposal: New architecture of Apache APISIX Ingress controller #610

Closed: tao12345666333 closed this issue 1 year ago

tao12345666333 commented 3 years ago

In the current architecture of the Apache APISIX Ingress controller, we use the Apache APISIX Ingress controller as a control plane component. The user creates a specified type of CR in Kubernetes, the Apache APISIX Ingress controller converts it into a data structure that Apache APISIX can accept, and it creates, modifies, or deletes the corresponding objects by calling the Admin API. Such an architecture has the following advantages:

But such an architecture also has its disadvantages: users need to maintain a complete Apache APISIX cluster, which cannot be done simply by modifying the replicas field of the Apache APISIX Ingress controller.
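
For reference, a minimal sketch in Go of the translation step described above: render the JSON that APISIX expects from a CR and push it through the Admin API. The endpoint, admin key, and route payload below are illustrative (default dev values), not the controller's actual code.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
)

// pushRoute PUTs a rendered route object to the APISIX Admin API,
// the write path used after a CR has been converted to APISIX JSON.
func pushRoute(adminURL, adminKey, routeID string, body []byte) error {
	req, err := http.NewRequest(http.MethodPut,
		fmt.Sprintf("%s/apisix/admin/routes/%s", adminURL, routeID),
		bytes.NewReader(body))
	if err != nil {
		return err
	}
	req.Header.Set("X-API-KEY", adminKey) // Admin API authentication
	req.Header.Set("Content-Type", "application/json")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode >= 300 {
		return fmt.Errorf("admin API returned %s", resp.Status)
	}
	return nil
}

func main() {
	// A route as it might be rendered from an ApisixRoute CR.
	route := []byte(`{"uri":"/hello","upstream":{"type":"roundrobin","nodes":{"10.0.0.1:8080":1}}}`)
	if err := pushRoute("http://127.0.0.1:9180", "edd1c9f034335f136f87ad84b625c8f1", "1", route); err != nil {
		fmt.Println(err)
	}
}
```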

I hope to introduce an architecture similar to ingress-nginx, which is widely used in Kubernetes.

In this way, users can complete the deployment directly through a Pod. At the same time, users can scale simply by modifying the replicas parameter.
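
To illustrate, scaling then reduces to a one-field change, the same thing `kubectl scale --replicas=5` does; a sketch with client-go (the namespace and Deployment name are made up for the example):

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	ctx := context.Background()
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	cs, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}

	// Read the current scale of the controller Deployment and bump it.
	deploys := cs.AppsV1().Deployments("ingress-apisix")
	scale, err := deploys.GetScale(ctx, "apisix-ingress-controller", metav1.GetOptions{})
	if err != nil {
		panic(err)
	}
	scale.Spec.Replicas = 5
	if _, err := deploys.UpdateScale(ctx, "apisix-ingress-controller", scale, metav1.UpdateOptions{}); err != nil {
		panic(err)
	}
	fmt.Println("scaled to 5 replicas")
}
```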

Synced from the mailing list: https://lists.apache.org/thread.html/r929a6dfa9620d96874056750c6b07b8139b4952c8f168670553dfb86%40%3Cdev.apisix.apache.org%3E

gxthrj commented 3 years ago

Agree +1

juzhiyuan commented 3 years ago

+1

tokers commented 3 years ago

+1

github-actions[bot] commented 2 years ago

This issue has been marked as stale due to 90 days of inactivity. It will be closed in 30 days if no further activity occurs. If this issue is still relevant, please simply write any comment. Even if closed, you can still revive the issue at any time or discuss it on the dev@apisix.apache.org list. Thank you for your contributions.

tao12345666333 commented 2 years ago

In the simplest terms, to make it easier to scale and manage, etcd must be removed. So there is a high probability that we will no longer use the Admin API.

I guess it will be:

    APISIX Ingress (gRPC server)  -->  gRPC client
                                          |
                                          +--> APISIX standalone mode

APISIX may become a child process managed by another component we implement.

tokers commented 2 years ago

What about using an etcd adapter to let the custom component support the etcd APIs, so that we can avoid any changes to APISIX?

tao12345666333 commented 2 years ago

APISIX standalone mode does a full update of the configuration on every change, which will have some impact on health checks and caching.

sober-wang commented 2 years ago

> In the simplest terms, to make it easier to scale and manage, etcd must be removed. So there is a high probability that we will no longer use the Admin API.
>
> I guess it will be:
>
>     APISIX Ingress (gRPC server)  -->  gRPC client
>                                           |
>                                           +--> APISIX standalone mode
>
> APISIX may become a child process managed by another component we implement.

Why use gRPC? gRPC is more complex than a REST API; we would have to define more and more protobuf files and increase the code complexity.

I recommend using the default APISIX Admin API.

tao12345666333 commented 2 years ago

> In the simplest terms, to make it easier to scale and manage, etcd must be removed. So there is a high probability that we will no longer use the Admin API.

If there is no storage component, then we will drop the Admin API. @sober-wang

Using gRPC allows the server to actively push configuration. Even introducing xDS here is an option.

> gRPC is more complex than a REST API; we would have to define more and more protobuf files and increase the code complexity.

It is normal for new features to introduce some code changes.

While the current mode is really simple, obviously I want it to be more powerful. I won't hold it back for fear of needing to write code or of adding complexity.

macmiranda commented 1 year ago

I'm new to APISIX, so I started reviewing the Architecture and the Deployment modes, alongside the documentation of the Ingress Controller itself (since my goal is to manage APISIX via Kubernetes CRDs and use it as an alternate IngressClass for K8s Ingresses).

If my understanding is correct, the APISIX Ingress Controller makes the APISIX Control Plane (and the Admin API for that matter) almost entirely obsolete (at least in the context being discussed here).

If the actual configuration of APISIX is now done via Custom Resources (which are ultimately persisted in the etcd of the cluster itself) why would we need another etcd cluster to persist configuration data? I'm guessing that the current architecture of the Ingress Controller was meant to be a functional adapter for the APISIX Admin API without having to introduce any significant changes to the latter whilst still making it possible to use APISIX in the context of a Kubernetes Cluster (and its resources).

While it does what it's meant to do, it also introduces certain problems, some of which deserve serious consideration:

It would be great if the Ingress Controller could talk directly to the Data Plane in standalone mode.

tao12345666333 commented 1 year ago

You are right!

That's the main reason why I came up with this idea.

This will be my third priority, I will deal with #1465 first and then release v1.6. Then I will start working on this one.

It won't be long before I post my thoughts 💡 here to discuss with you all.

tao12345666333 commented 1 year ago

[screenshot attached]

I have a new idea.

Since APISIX v3 has added a gRPC client capability, and some optimizations have been made to the CP/DP deployment model in APISIX v3, we can apply this model in APISIX Ingress: implement an etcd-like gRPC server in apisix-ingress-controller, let it serve as the control plane, and have APISIX, which is actually the data plane, connect to it through gRPC.

In this way, the data plane APISIX is exactly the same as normal APISIX, not in Standalone mode, so you can use all the capabilities of APISIX without any modification to APISIX.
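
To make the idea concrete, here is a minimal sketch of such an etcd-like gRPC server, assuming the published go.etcd.io/etcd/api/v3 types. Only Range is fleshed out; Watch, leases, auth, and the rest of the etcd surface are omitted, and the key/value pair is illustrative.

```go
package main

import (
	"context"
	"net"
	"strings"

	pb "go.etcd.io/etcd/api/v3/etcdserverpb"
	"go.etcd.io/etcd/api/v3/mvccpb"
	"google.golang.org/grpc"
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

// kvShim answers the KV Range calls an APISIX data plane issues against
// etcd, serving configuration that the controller translated from
// Kubernetes resources. It is deliberately read-only.
type kvShim struct {
	store map[string]string // e.g. "/apisix/routes/1" -> route JSON
}

func (s *kvShim) Range(ctx context.Context, r *pb.RangeRequest) (*pb.RangeResponse, error) {
	resp := &pb.RangeResponse{Header: &pb.ResponseHeader{Revision: 1}}
	for k, v := range s.store {
		// Treat every request as a prefix query; real etcd key-range
		// semantics (RangeEnd handling) are skipped in this sketch.
		if strings.HasPrefix(k, string(r.Key)) {
			resp.Kvs = append(resp.Kvs, &mvccpb.KeyValue{
				Key: []byte(k), Value: []byte(v), ModRevision: 1,
			})
		}
	}
	resp.Count = int64(len(resp.Kvs))
	return resp, nil
}

// The data plane never writes, so the mutating calls are stubbed out.
func (s *kvShim) Put(context.Context, *pb.PutRequest) (*pb.PutResponse, error) {
	return nil, status.Error(codes.Unimplemented, "read-only control plane")
}
func (s *kvShim) DeleteRange(context.Context, *pb.DeleteRangeRequest) (*pb.DeleteRangeResponse, error) {
	return nil, status.Error(codes.Unimplemented, "read-only control plane")
}
func (s *kvShim) Txn(context.Context, *pb.TxnRequest) (*pb.TxnResponse, error) {
	return nil, status.Error(codes.Unimplemented, "read-only control plane")
}
func (s *kvShim) Compact(context.Context, *pb.CompactionRequest) (*pb.CompactionResponse, error) {
	return nil, status.Error(codes.Unimplemented, "read-only control plane")
}

func main() {
	lis, err := net.Listen("tcp", ":12379") // illustrative port
	if err != nil {
		panic(err)
	}
	srv := grpc.NewServer()
	pb.RegisterKVServer(srv, &kvShim{store: map[string]string{
		"/apisix/routes/1": `{"uri":"/hello","upstream_id":"1"}`,
	}})
	_ = srv.Serve(lis)
}
```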

WDYT?

macmiranda commented 1 year ago

Sounds good to me, though I'm not the most familiar with the APISIX architecture, especially when it comes to the gRPC components. One question though: would the Ingress controller also need some type of state store, or would it work fine just reading state from the Kubernetes resources? Also, I'm not sure how the client would authenticate to the server. Would mTLS be an option?

tao12345666333 commented 1 year ago

In the new architecture, the ingress controller is a stateless component.

It can just read and store resource status in Kubernetes resources.

For authentication, we can add certificates to protect the connection.
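
Something along these lines, for example: a sketch using standard grpc-go mutual TLS, where the cert/key/CA paths are placeholders for files mounted from Kubernetes Secrets.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"net"
	"os"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials"
)

// newMTLSServer builds a gRPC server that requires data-plane clients to
// present a certificate signed by our CA, i.e. mutual TLS.
func newMTLSServer(certFile, keyFile, caFile string) (*grpc.Server, error) {
	cert, err := tls.LoadX509KeyPair(certFile, keyFile)
	if err != nil {
		return nil, err
	}
	caPEM, err := os.ReadFile(caFile)
	if err != nil {
		return nil, err
	}
	pool := x509.NewCertPool()
	pool.AppendCertsFromPEM(caPEM)

	creds := credentials.NewTLS(&tls.Config{
		Certificates: []tls.Certificate{cert},
		ClientAuth:   tls.RequireAndVerifyClientCert, // reject unauthenticated clients
		ClientCAs:    pool,
	})
	return grpc.NewServer(grpc.Creds(creds)), nil
}

func main() {
	srv, err := newMTLSServer("/certs/tls.crt", "/certs/tls.key", "/certs/ca.crt")
	if err != nil {
		panic(err)
	}
	lis, err := net.Listen("tcp", ":12379")
	if err != nil {
		panic(err)
	}
	_ = srv.Serve(lis)
}
```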

caibirdme commented 1 year ago

Is it possible for the controller to just modify the apisix.yaml ConfigMap, so that the standalone APISIX instances can watch these changes?
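
For reference, that flow would be a plain ConfigMap write from the controller; a sketch with client-go, where the namespace, names, and the rendered YAML are illustrative (note APISIX's standalone file format requires the trailing `#END` marker):

```go
package main

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

const rendered = `routes:
  - uri: /hello
    upstream:
      type: roundrobin
      nodes:
        "10.0.0.1:8080": 1
#END
`

func main() {
	ctx := context.Background()
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	cs, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}

	// Write the rendered apisix.yaml into the ConfigMap the standalone
	// gateway pods mount; the kubelet syncs the file into the pod and
	// APISIX picks up the change on its next poll.
	cm, err := cs.CoreV1().ConfigMaps("ingress-apisix").
		Get(ctx, "apisix-config", metav1.GetOptions{})
	if err != nil {
		panic(err)
	}
	if cm.Data == nil {
		cm.Data = map[string]string{}
	}
	cm.Data["apisix.yaml"] = rendered
	if _, err := cs.CoreV1().ConfigMaps("ingress-apisix").
		Update(ctx, cm, metav1.UpdateOptions{}); err != nil {
		panic(err)
	}
}
```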

tao12345666333 commented 1 year ago

> Is it possible for the controller to just modify the apisix.yaml ConfigMap, so that the standalone APISIX instances can watch these changes?

@caibirdme no, this is not designed for standalone mode.

Are you using standalone mode? I want to understand your use case.

sober-wang commented 1 year ago

> [screenshot attached]
>
> I have a new idea.
>
> Since APISIX v3 has added a gRPC client capability, and some optimizations have been made to the CP/DP deployment model in APISIX v3, we can apply this model in APISIX Ingress: implement an etcd-like gRPC server in apisix-ingress-controller, let it serve as the control plane, and have APISIX, which is actually the data plane, connect to it through gRPC.
>
> In this way, the data plane APISIX is exactly the same as normal APISIX, not in standalone mode, so you can use all the capabilities of APISIX without any modification to APISIX.
>
> WDYT?

It looks like APISIX pulls its configuration from apisix-ingress-controller. Will the APISIX team members do that work? Are you sure?

Maybe I'm misunderstanding what you mean, so can you clarify the direction of the data flow?

tao12345666333 commented 1 year ago

Currently, APISIX v3 already supports a decoupled mode: DP and CP are separate.

The CP provides an etcd-like service.

In the new architecture of APISIX Ingress, we only need to let the ingress controller assume the role of the CP; the APISIX on the DP side plays only the role of the DP.

caibirdme commented 1 year ago

> Are you using standalone mode? I want to understand your use case.

I'm using APISIX in standalone mode to work as the ingress gateway. I don't want to use Ingress, because it's only designed for HTTP, and I don't want to deploy an etcd cluster either. Now I have a Deployment with 3-10 APISIX replicas, and they're configured by apisix.yaml (a ConfigMap). When I want to update apisix.yaml, I just update the ConfigMap in the Helm chart and upgrade it. About a minute later the ConfigMap is updated in the pod, and APISIX watches that change right away. By doing this, I don't need to learn the apisix-ingress CRDs, I don't need an etcd cluster, and I follow the GitOps manner: all the changes are managed by git. After reading the APISIX docs, I can configure my ingress as both an L4 proxy and an L7 proxy.

mchtech commented 1 year ago

Let me discuss a scenario.

If these four conditions are met:

  1. the k8s control plane works well
  2. apps are doing rolling updates
  3. the ingress controller cannot sync ingress rules (or syncs with a long delay), because:
     3.1 the ingress controller is in a crash loop
     3.2 or its nodes are down
     3.3 or it cannot connect to the apiserver (node network problem)
     3.4 or the APISIX etcd is down (old architecture)
  4. the k8s/APISIX administrators don't notice what happened

then the data plane (upstream) will reference obsolete pod IPs, which leads to:

  1. app A's pod IP is recycled by the CNI IPAM, so the redundancy of app A is reduced
  2. or app A's pod has been terminated and its IP re-assigned to an app B pod, so requests to app A will randomly return HTTP 404

How about an architecture where DP and CP run in the same pod? I think it can minimize the risk (3.2 and 3.3 would only affect the corresponding APISIX DP, not all APISIX DPs).

zhuoyang commented 1 year ago

Is there any decision on how to implement this feature? Folks at my current company are willing to spend some engineering time on this.

tao12345666333 commented 1 year ago

@zhuoyang I'm glad to hear this news.

Currently, the plan in #1803 is to implement an etcd server.

In fact, there are still many things we need to do. I will write a detailed technical plan and break down the tasks as soon as possible. Hope we can work together to complete this feature.

zhuoyang commented 1 year ago

cool! let's keep in touch

nagidocs commented 1 year ago

> can configure my ingress as both an L4 proxy and an L7 proxy

Can you please share the code somewhere for how this standalone-mode setup is done? I mean how you are referring to backend Services in different namespaces and passing the apisix.yaml config in the Deployment, as I can't find that capability in the present Helm chart of APISIX. Do we need to manually tweak the Deployment afterwards?

seethedoor commented 1 year ago

Great idea! This can elevate apisix-ingress-controller to a core position.

In fact, the current approach is equivalent to storing two sets of data in two etcd instances: one for the Ingress CRD and the other for Apisix's own data.

But when the ingress-controller goes down, it would have to re-fetch, regenerate, and redistribute the route entries upon restart. In scenarios with a large number of Ingresses, might this put more pressure on the apiserver, or result in longer recovery times?

tao12345666333 commented 1 year ago

> In fact, the current approach is equivalent to storing two sets of data in two etcd instances: one for the Ingress CRD and the other for Apisix's own data.

I didn't fully understand your meaning; are you referring to the new architecture or the existing one?

seethedoor commented 1 year ago

> > In fact, the current approach is equivalent to storing two sets of data in two etcd instances: one for the Ingress CRD and the other for Apisix's own data.
>
> I didn't fully understand your meaning; are you referring to the new architecture or the existing one?

The existing one, and I mean your new design would avoid this. That is the benefit.

tao12345666333 commented 1 year ago

https://github.com/apache/apisix-ingress-controller/releases/tag/v1.7.0

v1.7.0 has been released with this feature. Thanks all!!! I will close this one.

mfractal commented 1 year ago

Thanks @tao12345666333! Is there documentation ready around the new feature?

tao12345666333 commented 1 year ago

@mfractal FYI https://github.com/apache/apisix-ingress-controller/blob/master/docs/en/latest/composite.md

mlasevich commented 11 months ago

I must be missing something, but doesn't APISIX already support reading/monitoring its config from a YAML file? That YAML file can/should be mounted as a ConfigMap by the APISIX pods, so all the ingress controller needs to do is monitor Ingress/CRD records and update the ConfigMap as necessary. The new architecture with its mimicking of an etcd service seems like WAAAY overkill, no?

mfractal commented 11 months ago

> I must be missing something, but doesn't APISIX already support reading/monitoring its config from a YAML file? That YAML file can/should be mounted as a ConfigMap by the APISIX pods, so all the ingress controller needs to do is monitor Ingress/CRD records and update the ConfigMap as necessary. The new architecture with its mimicking of an etcd service seems like WAAAY overkill, no?

Not if you want to use CRDs

mlasevich commented 11 months ago

> Not if you want to use CRDs

Can you please say more?

Is there something in CRDs that is not available in the configuration file in standalone mode?

lpiob commented 11 months ago

I partially agree with @mlasevich: since Apisix can be controlled by a ConfigMap, this ConfigMap could be produced and updated by the ingress-controller, and we would not need a mock etcd. It does not rule out CRDs.

However, I assume that you can't have everything at once and I am able to accept mock-etcd in a transitional state.

I have a question about the pod configuration. The composite.md documentation shows an example in which one pod has both apisix-ingress-controller and apisix-gateway. This certainly made testing easier, but I'm not sure whether it's intended to be used that way "in production".

The same scheme was duplicated within the actual ingress-controller chart, making the chart from apisix-gateway now unnecessary and causing the ingress-controller to be scaled simultaneously with apisix-gateway.

Was this intended? If not, I am able to prepare a fix for the chart.

tao12345666333 commented 11 months ago

It seems that there are still some doubts about this, so let me answer the questions involved.

1. Why not use standalone mode directly?

We know that APISIX has a standalone mode (obviously), but APISIX's standalone mode covers only a subset of its functions. It cannot provide all the capabilities of APISIX (especially the dynamic abilities that are important for production environments). This is also why we spent a lot of time and energy implementing an etcd-mocked server: we hope to provide the complete capability of APISIX.

2. About Pod configuration

The essence of this architecture is to simplify deployment, eliminate the need for users to maintain etcd services, and make scaling easier. Therefore, in the current state, we recommend deploying them in the same Pod.

3. About Helm chart

As the configuration of APISIX is required in this mode, the APISIX configuration file has been added to the chart. That said, not all APISIX configurations are fully supported at this stage. Of course, even with the introduction of these new architecture features, we will not abandon the existing architecture, because it has its own advantages. Therefore, we will not disrupt any existing deployment methods or Helm charts.

We look forward to hearing more feedback and test results from the community.

lpiob commented 10 months ago

> About Pod configuration: The essence of this architecture is to simplify deployment, eliminate the need for users to maintain etcd services, and make scaling easier. Therefore, in the current state, we recommend deploying them in the same Pod.

I understand the intention of these changes: they simplify a lot, but at the same time lose a LOT of functionality that we had with a separate Apisix deployment.

For example:

Is the desired way forward to add this missing functionality back to the apisix-ingress-controller Helm chart?

mfractal commented 10 months ago

> @mfractal FYI https://github.com/apache/apisix-ingress-controller/blob/master/docs/en/latest/composite.md

Sir, thank you! I am finally getting to this now :) Is it possible to perform the installation of this configuration via the Helm chart, or only via the instructions provided in the link?

nightguide commented 9 months ago

Hi all!

I want to share my thoughts on this matter. The idea of the composite architecture is good, but it has certain disadvantages.

The data plane in composite mode only receives changes to its gateway configuration; the controller does not manage the configuration of the surrounding Kubernetes resources: Deployments, Services, ConfigMaps, etc.

For example:

1) No way to change the number of data plane replicas via an Apisix Ingress Controller Custom Resource (CR)

2) No way to configure an additional TCP/UDP proxy, which additionally requires a change to the K8s Service resource

3) No way to connect external configurations from a ConfigMap

4) No way to easily and simply configure and update the APISIX version on standalone instances

I think it would be nice to look towards the Operator SDK, which would allow the use of a CR to configure and deploy standalone instances of Apisix; see the sketch below.

This approach would not only allow you to configure standalone instances, but would also allow you to manage the Kubernetes resources for Apisix.

I think this approach would be more Kubernetes-native.
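
To illustrate, purely hypothetical kubebuilder-style types; every name here is invented for the example, and nothing like this exists in the project today:

```go
package v1alpha1

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// ApisixStandalone is an invented CR for the operator idea above: one
// resource that an operator reconciles into the Deployment, Service, and
// ConfigMaps for a standalone APISIX fleet.
type ApisixStandalone struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`
	Spec              ApisixStandaloneSpec `json:"spec"`
}

type ApisixStandaloneSpec struct {
	Replicas      int32                `json:"replicas"`                // point 1: scale the data plane
	Version       string               `json:"version"`                 // point 4: pin or upgrade the APISIX image
	ExtraPorts    []corev1.ServicePort `json:"extraPorts,omitempty"`    // point 2: additional TCP/UDP proxy ports on the Service
	ConfigMapRefs []string             `json:"configMapRefs,omitempty"` // point 3: external configuration to mount
}
```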

Have you thought about this?