futurewei-cloud / alcor

Alcor: Cloud native SDN platform powered by Kubernetes and Istio
MIT License
cloud-native control-plane kubernetes management-plane microservices sdn-controller

This project has been archived by the owner, who is no longer providing support. The project remains available to authorized users on a "read only" basis.


Alcor

A Hyperscale Cloud Native SDN Platform


Introduction

Cloud computing means scale and on-demand resource provisioning. As more enterprise customers migrate their on-premises workloads to the cloud, a cloud provider's user base could grow tenfold in just a few years. Growth at that rate requires a cloud virtual networking system with a more scalable and extensible design. As part of a community effort, Alcor is an open-source, cloud-native platform that provides a highly available, high-performance, large-scale virtual networking control plane and management plane with a high resource provisioning rate.

Alcor leverages the latest SDN and container technologies, together with an advanced distributed system design, to support deployment, configuration, and scale-out of millions of VMs and containers. It is built on a distributed microservices architecture that provides a uniform way to secure, connect, and monitor control plane microservices, with fine-grained control of service-to-service communication including load balancing, retries, failovers, and rate limits. Alcor also unifies VM and container networking management, and its application-aware fast path ensures ultra-low latency and high throughput when provisioning containers and serverless applications.

The following diagram illustrates the high-level architecture of the Alcor control plane.

Alcor architecture

Detailed design docs:

Key Features

Cloud-Native Architecture

Alcor leverages Kubernetes and Istio to build its distributed microservices architecture. Depending on the control plane load, the Alcor Controller scales out to multiple instances, each of which is a Kubernetes application. Going one step further, each application contains various infrastructure microservices that manage different types of network resources.

Throughput-Optimal Design

Alcor optimizes throughput top-down at every system layer, including the API, the Controller, the messaging mechanism, and the host agent. For example, a batch API supports deploying a group of ports with a single POST call, and a per-host message batching mechanism is proposed that can deliver groups (potentially thousands) of resources to the same host in one shot.
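The per-host message batching idea can be illustrated with a minimal sketch. This is not Alcor's implementation: the `PortConfig` record, its field names, and the `batch_by_host` helper are hypothetical and exist only to show how a group of ports submitted together could be collapsed into one message per destination host.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PortConfig:
    """Hypothetical port record; field names are illustrative only."""
    port_id: str
    host_id: str
    ip_address: str


def batch_by_host(ports):
    """Group port configurations so that each host receives a single message."""
    batches = defaultdict(list)
    for port in ports:
        batches[port.host_id].append(port)
    # Build one outbound message per host, carrying every port destined for it.
    return {host: {"host_id": host, "ports": group} for host, group in batches.items()}


# Three ports submitted together, two of which land on the same host.
ports = [
    PortConfig("p1", "host-a", "10.0.0.2"),
    PortConfig("p2", "host-b", "10.0.0.3"),
    PortConfig("p3", "host-a", "10.0.0.4"),
]
messages = batch_by_host(ports)
```

In this sketch the three ports produce two messages rather than three, so the number of messages grows with the number of hosts touched, not with the number of resources.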

Fast Resource Provisioning

To support time-critical applications, Alcor enables a direct communication channel from the Controller to the Host Agent. This channel bypasses message queueing systems such as Kafka and uses gRPC to offer a 10x latency improvement compared to Kafka.
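As a rough illustration of choosing between the two paths, the sketch below routes time-critical requests to a direct channel and everything else to a queue. The `Dispatcher` class, its `send` method, and the `time_critical` flag are assumptions made for this example, and plain in-memory lists stand in for the real gRPC channel and message queue so the snippet stays self-contained.

```python
from dataclasses import dataclass, field


@dataclass
class Dispatcher:
    """Hypothetical dispatcher; the two lists stand in for real transports."""
    direct_sent: list = field(default_factory=list)  # stand-in for the direct gRPC channel
    queued: list = field(default_factory=list)       # stand-in for a Kafka-style queue

    def send(self, host_id, payload, time_critical=False):
        """Route a payload to the direct path or the queued path."""
        if time_critical:
            # Fast path: skip the queue and hand the payload straight to the host agent.
            self.direct_sent.append((host_id, payload))
            return "direct"
        # Normal path: enqueue the payload for asynchronous delivery.
        self.queued.append((host_id, payload))
        return "queued"


dispatcher = Dispatcher()
fast = dispatcher.send("host-a", {"port_id": "p1"}, time_critical=True)
slow = dispatcher.send("host-b", {"port_id": "p2"})
```

A real controller would replace the list appends with calls into its transport clients; the point of the sketch is only the routing decision.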

Planned Features

A list of planned features is included in our current roadmap. Some highlighted items:

  1. Major VPC features (e.g., security group, ACL, QoS)
  2. Controller services break-down
  3. Compatibility with OVS
  4. Controller grey release
  5. Performance comparison with Neutron and many more...

Repositories

The Alcor project is divided across a few GitHub repositories.

Directory Structure

The main repository of the Alcor Regional Controller is organized as follows: