megaease / easemesh

A service mesh implementation for connecting, controlling, and observing services in Spring Cloud.
https://megaease.com/easemesh
Apache License 2.0

[service registry discovery]support EG as the easemesh service registry center #1

Closed benja-wu closed 3 years ago

benja-wu commented 3 years ago

Background

According to the MegaEase ServiceMesh requirements [1], one major duty of the Control Plane (EG-master) is to handle service registry requests. The complete service registry routine also needs the help of the Data Plane (EG-sidecar).

Proposal

Registry metadata

{
   // provided by the client in the registry request
   "serviceName":"order",
   "instanceID": "c9ecb441-bc73-49b0-9bc1-a558716825e1",
   "IP":"10.168.11.3",
   "port":"63301",

   // found in the meshService spec
   "tenant":"takeaway",

   // depends on the instance heartbeat, can be modified by API
   "status":"UP",
   // has a default value, can be modified by API
   "leases":1929499200,
   // recorded by the system, read only
   "registryTime": 1614066694
}

The JSON structure above is the registry info of one service instance of the order service in the takeaway tenant. The instance is identified by a UUID. By default, its lease stays valid for ten years. The port value is the listening port of the sidecar's Ingress HTTP server.

ETCD data layout

* How many services and instances does one tenant have in the mesh? Say we have one tenant called *tenant-001* with two services: **order** and **address**:

        $ ./etcdctl get "/mesh/tenants" --prefix
        tenant-001
        {"desc":"this is a demo tenant","createdTime": 1614066694}
        $ ./etcdctl get "/mesh/tenants/tenant-001"
        ["order","address"]
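
A rough Go sketch of how a reader of this layout might build the tenant key and decode the two values from the example. The names `tenantKeyFormat`, `TenantInfo`, `tenantKey`, `parseTenantInfo`, and `parseServiceNames` are assumptions made for illustration; the sketch works on raw bytes and leaves the actual etcd read out.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// tenantKeyFormat follows the key shape used in the etcdctl example.
const tenantKeyFormat = "/mesh/tenants/%s" // + tenantName

// TenantInfo matches the tenant value shown in the example output.
type TenantInfo struct {
	Desc        string `json:"desc"`
	CreatedTime int64  `json:"createdTime"`
}

// tenantKey returns the etcd key for one tenant.
func tenantKey(name string) string {
	return fmt.Sprintf(tenantKeyFormat, name)
}

// parseTenantInfo decodes a tenant description value.
func parseTenantInfo(value []byte) (TenantInfo, error) {
	var info TenantInfo
	err := json.Unmarshal(value, &info)
	return info, err
}

// parseServiceNames decodes the list of service names stored for a tenant.
func parseServiceNames(value []byte) ([]string, error) {
	var names []string
	err := json.Unmarshal(value, &names)
	return names, err
}

func main() {
	info, err := parseTenantInfo([]byte(`{"desc":"this is a demo tenant","createdTime": 1614066694}`))
	if err != nil {
		panic(err)
	}
	names, err := parseServiceNames([]byte(`["order","address"]`))
	if err != nil {
		panic(err)
	}
	fmt.Println(tenantKey("tenant-001"), info.Desc, names)
}
```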



4. EG-master watches the heartbeat records of every service instance in the mesh. If no valid heartbeat record is found, EG-master sets the instance's status field to OUT_OF_SERVICE.

### Data Plane
1. The sidecar initializes Ingress/Egress after being injected into the Pod, then retries registering itself until it succeeds.
2. EG-sidecar accepts the Eureka/Consul [2][3] service register protocol from the business process. EG-sidecar doesn't depend on the business process's register request.
3. Sequence diagram:
![Service-Registry-Register Sequence](https://user-images.githubusercontent.com/9316473/112252214-a5877380-8c97-11eb-94e8-c465b1b4f9ac.png)
4. EG-sidecar polls the business process's health API (probably with the help of JavaAgent), then reports this heartbeat to ETCD.
5. EG-sidecar watches its own service instance registry record and the registry records of the other services it relies on. Once a record has been modified by EG-master, EG-sidecar applies the change to its corresponding EG-HTTPserver or EG-pipeline, e.g., if EG-master updates one instance's status to OUT_OF_SERVICE, the sidecar deletes that record from the EG-pipeline's backend filter.
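
Step 5 can be sketched in Go as a small reconcile function. `Backends` is a hypothetical stand-in for the EG-pipeline's backend list and `RegistryRecord` is a trimmed-down registry record; neither is the real EG type, and the watch on ETCD that would deliver the changed records is omitted.

```go
package main

import (
	"fmt"
	"sort"
)

const (
	StatusUp           = "UP"
	StatusOutOfService = "OUT_OF_SERVICE"
)

// RegistryRecord is the subset of the registry metadata needed here.
type RegistryRecord struct {
	InstanceID string
	IP         string
	Port       string
	Status     string
}

// Backends is a hypothetical stand-in for the pipeline's backend list:
// instance ID -> "IP:port".
type Backends map[string]string

// Apply brings the backend list in line with one changed registry record.
func (b Backends) Apply(r RegistryRecord) {
	if r.Status == StatusUp {
		b[r.InstanceID] = r.IP + ":" + r.Port
		return
	}
	delete(b, r.InstanceID) // any non-UP status, e.g. OUT_OF_SERVICE
}

// Addrs returns the current backend addresses in sorted order.
func (b Backends) Addrs() []string {
	addrs := make([]string, 0, len(b))
	for _, addr := range b {
		addrs = append(addrs, addr)
	}
	sort.Strings(addrs)
	return addrs
}

func main() {
	b := Backends{}
	b.Apply(RegistryRecord{InstanceID: "order-1", IP: "10.168.11.3", Port: "63301", Status: StatusUp})
	b.Apply(RegistryRecord{InstanceID: "order-2", IP: "10.168.11.4", Port: "63301", Status: StatusUp})
	// EG-master marks order-1 as OUT_OF_SERVICE; the sidecar drops it.
	b.Apply(RegistryRecord{InstanceID: "order-1", IP: "10.168.11.3", Port: "63301", Status: StatusOutOfService})
	fmt.Println(b.Addrs()) // prints [10.168.11.4:63301]
}
```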

## Reference
[1] Mesh requirements: https://docs.google.com/document/d/19EiR-tyNJS75aotvLqYWjsYK7VqyjO7DCKrYjktfg-A/edit
[2] Eureka Golang registry structure: https://github.com/ArthurHlt/go-eureka-client/blob/3b8dfe04ec6ca280d50f96356f765edb845a00e4/eureka/requests.go#L38
[3] Consul catalog registry structure: https://pkg.go.dev/github.com/hashicorp/consul/api@v1.7.0#CatalogRegistration

zhao-kun commented 3 years ago

I have a suggestion:

In the data plane, when EG-sidecar receives a registry request, it doesn't register to etcd immediately. The sidecar just returns a successful result to the real service, regardless of the result of the etcd registration. The sidecar registration should follow a level-triggered design, i.e., it is an asynchronous registration.
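
A minimal Go sketch of this idea, under the assumption that "level triggered" means a loop that repeatedly compares the desired registrations with what has actually been written, instead of reacting once to the request. `Registrar`, `NewRegistrar`, `Sync`, `Run`, and the `put` callback (standing in for the etcd write) are all hypothetical names.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
	"time"
)

var errUnavailable = errors.New("store unavailable")

// Registrar keeps the desired registrations and reconciles them with the store.
type Registrar struct {
	mu      sync.Mutex
	desired map[string]bool       // instances the business process asked to register
	done    map[string]bool       // instances already written to the store
	put     func(id string) error // hypothetical store write, e.g. an etcd put
}

func NewRegistrar(put func(id string) error) *Registrar {
	return &Registrar{desired: map[string]bool{}, done: map[string]bool{}, put: put}
}

// Register only records the intent and returns at once, so the business
// process always sees success regardless of the store's state.
func (r *Registrar) Register(id string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.desired[id] = true
}

// Sync runs one reconcile pass and returns how many registrations are still pending.
func (r *Registrar) Sync() int {
	r.mu.Lock()
	defer r.mu.Unlock()
	pending := 0
	for id := range r.desired {
		if r.done[id] {
			continue
		}
		if err := r.put(id); err != nil {
			pending++ // retried on the next pass
			continue
		}
		r.done[id] = true
	}
	return pending
}

// Run calls Sync on every tick until ctx is cancelled.
func (r *Registrar) Run(ctx context.Context, interval time.Duration) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			r.Sync()
		}
	}
}

func main() {
	attempts := 0
	r := NewRegistrar(func(id string) error {
		attempts++
		if attempts < 3 {
			return errUnavailable // simulate two failed writes
		}
		return nil
	})
	r.Register("order-1") // returns immediately
	for r.Sync() > 0 {
	}
	fmt.Println("registered after", attempts, "attempts") // prints registered after 3 attempts
}
```

In a sidecar, `Run` would be started in its own goroutine so that a failed write is simply picked up again on the next tick.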


ps: s/pooling/polling/g

benja-wu commented 3 years ago

Got it. EG-sidecar's asynchronous registration will be more resilient when the network inside a Pod becomes unstable or something else happens.

ps: replacing done.

benja-wu commented 3 years ago

After discussion with @xxx7xxxx and @zhao-kun, the original ETCD storage layout

    meshServicesPrefix              = "/mesh/services/%s"               // + serviceName (its value is the basic mesh spec)
    meshServicesResiliencePrefix    = "/mesh/services/%s/resilience"    // + serviceName (its value is the mesh resilience spec)
    meshServicesCanaryPrefix        = "/mesh/services/%s/canary"        // + serviceName (its value is the mesh canary spec)
    meshServicesLoadBalancerPrefix  = "/mesh/services/%s/loadBalancer"  // + serviceName (its value is the mesh loadBalance spec)
    meshServicesSidecarPrefix       = "/mesh/services/%s/sidecar"       // + serviceName (its value is the sidecar spec)
    meshServicesObservabilityPrefix = "/mesh/services/%s/observability" // + serviceName (its value is the observability spec)

will be merged into one spec in ETCD

    meshServicesPrefix              = "/mesh/services/%s"

benja-wu commented 3 years ago

Finished and merged into EG mesh branch already.