radius-project / radius

Radius is a cloud-native, portable application platform that makes app development easier for teams building cloud-native apps.
https://radapp.io
Apache License 2.0

Improvements to networking, routes, and hostnames #6375

Open jkotalik opened 2 years ago

jkotalik commented 2 years ago

Note: this version of the top comment was written by @rynowak based on the original text from @jkotalik. You can see the original by navigating to the history.

Making networking simpler

The goal of this document is to provide an alternative way of specifying directions of communication and connections without overloading the user with extra resource types.

This covers both east-west (within the app) and north-south (to the internet) networking. Also of note, we're talking about user-defined communication between the application's components. Communication with cloud resources like databases is already served well by existing Radius features, and is out of scope for this proposal.

We have two primary goals for the networking features described here:

- Service discovery
- Documenting the communication between components

What is hard today

Radius has evolved significantly over time to go from having support for cycles of communication, to adding routes to fix cycles, to additional route types, gateway support, and more.

We have separate route resource types for HTTP routes and Dapr Service Invoke routes. We haven't yet defined routes for other protocols like gRPC, HTTPS, or TLS but they are gaps based on the current design. These route resources are used to 'link' containers with the things that connect to them.

We have received extensive feedback on this:

What we've heard is that, for an HTTP-based app, users need to define two resources per container: the container and the route. This doesn't feel good. Similar or competing technologies define their networking model without a resource like routes.

We believe that there can be some awesome improvements to networking that can make getting started with Radius super easy.

Thought Experiment

What if we removed the requirement to have routes? How would two containers communicate with each other? Let's start with the following example with a frontend and backend definition with each wanting to connect to each other.

To start us off here's an example using the current design:

```bicep
resource frontend 'Applications.Core/containers@2022-03-15-privatepreview' = {
  name: 'frontend'
  location: 'global'
  properties: {
    application: app.id
    container: {
      image: 'frontend'
      ports: {
        web: {
          containerPort: 80
        }
      }
    }
    connections: {
      backend: {
        source: backendRoute.id
      }
    }
  }
}

resource backendRoute 'Applications.Core/httpRoutes@2022-03-15-privatepreview' = {
  name: 'backend'
  properties: {
    port: 80
  }
}

resource backend 'Applications.Core/containers@2022-03-15-privatepreview' = {
  name: 'backend'
  location: 'global'
  properties: {
    application: app.id
    container: {
      image: 'backend'
      ports: {
        web: {
          containerPort: 80
          provides: backendRoute.id
        }
      }
    }
  }
}
```

This demonstrates our two goals for the feature.


Let's see if we can simplify this design without losing this information.

We can do this by removing routes and enabling a connection directly to containers:

```bicep
resource frontend 'Applications.Core/containers@2022-03-15-privatepreview' = {
  name: 'frontend'
  location: 'global'
  properties: {
    application: app.id
    container: {
      image: 'frontend'
      ports: {
        web: {
          containerPort: 80
        }
      }
    }
    connections: {
      backend: {
        source: backend.id
        route: 'web' // Specifying the route name is optional
      }
    }
  }
}

resource backend 'Applications.Core/containers@2022-03-15-privatepreview' = {
  name: 'backend'
  location: 'global'
  properties: {
    application: app.id
    container: {
      image: 'backend'
      ports: {
        web: {
          containerPort: 80
        }
      }
    }
  }
}
```

Now we removed the need for routes as a concept. We still accomplish Service Discovery and Documenting the Communication through the link between frontend -> backend.

However, we've introduced another problem. Now frontend has a deployment-time dependency on backend, e.g. backend must be deployed before frontend. This is problematic because it's slower than necessary and because it cannot support cycles. For a small application like the sample that we're working with, these are not actually issues and you may actually prefer this approach.

The advantage of linking via resource IDs is that all of the connections are strongly validated by the control-plane. However, as applications grow in scale (number of containers) you start to care more about the deployment time (O(n) where n is the length of the longest dependency chain) and cycles become much harder to reason about.

We need an alternative approach that does not influence the deployment order of the resources. This is actually the problem that we introduced routes to solve in the first place. The extra indirection provided by routes means that they don't influence the deployment order between containers. The indirect nature is also why users find them confusing.
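To make the ordering problem concrete, here is a minimal sketch in Python. It is not Radius code: `deployment_waves` is a hypothetical helper, and it assumes a deployment engine that deploys everything whose dependencies are satisfied in parallel, wave by wave. The number of waves equals the length of the longest dependency chain, and a cycle makes ordering impossible.

```python
from graphlib import CycleError, TopologicalSorter


def deployment_waves(depends_on):
    """Group resources into waves that can be deployed in parallel.

    depends_on maps each resource name to the set of names it must wait for.
    Raises CycleError when the dependencies contain a cycle.
    """
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


# frontend connects to backend by resource ID, so backend must deploy first.
by_resource_id = {"frontend": {"backend"}, "backend": set()}
print(deployment_waves(by_resource_id))

# Without resource-ID links there is no ordering constraint at all.
unlinked = {"frontend": set(), "backend": set()}
print(deployment_waves(unlinked))

# Two containers that reference each other by resource ID cannot be ordered.
cyclic = {"frontend": {"backend"}, "backend": {"frontend"}}
try:
    deployment_waves(cyclic)
except CycleError:
    print("cycle detected")
```

The first call yields two waves, the second a single wave, and the third fails, which is the behavior an indirection has to avoid.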

Relying on DNS-based Service Discovery

We have a better option that users are already familiar with, and if done right it can meet all of our goals - DNS-based Service Discovery.

DNS-based Service Discovery is already commonly used and well understood in practice for microservices applications.

Hostnames are an application-defined logical name so most developers are comfortable hardcoding them in code or deployment assets. This allows an application to be loosely-coupled.


In the Radius context, we already allocate hostnames (using Kubernetes Services) for every container that defines a route. Our existing route-based service discovery mechanism is actually a double-indirection - we allow you to look up the DNS hostname rather than hardcoding it.

However, we can encourage users to build directly on DNS-based service discovery as a way of decoupling deployments. This is already a standard practice and is leveraged by our comparable and competing projects.

The unique twist that we can add to Radius is to support connections via DNS names. This allows applications to remain decoupled while still documenting their communication patterns.

Here's an example using DNS-based Service Discovery:

```bicep
resource frontend 'Applications.Core/containers@2022-03-15-privatepreview' = {
  name: 'frontend'
  location: 'global'
  properties: {
    application: app.id
    container: {
      image: 'frontend'
      ports: {
        web: {
          containerPort: 80
        }
      }
    }
    connections: {
      backend: {
        source: 'http://backend'
      }
    }
  }
}

resource backend 'Applications.Core/containers@2022-03-15-privatepreview' = {
  name: 'backend'
  location: 'global'
  properties: {
    application: app.id
    container: {
      image: 'backend'
      ports: {
        web: {
          containerPort: 80
        }
      }
    }
  }
}
```

Notice that there's no reference between frontend and backend save for the DNS name (in a URL). These containers no longer have a specified deployment order, which means they could be deployed from different Bicep files, CI/CD pipelines, or repositories.

Choosing between Resource-ID-based and DNS-based Service Discovery

Users then have a choice between these two mechanisms with only slight tradeoffs.

Resource-ID-based:

DNS-based:

Intuition says that beginner users and simple applications will prefer Resource-ID-based service discovery because it's easy to get right. Advanced users or those building large-scale applications (by number of microservices) will gravitate towards DNS-based service discovery. It's likely that these users already have experience with this technique and will simply default to it.

Summary of Changes

Supporting URLs and Hostnames in connections

This proposal adds support for URLs and Hostnames as a source in the connections section of a compute resource. This allows a compute resource (container) to document a connection that refers to:

Examples:

resource frontend 'Applications.Core/containers@2022-03-15-privatepreview' = {
  name: 'frontend'
  location: 'global'
  properties: {
    ...
    connections: {
      backend: { // Referencing a microservice in the same application
        source: 'http://backend' 
      }
      'another-app': { // Referencing a microservice in another application
        source: 'https://inventory.${anotherapp.properties.dnsSuffix}'
      }
      'some-external-saas': { // Referencing arbitrary external URL
        source: 'mytenant.externalsaas.contoso.com'
      }
    }
  }
}

In the case of a connection to another application, the user is responsible for crafting the right hostname/URL. See the next section.

Based on the type of the source (URL or Hostname) we'll inject a standard set of environment variables. The reason for supporting both URLs and Hostnames is to be flexible. We want to encourage users to document what they can or what they are willing to. For some kinds of networking communication hostnames are more often used by SDKs than URLs are. For example, Dapr service invocation uses a hostname-like construct but doesn't use URLs.

| Source Type | Example | Environment Variables |
| --- | --- | --- |
| URL | `http://backend` | `CONNECTIONS_<name>_SCHEME=http`<br>`CONNECTIONS_<name>_HOST=backend`<br>`CONNECTIONS_<name>_PORT=80`<br>`CONNECTIONS_<name>_URL=http://backend` |
| Hostname | `mytenant.externalsaas.contoso.com` | `CONNECTIONS_<name>_HOST=mytenant.externalsaas.contoso.com` |
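The mapping from a source to its environment variables can be sketched as follows. This is a Python illustration only, not the Radius implementation. `connection_env` is a hypothetical helper, and two details are assumptions that the proposal does not specify: the connection name is upper-cased with hyphens turned into underscores, and the port falls back to 80 for `http` and 443 for `https` when the URL omits it.

```python
from urllib.parse import urlsplit

# Assumption: only these schemes get an implicit port.
DEFAULT_PORTS = {"http": 80, "https": 443}


def connection_env(name, source):
    """Return the environment variables for one connection source."""
    # Assumption: normalize the connection name into a valid variable name.
    prefix = f"CONNECTIONS_{name.upper().replace('-', '_')}"
    if "://" not in source:
        # A bare hostname carries no scheme or port, so only HOST is known.
        return {f"{prefix}_HOST": source}
    parts = urlsplit(source)
    env = {
        f"{prefix}_SCHEME": parts.scheme,
        f"{prefix}_HOST": parts.hostname,
        f"{prefix}_URL": source,
    }
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    if port is not None:
        env[f"{prefix}_PORT"] = str(port)
    return env


print(connection_env("backend", "http://backend"))
print(connection_env("some-external-saas", "mytenant.externalsaas.contoso.com"))
```

A URL source yields all four variables, while a hostname source yields only the `_HOST` variable, matching the two rows of the table.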

Adding dnsSuffix to Application

As mentioned above, a non-local reference (different application) will require the user to know the correct DNS-suffix. We don't provide this from the API today in a portable way (requires coupling to Kubernetes). To address this we should add a readonly dnsSuffix property to the application resource.

Deploying services as part of containers

Since we're obsoleting routes, we need the creation of a container with ports to create the Kubernetes Service.

This is a simplification that has no drawback in Radius' main scenarios. There are some rare cases where an application author might want to manually create a Service or configure a different type of Service. We should feel confident addressing these scenarios with additional configuration knobs or a way to opt-out based on feedback.

Improved container ports

Since we're obsoleting routes, we need to merge their existing functionality into container. The updated design for ports will include the port field (defaults to the value of containerPort) as well as an optional scheme field.

Here's an annotated example of defining a port:

resource backend 'Applications.Core/containers@2022-03-15-privatepreview' = {
  name: 'backend'
  location: 'global'
  properties: {
    application: app.id
    container: {
      image: 'backend'
      ports: {
        web: {
          containerPort: 5000
          port: 80 // optional: only needs to be set when a value different from containerPort is desired
          protocol: 'TCP' // optional: defaults to TCP
          scheme: 'http' // optional: used to build URLs, defaults to http or https based on port
        }
      }
    }
  }
}

We're including more customizable capabilities in the definition of a port now, but the set of required fields is technically the same as today. We expect many users to set containerPort to a framework-determined value (eg: 5000 for Flask) and port to 80 as this is a common pattern for HTTP microservices.
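The defaulting rules described for ports can be sketched in Python as below. `normalize_port` is a hypothetical helper, not part of Radius. The rule for picking the scheme is an assumption: the proposal only says the default is "http or https based on port", so this sketch treats 443 as `https` and everything else as `http`.

```python
def normalize_port(definition):
    """Fill in the optional fields of a single container port definition."""
    container_port = definition["containerPort"]  # the only required field
    port = definition.get("port", container_port)  # defaults to containerPort
    protocol = definition.get("protocol", "TCP")  # defaults to TCP
    # Assumption: 443 implies https, any other port implies http.
    scheme = definition.get("scheme", "https" if port == 443 else "http")
    return {
        "containerPort": container_port,
        "port": port,
        "protocol": protocol,
        "scheme": scheme,
    }


print(normalize_port({"containerPort": 5000, "port": 80}))
print(normalize_port({"containerPort": 443}))
```

Only `containerPort` has to be supplied, which keeps the required surface the same as today.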

Addition of routes section to compute resources

Additionally, we should add a new routes section to compute resources as a way to read information about the ports, URLs, and other networking capabilities of the compute resource. The scenario where this is useful is where a microservice wants to read (in Bicep or Terraform) the data associated with another microservice's networking behaviors.

resource frontend 'Applications.Core/containers@2022-03-15-privatepreview' = {
  name: 'frontend'
  location: 'global'
  properties: {
    application: app.id
    container: {
      image: 'frontend'
      env: {
        // Easy to work with routing data programmatically
        LOGIN_URL: '${backend.properties.routes.web.url}/login'
      }
      ports: {
        web: {
          containerPort: 80
        }
      }
    }
    connections: {
      backend: {
        source: 'http://backend'
        route: 'web'
      }
    }
  }
}

In this example the frontend is building a more complex URL based on the URL of backend. This is a companion to the other features we offer. For example

The properties.container.ports section is specialized for how containers define networking; it's not specialized for how a linked microservice discovers those capabilities. Moreover, as we define additional compute resources in the future, they will have their own unique ways of defining their networking capabilities, so we need something to act as a uniform interface for reading. Lastly, Dapr (service invocation) defines networking capabilities but isn't configured as part of properties.container.ports. It should still be part of our service discovery interface, and we may see more examples of this pattern.

The most important consumer of routes is Radius itself. When a connection is defined between containers, Radius will use the information in routes to power the service discovery contract. This allows different kinds of compute resources, and different sources of routing information to remain decoupled. For example, imagine using a hypothetical 3rd party AWS Lambda integration RP - a container can connect to a lambda if it can provide routes without the Radius container having any prior knowledge of lambda. This is what is meant by uniform interface.

Here's an example of what routes would look like:

Note: this is API output in JSON (not Bicep)

{
  "name": "backend",
  "type": "Applications.Core/containers",
  "properties": {
    "container": { ... },
    "routes": {
      "web": {
        "port": 80,
        "protocol": "TCP",
        "scheme": "http",
        "hostname": "backend.icecream-store.svc.cluster.local",
        "url": "http://backend.icecream-store.svc.cluster.local"
      }
    }
  }
}

Note: UCP does not currently track Capabilities for resource types. Capabilities is an important enabling feature for us to build the uniform interfaces that will enable extensibility.
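One way a `routes` entry like the one in the JSON output could be derived from a port definition is sketched below in Python. `build_route` is a hypothetical helper. It assumes the hostname is the resource name joined with the application's DNS suffix, and that the port is left out of the URL when it is the scheme's usual default. Neither rule is stated by the proposal.

```python
# Assumption: ports that can be omitted from a URL for a given scheme.
DEFAULT_PORTS = {"http": 80, "https": 443}


def build_route(resource_name, dns_suffix, port, scheme="http", protocol="TCP"):
    """Build one entry of the read-only `routes` section."""
    # Assumption: hostname is "<resource name>.<application DNS suffix>".
    hostname = f"{resource_name}.{dns_suffix}"
    if DEFAULT_PORTS.get(scheme) == port:
        authority = hostname
    else:
        authority = f"{hostname}:{port}"
    return {
        "port": port,
        "protocol": protocol,
        "scheme": scheme,
        "hostname": hostname,
        "url": f"{scheme}://{authority}",
    }


route = build_route("backend", "icecream-store.svc.cluster.local", 80)
print(route["url"])
```

Because consumers read only the resulting entry, any compute resource that can produce this shape can take part in service discovery.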

Supporting connections to compute resources

connections will support creating a connection from a compute resource to another compute resource. This is similar to how routes are used today.

Since a compute resource can publish multiple routes a connection can choose to either specify a named route or not. As with the behavioral differences between a URL-based connection and Hostname-based connection, we will be flexible in what we accept.

Example specifying a route:

resource frontend 'Applications.Core/containers@2022-03-15-privatepreview' = {
  name: 'frontend'
  location: 'global'
  properties: {
    container: { ... }
    connections: {
      backend: {
        source: backend.id
        route: 'web' // Optional
      }
    }
  }
}

Example without specifying a route:

resource frontend 'Applications.Core/containers@2022-03-15-privatepreview' = {
  name: 'frontend'
  location: 'global'
  properties: {
    container: { ... }
    connections: {
      backend: {
        source: backend.id
      }
    }
  }
}

Similarly to the differences between URLs and Hostnames, the environment variables injected by Radius will also differ, and provide as much information as possible.

| Source Type | Environment Variables |
| --- | --- |
| With Route | `CONNECTIONS_<name>_SCHEME=http`<br>`CONNECTIONS_<name>_HOST=backend`<br>`CONNECTIONS_<name>_PORT=80`<br>`CONNECTIONS_<name>_URL=http://backend` |
| Without Route | `CONNECTIONS_<name>_HOST=backend` |

Changes to gateways

The gateway resource enables north-south communication (from the internet). Gateways are integrated with routes today and so need to change to accommodate this proposal.

Gateways should support flexible options like connections do. Gateways will support both referencing a specific port of a container via resource ID (similar to current functionality) and referencing a hostname:port (authority) in string form.

Here's an example showing a gateway using both a resource ID and authority for destination routes.

resource gateway 'Applications.Core/gateways@2022-03-15-privatepreview' = {
  name: 'gtwy-gtwy'
  location: location
  properties: {
    application: app.id
    routes: [
      {
        path: '/'
        destination: frontend.id
        route: 'web'
      }
      {
        path: '/backend'
        destination: 'backend:80'
      }
    ]
  }
}
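How a gateway might tell the two destination forms apart is sketched below in Python. `parse_destination` is a hypothetical helper covering only the two forms described in this section. It assumes resource IDs always begin with `/`, and it rejects anything that is not a plain `hostname:port` authority.

```python
def parse_destination(destination):
    """Classify a gateway route destination as a resource ID or an authority."""
    # Assumption: resource IDs are absolute paths and start with "/".
    if destination.startswith("/"):
        return {"kind": "resourceId", "id": destination}
    host, sep, port = destination.rpartition(":")
    if not sep or not host or "/" in host or not port.isdigit():
        raise ValueError(f"expected a resource ID or 'hostname:port', got {destination!r}")
    return {"kind": "authority", "host": host, "port": int(port)}


print(parse_destination("backend:80"))
```

With the authority form, the gateway has no resource reference to validate against, so it can only check the shape of the string.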

Removing existing route resources

Once the above changes are implemented then the existing route types can be removed wholesale. We'll bring back routes with an operational focus when we're ready to build features like traffic splitting.

### Tasks
- [ ] https://github.com/project-radius/radius/issues/6099
AaronCrawfis commented 2 years ago

Just chatted with @rynowak and thought through the following:

Proposals

  1. Change Applications.Core/containers to Applications.Core/services, which now models a long-running, stateless container.
  2. Move the lifecycle of the Kubernetes service(s) to be the same as the Applications.Core/services port(s).
  3. Change Applications.Core/httpRoutes to be configuration of the Applications.Core/services port(s) and Kubernetes service(s), instead of owning the lifecycle of the Kubernetes service.
  4. Add a service property within the connection properties of Applications.Core/services, which accepts a string
    • The schema of this string will contain container name, app name, port, and protocol
    • For example, backend:80/http
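A minimal Python sketch of parsing the example form `backend:80/http` follows. `ServiceRef` and `parse_service` are hypothetical names. The proposal says the schema also carries the app name but does not show where it goes in the string, so this sketch handles only the `<name>:<port>/<protocol>` shape of the example.

```python
from typing import NamedTuple


class ServiceRef(NamedTuple):
    name: str
    port: int
    protocol: str


def parse_service(value):
    """Parse a '<name>:<port>/<protocol>' service string."""
    authority, slash, protocol = value.partition("/")
    name, colon, port = authority.partition(":")
    if not (slash and colon and name and protocol and port.isdigit()):
        raise ValueError(f"expected '<name>:<port>/<protocol>', got {value!r}")
    return ServiceRef(name, int(port), protocol)


print(parse_service("backend:80/http"))
```

Rejecting malformed strings early matters here because a string reference, unlike a resource ID, is not validated by the control plane.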

Example: simple case

resource backend 'Applications.Core/services@2022-03-15-privatepreview' = {
  name: 'backend'
  location: 'global'
  properties: {
    application: app.id
    container: {
      image: 'backend'
      ports: {
        web: {
          containerPort: 5000
          port: 80
        }
      }
    }
  }
}

resource frontend 'Applications.Core/services@2022-03-15-privatepreview' = {
  name: 'frontend'
  location: 'global'
  properties: {
    application: app.id
    container: {...}
    connections: {
      backend: {
        service: 'backend:80/http' 
      }
    }
  }
}

Grow-up story

But what about traffic splitting, dependency cycles, etc.?

We could still use HTTP Routes as they're used today to provide configuration and connections:

resource backendRoute 'Applications.Core/httpRoutes@2022-03-15-privatepreview' = {
  name: 'backend-route'
  location: 'global'
  properties: {
    application: app.id
    properties: {
      // Traffic splitting configuration
    }
  }
}

resource backend 'Applications.Core/services@2022-03-15-privatepreview' = {
  name: 'backend'
  location: 'global'
  properties: {
    application: app.id
    container: {
      image: 'myimage'
      ports: {
        web: {
          containerPort: 5000
          port: 80
          provides: backendRoute.id
        }
      }
    }
  }
}

resource frontend 'Applications.Core/services@2022-03-15-privatepreview' = {
  name: 'frontend'
  location: 'global'
  properties: {
    application: app.id
    container: {
      image: 'frontend'
    }
    connections: {
      service: {
        source: backendRoute.id
        // service: 'backend:80/http' still works as well, because the Kubernetes service hasn't changed
      }
    }
  }
}
jkotalik commented 2 years ago

I think this is a great start and reflects the direction we'd like to take in Bicep. I think we need to do some more investigation from the dev side to understand some of these implications.

rynowak commented 2 years ago

For posterity @jkotalik and I also spent some time discussing routes and whether we could do funny things with their lifecycle as a solution. We came up with some plausible ideas but nothing amazing, lots of open questions in how it would work.

I think the ideas where routes are an optional feature are going to work better than getting weird with route lifecycles.

rynowak commented 1 year ago

Working on an update here. @AaronCrawfis and I spent some time refining these ideas and narrowing down the field of options. I'm going to update the main comment since right now it documents a bunch of different ideas, and we're landing on a few specific ones.

Reshrahim commented 1 year ago

Summarizing the changes targeted for public release

Users can hardcode or construct the URL in environment variables for now. Connections integration is needed for the application graph, which is not prioritized for the public release, so it can be done after the public release.

Post public release

rynowak commented 1 year ago

This is not done.

AaronCrawfis commented 1 year ago

Reopening till the docs are merged

rynowak commented 1 year ago

@farazmsiddiqi 's project was to do about half the work described here. We should keep this open even after the docs are updated.

nithyatsu commented 1 year ago

@AaronCrawfis @rynowak The gateway resource now accepts a URL as the destination. The upside is that the container and service no longer have to be created before the gateway, since we do not use a resource ID. The downside is that the service might not exist when the gateway is created, causing gateway validation to fail with a service-not-found error. We can get around this issue by injecting a dependency as shown below, as @vinayada1 suggested.

Is this approach OK, or should we ignore the service-not-found error so that we get more parallelization?

resource gateway 'Applications.Core/gateways@2022-03-15-privatepreview' = {
  name: 'ssl-gtwy-gtwy'
  location: location
  properties: {
    application: app.id
    tls: { 
      sslPassthrough: true 
    } 
    routes: [
      {
        // Referencing the container's name and port here ensures the gateway's HTTPProxy objects are created
        // after the container and service. Otherwise, we could have directly plugged in the name and port.
        destination: 'https://${frontendContainer.name}:${frontendContainer.properties.container.ports.web.port}'
      }
    ]
  }
}