cloudfoundry / diego-notes

Diego Notes
Apache License 2.0

Generalize Routing in Diego #8

Closed onsi closed 9 years ago

onsi commented 9 years ago

I've added a draft proposal around extending Diego's routing support: https://github.com/pivotal-cf-experimental/diego-dev-notes/blob/master/proposals/routing.md

(let's discuss!)

jbayer commented 9 years ago

Now requests to admin.foo.com will be balanced across all containers, but requests to 0.admin.foo.com will only go to the container at index 0.

does this imply some numeric domains may have an issue with this feature? for example, what if my sub-domain is 1 and my domain is example.com for 1.example.com. would the system be able to figure this out such that 0.1.example.com should be a match for instance index 0 but that 1.example.com should load balance?

fraenkel commented 9 years ago

In your example, { 4000: ["foo.com", "bar.com"], 5000: ["admin.foo.com"] }

Are these hostnames or domain names? When I get a request for a.foo.com, do I go to port 4000 or fail?

Given the stories that are already lined up for the Gorouter, it would seem that more data will have to flow with some of these definitions. Given that the structure is untyped, how would I introduce https only for one of those routes? Just thinking about extensibility.

jbayer commented 9 years ago

@fraenkel "Are these hostnames or domain names? When I get a request for a.foo.com, do I go to port 4000 or fail?" I believe the answer is 404.

cf-buildpacks-eng commented 9 years ago

Hi guys. What follows is informed by my background as a relational database bigot.

With Routes, is there a reason we would use plain strings instead of URIs?

I realise that plain strings give users the most flexibility, but in theory any routable object or document can be addressed with a URI. URIs also give structure, type checking and surrounding utilities.

And they will make it harder to abuse what is essentially a typeless, presumably unbounded K-V store. I'm always wary of stringly typed systems. :|

Cheers,

JC.

On Thu, Jan 8, 2015 at 12:05 AM, James Bayer notifications@github.com wrote:

@fraenkel "Are these hostnames or domain names? When I get a request for a.foo.com, do I go to port 4000 or fail?" I believe the answer is 404.

"how would I introduce https only for one of those routes?" yes, an https-only route is one potential future attribute. other potential future attributes on a route include a "route service" (ask dieu) and tcp (not sure how this would be expressed yet).


d commented 9 years ago

JC: it's unclear how URIs would help routing. Specifically, the URI spec does not have the routing use case in mind: it presumes that nodes are "addressed" in the host part, without requiring protocols (or "scheme"s in URI) to contain the host information. For example, tcp://jesse.awesome.cloud:8088 will be unroutable. And to stop derailing the more important conversation here, I can chat with you offline...

Jesse

onsi commented 9 years ago

@jbayer re subdomains

the naive implementation would be to explicitly tell the router that:

example.com => [ip_0:port_0, ip_1:port_1]
0.example.com => [ip_0:port_0]
1.example.com => [ip_1:port_1]

So if a user had a subdomain that looked like 1.example.com the router would be told:

1.example.com => [ip_0:port_0, ip_1:port_1]
0.1.example.com => [ip_0:port_0]
1.1.example.com => [ip_1:port_1]

The problem that could arise is if the user with example.com opted into the per-index routing and then some other user came along and requested 1.example.com -- that would be a collision and the behavior would be undefined (i.e. bad). The onus is on the consumer (CC) to guard against these sorts of collisions and refuse to honor a request for 1.example.com. Since this isn't a CF-oriented feature (yet) it's probably OK? If we do end up wanting this then CC will have to be taught about these collisions.
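
A minimal Python sketch of the naive expansion and the collision check described above. The function names and the backend addresses are hypothetical, chosen only for illustration; nothing here is taken from the router or CC code.

```python
def expand_routes(hostname, backends, per_index=False):
    """Build the router entries for one hostname.

    backends is a list of "ip:port" strings ordered by instance index.
    With per_index=True, "<index>.<hostname>" entries are added as well.
    """
    table = {hostname: list(backends)}
    if per_index:
        for index, backend in enumerate(backends):
            table[f"{index}.{hostname}"] = [backend]
    return table


def find_collisions(existing, requested):
    """Hostnames a new request would claim that are already registered."""
    return sorted(set(existing) & set(requested))


# The example.com owner opts into per-index routing with two instances.
table = expand_routes(
    "example.com", ["10.0.0.1:61001", "10.0.0.2:61002"], per_index=True
)

# Another user then asks for 1.example.com: the consumer should refuse it.
collisions = find_collisions(
    table, expand_routes("1.example.com", ["10.0.0.9:61009"])
)
```

In this sketch the check is a plain set intersection over hostnames, which is the kind of guard the consumer (CC) would need before accepting the second request.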

onsi commented 9 years ago

@fraenkel I agree re extensibility

The beauty of this (IMO) is that Diego doesn't care about the schema at all. It's purely a contract between two Diego consumers (CC and Router) and Diego simply acts as a dumb conduit.

We could go whole hog:

[
    {"port": 4000, "route": "foo.com"},
    {"port": 4000, "route": "bar.com", "ssl": true},
    {"port": 5000, "route": "admin.foo.com", "route_to_instance": true},
    {"port": 1138, "tcp": true, "incoming_port": 62312}
]

(the tcp example is for James -- we'd talked through some ideas around maybe checking out a port and then having the router know that a given incoming port maps onto a given application).

I can update the doc with this if we like it. It's verbose but substantially more future-proof.

We'll also need to enforce reasonable limits on how large these entries can be...

onsi commented 9 years ago

And yes a.foo.com would be a 404. That's how the router works today and this isn't changing that behavior.
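
To make the 404 case concrete, here is a small Python sketch of exact-hostname matching over entries shaped like the verbose schema above. `resolve_port` is a hypothetical helper, not a Gorouter function, and the sketch assumes matching is by exact hostname only.

```python
ROUTES = [
    {"port": 4000, "route": "foo.com"},
    {"port": 4000, "route": "bar.com"},
    {"port": 5000, "route": "admin.foo.com"},
]


def resolve_port(host, routes):
    """Return the container port for an exact hostname match.

    Returns None when nothing matches, which the router would answer
    with a 404. No wildcard or parent-domain fallback is attempted.
    """
    for entry in routes:
        if entry.get("route") == host:
            return entry["port"]
    return None
```

Under that assumption, a.foo.com does not fall back to the foo.com entry on port 4000; it simply has no match.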

frodenas commented 9 years ago

I'm confused about the cf-router's schema in DesiredLRP.Routes. In the original proposal there is an example like this:

"cf-router":{
    4000: {"routes": ["foo.com", "bar.com"]},
    5000: {"routes": ["admin.foo.com"], "route_to_instances": true},
}

But later in this comment I see:

"cf-router":[
    {"port": 4000, "route": "foo.com"},
    {"port": 4000, "route": "bar.com", "ssl": true},
    {"port": 5000, "route": "admin.foo.com", "route_to_instance": true},
    {"port": 1138, "tcp": true, "incoming_port": 62312}
]

I'm in favor of the latter approach: having the port as the hash key gives less flexibility, for example if one day we decide to route all LRP ports instead of a specific port.

Regarding the tcp key that appears in the previous example, I would prefer to add a protocol key instead. This would also allow us to support udp:

"cf-router":[
    {"port": 4000, "route": "foo.com"}, # no "protocol" implies "tcp"
    {"port": 4000, "route": "bar.com", "protocol": "udp"}
]

Another concern about the routes is what happens if I assign the same route to different ports for the same LRP:

"cf-router":[
    {"port": 4000, "route": "foo.com"},
    {"port": 5000, "route": "foo.com"}
]

Is the router going to load balance the requests to the same LRP but on different ports, or should this be a CC constraint?
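
The thread does not settle this question. If it did end up as a constraint on the CC side, a check along these lines could flag the case. This is a hedged Python sketch; `routes_on_multiple_ports` is a hypothetical name, and entries without a "route" key (such as the tcp example) are simply skipped.

```python
from collections import defaultdict


def routes_on_multiple_ports(entries):
    """Map each route that appears on more than one port to its sorted ports."""
    ports_by_route = defaultdict(set)
    for entry in entries:
        route = entry.get("route")
        if route is not None:
            ports_by_route[route].add(entry["port"])
    return {
        route: sorted(ports)
        for route, ports in ports_by_route.items()
        if len(ports) > 1
    }


entries = [
    {"port": 4000, "route": "foo.com"},
    {"port": 5000, "route": "foo.com"},
    {"port": 1138, "tcp": True, "incoming_port": 62312},
]
duplicates = routes_on_multiple_ports(entries)
```

A consumer could reject the request when the result is non-empty, or pass it through if load balancing across ports turns out to be the intended behavior.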

fraenkel commented 9 years ago

I think the part that isn't/wasn't clear is that the key cf-router is a hard requirement; the value can be whatever. There seems to be a minimal requirement to provide some correlation information so the route emitter (whatever it is) can do an appropriate job.

onsi commented 9 years ago

@fraenkel is correct. The cf-router key is how the router/route-emitter knows "this is for me". The value for that key can have a schema that Diego doesn't care about.
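
A rough Python sketch of that contract, assuming the value travels as opaque JSON-encoded bytes. The encoding, the function names, and the "other-consumer" key are all assumptions made for illustration; only the cf-router key comes from the discussion.

```python
import json

ROUTER_KEY = "cf-router"


def cc_encode(entries):
    """CC side: serialize the entries under the key the router looks for."""
    return {ROUTER_KEY: json.dumps(entries).encode("utf-8")}


def diego_store(routes):
    """Diego side: copy the key/value pairs without inspecting any value."""
    return dict(routes)


def emitter_decode(routes):
    """Route-emitter side: read only its own key and apply its own schema."""
    raw = routes.get(ROUTER_KEY)
    return [] if raw is None else json.loads(raw.decode("utf-8"))


stored = diego_store({
    **cc_encode([{"port": 4000, "route": "foo.com"}]),
    "other-consumer": b"\x00 not JSON at all",  # hypothetical second consumer
})
decoded = emitter_decode(stored)
```

The second key shows why the untyped value matters: another consumer can store bytes in any format, and neither Diego nor the route-emitter has to understand them.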

@frodenas:

onsi commented 9 years ago

the doc has been updated with these changes

onsi commented 9 years ago

If we're happy with this direction I'll write some stories later today.

onsi commented 9 years ago

We should limit the size of the values in the Routes hash. I've updated the doc with an arbitrary limit of 4K for the value. This value was derived using science.
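
A minimal Python sketch of such a limit check. It assumes "4K" means 4096 bytes and that the limit applies to each value separately; both are readings of the comment above, not details confirmed by the doc, and `oversized_keys` is a hypothetical name.

```python
MAX_ROUTE_VALUE_BYTES = 4 * 1024  # assumption: "4K" read as 4096 bytes per value


def oversized_keys(routes):
    """Return the keys whose value exceeds the per-value size limit."""
    return sorted(
        key for key, value in routes.items()
        if len(value) > MAX_ROUTE_VALUE_BYTES
    )


too_big = oversized_keys({
    "cf-router": b"x" * 4096,       # exactly at the limit: accepted
    "other-consumer": b"x" * 4097,  # one byte over: rejected
})
```

Because the values are opaque to Diego, a byte-length check like this is about the only validation it can apply without knowing the schema.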

onsi commented 9 years ago

Accepted this proposal, added stories: https://github.com/pivotal-cf-experimental/diego-dev-notes/blob/master/accepted_proposals/routing.md