aws / aws-app-mesh-roadmap

AWS App Mesh is a service mesh that you can use with your microservices to manage service to service communication
Apache License 2.0

Feature Request: Running App Mesh outside of docker #131

Open ddelnano opened 5 years ago

ddelnano commented 5 years ago

Tell us about your request

Our current service mesh is agnostic to how workloads are run and integrates nicely with both containerized and non-containerized services. Because of this, we have a large number of instances without Docker installed that use our service mesh. Our service mesh runs outside of Docker even for containerized workloads (it is deployed on every EC2 instance, bound to a network interface that Docker containers can access). As a result, App Mesh's deployment model, which recommends running a Docker container, will not work well for our use case.

Since App Mesh consists of changes on top of the Envoy project that are planned to be upstreamed, it seems that it shouldn't be too difficult to make this possible once that occurs. We already package our own Envoy, so if App Mesh's Envoy configuration were documented, this could probably solve our problem.

Which integration(s) is this request for?

EC2

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?

We are looking to see if App Mesh would work for our use case. It is hard because it would require installing Docker on many instances that we don't want running Docker. It could also mean that we would have to rearchitect how we deploy our service mesh (running container sidecars rather than deploying one proxy per host).

Are you currently working around this issue?

No. We are currently evaluating App Mesh to see if it would work as a replacement for our current service mesh (SmartStack).

dastbe commented 5 years ago

Hey @ddelnano!

We certainly support any compute target that you can throw an Envoy onto, though our documentation outside of ECS/k8s is sparse :(.

And while we haven't done much explicit documentation of our Envoy build/release/configuration, the mechanism for an Envoy connecting to App Mesh is very well standardized. We've upstreamed support for using AWS credentials to authenticate to our ADS endpoint, which will support all conventional forms of credential discovery.

The minimum viable bootstrap configuration you'll need for an Envoy will look like this:

node:
    id: /mesh/$MESH_NAME/virtualNode/$VIRTUAL_NODE_NAME
    cluster: /mesh/$MESH_NAME/virtualNode/$VIRTUAL_NODE_NAME

dynamic_resources:
  # Configure Envoy to get listeners and clusters via GRPC ADS
  ads_config:
    api_type: GRPC
    grpc_services:
      google_grpc:
        target_uri: appmesh-envoy-management.$AWS_REGION.amazonaws.com:443
        stat_prefix: ads
        channel_credentials:
          ssl_credentials:
            root_certs:
              filename: /etc/pki/tls/cert.pem
        credentials_factory_name: envoy.grpc_credentials.aws_iam
        call_credentials:
          from_plugin:
            name: envoy.grpc_credentials.aws_iam
            config:
              region: $AWS_REGION
              service_name: appmesh
  lds_config: {ads: {}}
  cds_config: {ads: {}}
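
The `$MESH_NAME`, `$VIRTUAL_NODE_NAME` and `$AWS_REGION` tokens in the bootstrap above look like placeholders to be filled in before Envoy reads the file. A minimal sketch of one way to do that, assuming the template is saved to a file first; `render_bootstrap` and the file names are hypothetical and not part of App Mesh:

```shell
#!/bin/sh
# Hypothetical helper (not part of App Mesh): replace the $MESH_NAME,
# $VIRTUAL_NODE_NAME and $AWS_REGION placeholders in a template read from
# stdin and write the result to stdout.
# Assumption: the values contain no '|', '&' or backslash characters,
# which are special in the sed replacement used here.
render_bootstrap() {
    mesh="$1"
    node="$2"
    region="$3"
    sed -e 's|\$MESH_NAME|'"$mesh"'|g' \
        -e 's|\$VIRTUAL_NODE_NAME|'"$node"'|g' \
        -e 's|\$AWS_REGION|'"$region"'|g'
}

# Example usage (file names are placeholders):
#   render_bootstrap my-mesh my-node us-west-2 < bootstrap.yaml.tmpl > bootstrap.yaml
#   envoy -c bootstrap.yaml
```

The function only rewrites text, so the rendered file can be inspected before it is handed to Envoy.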

Additionally, if you want to configure iptables to work as they do when running on ECS or k8s, I would explore the following script in the proxy route manager image that we use for deploying to k8s:

docker run -it --rm --entrypoint cat 111345817488.dkr.ecr.us-west-2.amazonaws.com/aws-appmesh-proxy-route-manager:v2 bin/sidecar-proxy-route-manager.sh

ddelnano commented 5 years ago

@dastbe thanks for that context. I did see that script in the docs (although I didn't realize it was that particular one until you pointed it out).

Unfortunately, that iptables script seems to assume we want all the network traffic to route through the Envoy container. This isn't what I'm trying to do, since I'd like most of the network traffic on my EC2 instances to work as is. Can you explain what iptables rules would be necessary, or can I ditch the rules entirely and have the service make requests to Envoy directly? From the docs I linked to above, it seems I want to make a request to port 15000.

# script truncated from https://docs.aws.amazon.com//app-mesh/latest/userguide/appmesh-getting-started.html (should be more or less the same as bin/sidecar-proxy-route-manager.sh)
APPMESH_ENVOY_EGRESS_PORT="15001"
APPMESH_ENVOY_INGRESS_PORT="15000"

I tried to do this but just receive a connection error.

$ curl -H 'Host: serviceb'  localhost:15000/testing
curl: (56) Recv failure: Connection reset by peer

dastbe commented 5 years ago

Let me do some research, but my expectation is that direct calls to Envoy won't work as-is today. In the configuration we vend to Envoy, we first recover the original socket destination (so we can recover information "lost" on the iptables redirect) and then do a port match on that destination. So if your App Mesh service isn't defined at 15000 (the egress routes are actually on 15001, but that doesn't change the core issue), it won't be routed.

Depending on how selective you need to be, you can set APPMESH_EGRESS_IGNORED_IP to a comma-delimited list of IPs or CIDR ranges, and similarly set APPMESH_EGRESS_IGNORED_PORTS to a comma-delimited list of ports.
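
As a rough illustration of what those two variables could translate to, the sketch below prints (rather than runs) iptables rules that return early from the APPMESH_EGRESS chain for the ignored destinations. It is an assumption about how such exclusions are typically expressed, not a copy of the route manager script; `print_ignore_rules` and the example values are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the iptables rules that would exempt the given
# comma-delimited IPs/CIDRs and ports from redirection. Nothing is executed.
# Assumption: exclusions are RETURN rules placed in the APPMESH_EGRESS chain
# before the REDIRECT rule, since iptables evaluates rules in order.
print_ignore_rules() {
    ignored_ips="$1"
    ignored_ports="$2"
    if [ -n "$ignored_ips" ]; then
        # iptables accepts a comma-separated list of addresses for -d
        echo "iptables -t nat -A APPMESH_EGRESS -p tcp -d $ignored_ips -j RETURN"
    fi
    if [ -n "$ignored_ports" ]; then
        # the multiport match accepts a comma-separated list of ports
        echo "iptables -t nat -A APPMESH_EGRESS -p tcp -m multiport --dports $ignored_ports -j RETURN"
    fi
}

# Example usage (values are illustrative only):
#   APPMESH_EGRESS_IGNORED_IP="169.254.169.254,10.0.0.0/8"
#   APPMESH_EGRESS_IGNORED_PORTS="22,3306"
#   print_ignore_rules "$APPMESH_EGRESS_IGNORED_IP" "$APPMESH_EGRESS_IGNORED_PORTS"
```

Printing the rules first makes it easy to review them before applying anything as root.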

If that doesn't work for you, I expect you would be able to do selective iptables routing to the Envoy. Since we're not trying to trap all traffic, I think we can instead define an "Envoy IP" that's configured to route into Envoy. Since the Envoy isn't sending traffic back to this IP, we don't need any of the other rules that ensure Envoy avoids routing back to itself.

NOTE: I've done a little testing with this and it does appear to work, but it is very much caveat emptor.

    # Environment Variables
    APPMESH_ENVOY_EGRESS_PORT=15001
    APPMESH_LOCAL_ROUTE_TABLE_ID="100"
    APPMESH_PACKET_MARK="0x1e7700ce"
    ENVOY_DESTINATION_IP=127.1.33.7

    # Initialization
    iptables -t nat -N APPMESH_EGRESS
    ip rule add fwmark "$APPMESH_PACKET_MARK" lookup $APPMESH_LOCAL_ROUTE_TABLE_ID
    ip route add local default dev lo table $APPMESH_LOCAL_ROUTE_TABLE_ID

    # Redirect all TCP traffic on this chain to the Envoy egress port
    iptables -t nat -A APPMESH_EGRESS \
        -p tcp \
        -j REDIRECT --to-ports $APPMESH_ENVOY_EGRESS_PORT

    # Apply APPMESH_EGRESS chain to all traffic destined for the Envoy's "ip"
    iptables -t nat -A OUTPUT \
        -p tcp \
        -d "$ENVOY_DESTINATION_IP" \
        -j APPMESH_EGRESS

This should make it so that any traffic destined for 127.1.33.7:PORT is redirected to the running Envoy at localhost:15001, at which point Envoy recovers the original destination port and processes the traffic correctly.
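
Since this setup is experimental, it helps to be able to undo it. The sketch below reverses the commands above in the opposite order; it prints each command by default and only executes when `DRY_RUN=0` is set. The `run` and `teardown_envoy_ip_routing` names are hypothetical, and the variable values are copied from the script above.

```shell
#!/bin/sh
# Values copied from the setup script above.
APPMESH_LOCAL_ROUTE_TABLE_ID="100"
APPMESH_PACKET_MARK="0x1e7700ce"
ENVOY_DESTINATION_IP=127.1.33.7

# Hypothetical wrapper: print the command by default; set DRY_RUN=0
# (and run as root) to actually execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: remove the rules in the reverse order of creation.
teardown_envoy_ip_routing() {
    # Detach the chain from OUTPUT, then flush and delete it
    run iptables -t nat -D OUTPUT -p tcp -d "$ENVOY_DESTINATION_IP" -j APPMESH_EGRESS
    run iptables -t nat -F APPMESH_EGRESS
    run iptables -t nat -X APPMESH_EGRESS
    # Remove the policy routing rule and the local route table entry
    run ip rule del fwmark "$APPMESH_PACKET_MARK" lookup "$APPMESH_LOCAL_ROUTE_TABLE_ID"
    run ip route del local default dev lo table "$APPMESH_LOCAL_ROUTE_TABLE_ID"
}

# Example usage:
#   teardown_envoy_ip_routing             # prints the commands
#   DRY_RUN=0 teardown_envoy_ip_routing   # executes them (requires root)
```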

I'll perform more testing in a bit, but wanted to give you my initial stab at what I think would work.

kubrickfr commented 2 months ago

Hi there,

@dastbe, has aws_iam been tested with recent versions of Envoy (1.31)? Is there a new minimum viable bootstrap configuration for Envoy that you would suggest? I have tried adapting your version, but it fails silently and doesn't even make a call to the metadata server (I added logging with nftables to check).

Example:

dynamic_resources:
  # Configure Envoy to get listeners and clusters via GRPC ADS
  ads_config:
    api_type: GRPC
    grpc_services:
      google_grpc:
        target_uri: appmesh-envoy-management.$AWS_REGION.amazonaws.com:443
        stat_prefix: ads
        channel_credentials:
          ssl_credentials:
            root_certs:
              filename: /etc/pki/tls/cert.pem
        credentials_factory_name: envoy.grpc_credentials.aws_iam
        call_credentials:
          from_plugin:
            name: envoy.grpc_credentials.aws_iam
            typed_config:
              "@type": type.googleapis.com/envoy.config.grpc_credential.v3.AwsIamConfig
              service_name: appmesh
              region: eu-west-1

kubrickfr commented 2 months ago

Please ignore my previous message, I was in fact facing this issue: https://github.com/envoyproxy/envoy/issues/35940