tellapart / aurproxy

Load balancer manager with knowledge of Apache Aurora's service discovery mechanism and integration with Aurora's task lifecycle.
Apache License 2.0

Added libpcap-dev dependency #35

Closed afraisse closed 8 years ago

afraisse commented 8 years ago

The `go get` command fails because Go cannot find libpcap. Installing it with apt-get after installing Go fixes the problem.
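For reference, a minimal sketch of the fix inside a Dockerfile, assuming Debian/Ubuntu package names (the exact `go get` target is omitted here; only the ordering matters):

```dockerfile
# Install the libpcap headers before fetching Go dependencies,
# so that cgo can find pcap.h during `go get`.
RUN apt-get update && \
    apt-get install -y libpcap-dev
# ... the existing `go get` step then succeeds
```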

ThanosBaskous commented 8 years ago

👍 Thanks @afraisse!

afraisse commented 8 years ago

Happy to contribute, Thanos!

Aurproxy looks great for my use case (which is deploying applications with inter-dependencies onto Aurora).

I am struggling a bit though and would really value your input, if you have some time :)

I am trying to go through the getting-started tutorial (on an Aurora cluster I deployed on AWS), but it seems that the service discovery announcement in ZK never happens. Have you ever heard of a similar issue?

Here is Aurproxy’s stderr output:

```
2016-05-12 13:55:15,527 [INFO] kazoo.client: Connecting to 172.31.6.48:2181
2016-05-12 13:55:15,537 [INFO] kazoo.client: Zookeeper connection established, state: CONNECTED
2016-05-12 13:55:15,538 [INFO] tellapart.aurproxy.source.sources.serverset.[//aurora//www-data/devel/hello_world]: TellApart ServerSet initializing on path //aurora//www-data/devel/hello_world
2016-05-12 13:55:15,539 [WARNING] tellapart.aurproxy.source.sources.serverset.[//aurora//www-data/devel/hello_world]: Path //aurora//www-data/devel/hello_world does not exist, waiting for it to be created.
2016-05-12 13:55:17,570 [INFO] tellapart.aurproxy.backends.nginx.backend: No update required.
2016-05-12 13:58:17,695 [WARNING] tellapart.aurproxy.source.manager: No endpoints returned by regular sources.
2016-05-12 13:58:17,696 [WARNING] tellapart.aurproxy.source.manager: No endpoints returned by regular sources.
2016-05-12 13:58:17,696 [INFO] tellapart.aurproxy.backends.nginx.backend: No update required.
```

I dug into ZooKeeper’s nodes and indeed couldn’t find any relevant ServerSet.

Thanks for your help.

Adrian

P.S. Also, to get Thermos working properly I had to install libcurl4-nss-dev into the Aurproxy container. It seems like an Aurora issue, though.


ThanosBaskous commented 8 years ago

Hi Adrian, Are you trying to point at a job that is based on the example in this project (examples/hello_world.aur) or at your own job? If it is your own job, can you confirm that it has named ports (the configuration-level rendered command line contains something like "{{thermos.ports[http]}}") and is configured to announce itself into Aurora service discovery (the service definition contains something like "announce = Announcer()")?

See https://github.com/tellapart/aurproxy/blob/master/examples/hello_world.aur for an example.
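The two pieces in question look roughly like this in an .aur file. This is an illustrative sketch in the Aurora DSL, not the exact contents of hello_world.aur; the process, cluster, and role names here are assumptions:

```python
# Sketch of an Aurora job with a named 'http' port in the command line
# and an Announcer() so the task registers in ZooKeeper service discovery.
hello_process = Process(
  name = 'hello_world',
  cmdline = 'python hello_world.py {{thermos.ports[http]}}')

jobs = [
  Service(
    task = SequentialTask(
      processes = [hello_process],
      resources = Resources(cpu = 0.1, ram = 32*MB, disk = 64*MB)),
    cluster = 'devcluster',
    role = 'www-data',
    environment = 'devel',
    name = 'hello_world',
    announce = Announcer())
]
```

Without both the named port and the `announce` stanza, nothing is written under the ZooKeeper path aurproxy watches.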

afraisse commented 8 years ago

Hello Thanos,

Thanks for your reply. I am indeed using examples/hello_world.aur. I figured that my problem was related to the Aurora configuration rather than to Aurproxy, so I asked on the Aurora user list and got this reply:

I hadn’t set the `--announcer_ensemble` flag for the Thermos executor, a step that is already done in the Vagrant demo environment (https://github.com/apache/aurora/blob/master/examples/vagrant/upstart/aurora-scheduler.conf#L45).
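Concretely, the announcer settings are handed to the Thermos executor through the scheduler's flags. A sketch of the relevant scheduler flag, modeled on the Vagrant config linked above (the ZooKeeper address here is an example):

```shell
# Illustrative aurora-scheduler flag: pass announcer options through
# to the Thermos executor so tasks register themselves in ZooKeeper.
-thermos_executor_flags="--announcer-enable --announcer-ensemble 172.31.6.48:2181"
```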

Now everything is working! :)

Btw, did you also experience any issues with libcurl4-nss-dev when running the Aurproxy image?

Cheers, Adrian


ThanosBaskous commented 8 years ago

We have not seen problems with libcurl4-nss-dev. One possibility is that a dependency on it has emerged in the latest version of Aurora (we are currently running one version behind).

ThanosBaskous commented 8 years ago

(Also, glad to hear that you got it working, let me know if you have any further trouble!)

afraisse commented 8 years ago

Hi Thanos,

Thank you for your concern ;) I use Aurora 0.12.0 with Mesos 0.25.0. Maybe the issue with libcurl comes from Mesos itself.

I took some time to dive deeper into Aurproxy before getting back to you. First, I wanted to report an issue with the example aurproxy job from the tutorial:

Because Aurproxy’s service is defined with a "host" constraint, scheduling the job results in status `PENDING • Constraint not satisfied: host`. Indeed, in the .aur file:

```python
base_aurproxy_service_template = Service(
  container = Container(
    docker = Docker(
      image = '{{docker_url}}/{{docker_library}}/{{docker_package}}:{{docker_image_version}}')),
  constraints = {
    'host': 'limit:1'
  })
```

Unfortunately, the Mesos slave process started within Vagrant doesn’t have any attributes, let alone a host attribute (see http://192.168.33.7:8081/slaves). I didn’t make a pull request because I am unsure whether you would rather remove the host constraint from the configuration file or ask users to update the slave's start-up flags in the tutorial.
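If updating the slave's start-up flags turns out to be the right fix, one way to add an attribute would be something like the following (a sketch using the standard `--attributes` flag of the Mesos slave; the attribute value and master address here are hypothetical):

```shell
# Hypothetical: restart the Mesos slave with an explicit attribute
# that Aurora scheduling constraints can match against.
mesos-slave --master=zk://192.168.33.7:2181/mesos \
            --attributes="host:devcluster-slave-1" \
            --work_dir=/var/lib/mesos
```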

This leads me to a bunch of questions.

For context, my company develops open-source application management software for cloud infrastructures (http://alien4cloud.github.io). We use TOSCA to create cloud application blueprints (we call them topologies) and can deploy them onto multiple cloud providers using Cloudify. I would like to use Aurproxy to allow multiple jobs to discover each other (say, for example, an Apache-powered web app interacting with a database), so that we can define TOSCA-based cloud topologies and deploy them onto an Aurora/Mesos cluster, as seamlessly for the user as if they were deployed onto an IaaS.

First, about scaling: do you use only one instance of Aurproxy, or rather one Aurproxy instance for each job you want to proxy to? (From my understanding, only one instance should be set up per slave, which I believe is what your host constraint does, so this could quickly ramp up the number of slaves required in the cluster.)

Second, about host routing: I wanted to be able to serve different jobs through a single port via Aurproxy. I looked at nginx.conf.template, noticed that I can set server names, and used them to that end (see the .aur file attached). It worked perfectly when setting the Host HTTP header, but it goes against your recommendation for Proxy server objects: "ports: Required list of one or more ports to listen on. Shouldn't collide with other ports in use". Is this approach best practice in your opinion?

Thank you for this great piece of software, and sorry for the uber-long mail!

Best regards,

Adrian


ThanosBaskous commented 8 years ago

Hi Adrian,

Sorry for the delay in my response. Thanks for noting the issue - I will test it out when I get a chance and update if necessary.

To answer your questions:

"First, about scaling: do you use only one instance of Aurproxy, or rather one Aurproxy instance for each job you want to proxy to? (From my understanding, only one instance should be set up per slave, which I believe is what your host constraint does, so this could quickly ramp up the number of slaves required in the cluster.)"

We have aurproxy deployed several ways:

  1. For large HTTP services (hundreds of thousands of QPS), we run a cluster of aurproxy nodes per service.
  2. For small HTTP services, we run a small cluster of "common" aurproxy nodes and use HTTP Host Header to route.
  3. For internal proxying of L4 (TCP) services, we run a small cluster of aurproxy nodes with hard-coded, globally distributed ports. Port 9000 -> Service A, Port 9001 -> Service B, etc.

We use Aurora to schedule aurproxy onto a dedicated group of mesos workers. The workers are EC2 instances, and the aurproxy Aurora tasks are sized such that they and only they fit on the EC2 instance. This isn't very elegant, but it has worked for us. If you can find a way to guarantee that your "fixed" ports won't collide, you can run multiple aurproxy instances per host, but it isn't something we've done for any production services.

"Second, about host routing: I wanted to be able to serve different jobs through a single port via Aurproxy. I looked at nginx.conf.template, noticed that I can set server names, and used them to that end (see the .aur file attached). It worked perfectly when setting the Host HTTP header, but it goes against your recommendation for Proxy server objects: 'ports: Required list of one or more ports to listen on. Shouldn't collide with other ports in use'. Is this approach best practice in your opinion?"

As mentioned in (2) above, we run a small aurproxy cluster that does host-level routing for many small jobs.
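That host-level routing boils down to nginx virtual servers sharing one listen port. A minimal hand-written sketch of the idea (server names, upstream names, and addresses here are hypothetical; in practice aurproxy renders this configuration itself from the ServerSets it watches):

```nginx
# Hypothetical upstreams; aurproxy would normally generate these
# from the Aurora ServerSets it is configured with.
upstream service_a { server 10.0.0.11:31005; }
upstream service_b { server 10.0.0.12:31382; }

server {
    listen 8080;
    server_name a.internal.example.com;   # selected by the Host header
    location / { proxy_pass http://service_a; }
}

server {
    listen 8080;                          # same port, different Host
    server_name b.internal.example.com;
    location / { proxy_pass http://service_b; }
}
```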

I hope those answers help! Let me know if I can expand on either.

-Thanos

afraisse commented 8 years ago

Hi Thanos,

Many thanks for your thorough response. Your approaches seem quite efficient, and I think we can pick one for each of our topologies depending on how much we need to scale.

However, as supporting Docker is a strategic objective for my company, our focus has shifted to integrating Marathon into our product, so Aurora is on hold at the moment…

I will use your answer to write guidelines for our next iteration. We will provide feedback on our Aurproxy usage once it’s in production :)

Thanks again

Best regards, Adrian
