Closed mtustin-handy closed 9 years ago
Hello,
I have a similar experience after upgrading to 1.5.2. I downgraded back to 1.5.1 for the moment to fix the issue. My guess would be it is somehow related to one of the following changes:
@pascalknapen That's awesome. I didn't realise either of those changes happened. Do you have any plans to try out reverting either of those changes?
mysqlclient is a fork of mysql-python, so I would not necessarily expect a huge regression there.
Quick question: are you running a single webserver? (We run several of them behind a load balancer, each with 4 workers (= processes) as per the default configuration, which means that regressions on a single server may be harder for us to detect.)
Note that the debug server runs on a single thread, and so might be slower as well. Usually one recommends 2N+1 gunicorn workers for N vCPUs. I have a PR out that exposes the ability to use multithreaded workers. Maybe playing with those options might help?
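To make the 2N+1 rule of thumb concrete, here is a minimal sketch. It is not Airflow's own logic, and `recommended_workers` is a hypothetical helper name used only for illustration.

```python
import multiprocessing


def recommended_workers(vcpus: int) -> int:
    """Return the 2N+1 rule-of-thumb gunicorn worker count for N vCPUs."""
    if vcpus < 1:
        raise ValueError("vcpus must be at least 1")
    return 2 * vcpus + 1


if __name__ == "__main__":
    # cpu_count() reports logical CPUs, which is assumed here to match vCPUs.
    n = multiprocessing.cpu_count()
    print(f"{n} vCPUs -> {recommended_workers(n)} gunicorn workers")
```

This is only a starting point; the right number still depends on how much of each request is spent waiting on the database.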
@artwr I just run the server with airflow webserver. This preforks 4 workers and one master process. We do use a single server on a single host. It's very slow even with only one person accessing it. It's much faster when accessed directly over an SSH tunnel than when accessed over an AWS load balancer.
@mtustin-handy May I ask why you would use a load balancer if it runs on a single node? It seems that this setup would just increase latency, significantly so if the load balancer ends up being in a different availability zone. If you have Chrome, can you take a look at the Network tab of the developer tools when you load the server URL, and post a screenshot of what you see there?
@artwr Like Amazon itself, we use load balancers to make webservers available across network security/availability zones.
@mtustin-handy @artwr This seems indeed to be related to the interplay between gunicorn and ELB. This article explains it a bit: https://forums.aws.amazon.com/thread.jspa?messageID=419138. As the reply suggests, using async workers might fix the problem. I'll try to find out tomorrow.
Another short-term solution might be to bump the number of workers, erroneously named threads in our current config.
Made the changes to support both sync and async workers in gunicorn. I just finished testing it, and it is working great behind ELB now. Will send a pull request later today.
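For anyone who wants to try the sync versus async distinction with gunicorn directly, a generic config sketch might look like the following. This is not the change from the pull request, and how Airflow passes settings to gunicorn depends on the version. The file name, the bind address and the timeout value are assumptions; `workers`, `worker_class`, `bind` and `timeout` are standard gunicorn settings, and the `gevent` worker class needs the gevent package installed.

```python
# gunicorn.conf.py (assumed file name; gunicorn config files are plain Python)
import multiprocessing

# Address and port are placeholders; adjust to wherever the webserver listens.
bind = "0.0.0.0:8080"

# 2N+1 rule of thumb mentioned earlier in the thread.
workers = 2 * multiprocessing.cpu_count() + 1

# An async worker class, so that an idle connection held open by the load
# balancer does not occupy a whole worker the way it does with "sync" workers.
worker_class = "gevent"

# Seconds before a silent worker is killed and restarted (assumed value).
timeout = 120
```

Switching `worker_class` back to `"sync"` makes it easy to compare the two modes behind the load balancer.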
That is so awesome. I've been testing out the new scheduling stuff that @mistercrunch merged yesterday.
@pascalknapen : Does your PR look anything like https://github.com/airbnb/airflow/pull/610 ?
Closed by #618. Feel free to reopen if needed.
This is probably something weird with my environment setup, but I'm opening this in case anyone has any ideas as to what might cause this.
Recently (like, since yesterday afternoon) the airflow webserver has been very, very slow. This persists no matter how many times I restart. It's particularly slow when visited via an AWS load balancer, moderately slow over an SSH port forward, and also very slow via local curl.
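To put numbers on the three access paths, a small timing script along these lines could help. It is only a sketch: `time_get` and `compare` are hypothetical helper names, and the URLs in the usage section are placeholders, not my real endpoints.

```python
import time
import urllib.request
from typing import Dict


def time_get(url: str, timeout: float = 30.0) -> float:
    """Return the seconds taken to fetch and fully read the given URL."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        resp.read()
    return time.perf_counter() - start


def compare(urls: Dict[str, str], timeout: float = 30.0) -> Dict[str, float]:
    """Time each labelled URL once; failed requests are reported and skipped."""
    results = {}
    for label, url in urls.items():
        try:
            results[label] = time_get(url, timeout)
        except OSError as exc:  # URLError and socket timeouts are OSErrors
            print(f"{label}: request failed ({exc})")
    return results


if __name__ == "__main__":
    # Placeholder URLs: replace with the local, tunnelled and load-balanced ones.
    paths = {
        "local": "http://localhost:8080/",
        "ssh tunnel": "http://localhost:18080/",
        "load balancer": "http://airflow.example.com/",
    }
    for label, seconds in compare(paths).items():
        print(f"{label}: {seconds:.2f}s")
```

Running it a few times per path gives a rough sense of how much of the delay comes from the load balancer rather than from the webserver itself.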
The system is not under particularly high load and is able to contact the database server, and the database server is responding promptly.
Running on Red Hat 7.
I'd add more details if I knew what was germane.