pushcx opened this issue 6 years ago

I was in `production.log` this morning as I banned a returning spammer, and decided to check on a hunch that's been growing in my mind about the traffic level that determines the color of the site logo (see `ApplicationController#increase_traffic_counter`):

[log excerpt of recent traffic intensity values]

This is not a useful distribution; we're spending nearly all day at 97-100%. Weekday traffic follows the American workday, with pretty sharp divisions as the east coast wakes up and the west coast falls asleep. The intensity algorithm doesn't reflect this, let alone account for days where traffic is genuinely higher than others because we got twitter/yc news sending a flood our way. Maybe this should even be based only on logged-in users, to spare the db hit on every visit? Maybe there are thoughts in the git history?

I haven't thought at all about a better approach, just wanted to toss this up in the hopes someone wants a fun puzzle.

Quick graph of how extreme this distribution is: [graph image]
To be explicit: I think skipping the SELECT entirely and doing just an UPDATE of the current traffic that RETURNs the new value, pushing all the work to the db, is probably a mandatory element of this. It's OK to drop down to raw queries for such a hot-path piece of code that's called on every hit!
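A minimal sketch of that shape, assuming a Postgres-style RETURNING clause and a `keystores(key, value)` table behind the Keystore model; the key name is made up, and MySQL would need an equivalent idiom since it lacks UPDATE ... RETURNING:

```ruby
# Single round trip: bump the counter and read the new value in one query.
def increment_traffic!
  result = ActiveRecord::Base.connection.exec_query(<<~SQL)
    UPDATE keystores
       SET value = value + 1
     WHERE key = 'traffic:counter'
    RETURNING value
  SQL
  result.first["value"]
end
```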
I bet if you colored the icon by log10(traffic) instead of just the raw traffic number you'd get a more evenly distributed curve.
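For illustration, a log-scaled mapping might look like this; the min/max bounds are invented and would need tuning against real traffic numbers:

```ruby
# Map a raw hit count onto 0-100 on a log10 scale; bounds are guesses.
def log_intensity(hits, min_hits: 10, max_hits: 100_000)
  lo, hi = Math.log10(min_hits), Math.log10(max_hits)
  pos = Math.log10(hits.clamp(min_hits, max_hits))
  (100.0 * (pos - lo) / (hi - lo)).round
end
```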
I have noticed the traffic graph is almost always the same, between ~97 and 100. Given the distribution, you could decile by frequency such that each bucket has the same number of observations. That will have the effect of putting the outliers in decile 10 and otherwise uniformly distributing the traffic.
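A sketch of that frequency-based bucketing, where `history` stands in for a hypothetical sample of past traffic readings:

```ruby
# Rank the current reading against past observations so each decile holds
# roughly the same number of them; outliers naturally land in decile 10.
def frequency_decile(current, history)
  rank = history.count { |v| v <= current }
  ((10.0 * rank / history.size).ceil).clamp(1, 10)
end
```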
Hey all, long time lurker, first time contributor. Saw the "good first issue" tag on this and thought it might be fun to take a crack at it.
I'm thinking a possible solution would be to compare the traffic in the last arbitrary time period (maybe an hour) to the one before that.
The DB would have 4 keys:

- `current_period_expiration`: time when the current period expires
- `current_period_traffic`: counter for how much traffic is coming in this period
- `last_period_traffic`: counter for how much traffic there was last period
- `traffic_intensity`: intensity based on the relationship between traffic from the current and last periods at time of expiration
On each request the server will:

1. Check if the current period is expired. If it is, it will:
   1a) compare `last_period_traffic` to `current_period_traffic` to calculate `traffic_intensity`
   1b) set `last_period_traffic` to `current_period_traffic`
   1c) increase `current_period_expiration` by 1 period
2. Increment `current_period_traffic` by 1
3. Call `set_traffic_style` with `traffic_intensity`
Steps 1 and 2 can be skipped if user agent is a bot or if server is in read only mode. We can also do random sampling for step 2 to reduce the number of writes, so maybe only 1 out of every 100 requests increments the counter.
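A rough sketch of steps 1 and 2 of that flow, with the Keystore calls and the intensity formula standing in as placeholders (how to derive a good 0-100 figure is the open question below):

```ruby
# Sketch only: the four-key scheme above, ignoring the locking question.
def increase_traffic_counter
  now = Time.now.to_i
  if now >= Keystore.value_for("traffic:current_period_expiration").to_i
    last    = Keystore.value_for("traffic:last_period_traffic").to_f
    current = Keystore.value_for("traffic:current_period_traffic").to_f
    # 1a) one possible formula: current period's share of the two periods
    intensity = (last + current).zero? ? 50 : (100.0 * current / (last + current)).round
    Keystore.put("traffic:traffic_intensity", intensity)
    # 1b) and 1c): roll the window forward by one period (an hour here)
    Keystore.put("traffic:last_period_traffic", current)
    Keystore.put("traffic:current_period_traffic", 0)
    Keystore.put("traffic:current_period_expiration", now + 3600)
  end
  # 2) sampled increment: only ~1 in 100 requests takes the write
  Keystore.increment_value_for("traffic:current_period_traffic") if rand(100).zero?
end
```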
Things I would need input/help on: is it a problem if we only sample increments to the `current_period_traffic` counter? (Is there another reason we may want it to increase every time?)

Step 1 is going to need a read lock to avoid races, which is pretty expensive. Why would we compare one hour to the previous one? How do we derive a 0-100 `traffic_intensity` from two data points, and would that cause a discontinuity at the boundary of the hour?
Random sampling in step 2 is a great idea.
I would rather see a moving or rolling average for traffic instead of quantized time periods, myself. You'd be able to determine which direction traffic was going, though on its own we'd still have the problem of how extreme the distribution is.
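For example, an exponential moving average needs only one stored value and has no period boundaries; the smoothing factor here is a made-up starting point:

```ruby
# Each reading nudges the stored average instead of resetting a window;
# smaller ALPHA means smoother, slower-moving output.
ALPHA = 0.1
def update_average(stored_avg, new_reading)
  stored_avg + ALPHA * (new_reading - stored_avg)
end
```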
Good point about the read lock; it might be too expensive if we have short periods. My thought was that it would just give a quick-and-dirty figure based on how much traffic increased or decreased from the previous period. It would result in a delay of one period, so maybe the periods should be shorter than an hour.
I guess I should take a step back and ask whether you guys already have some sort of metrics monitoring going on. Ideally, if you had something like Prometheus already running, we could just ask it where the current period ranks in relation to previous ones.
We don't have Prometheus. The full tech stack is over in the ansible repo: https://github.com/lobsters/lobsters-ansible
...which points to a much better shape of solution, like a cron job that runs every 10m and greps the nginx log for 'yyyy-mm-dd hh:m.:..', doing a single insert.
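Something in that spirit, sketched as a small script cron could run every 10 minutes; the log path, timestamp format, and output are assumptions, and a real version would do the single insert at the end:

```ruby
#!/usr/bin/env ruby
# Count log lines stamped inside the current 10-minute bucket, i.e. the
# 'yyyy-mm-dd hh:m' prefix with the last minute digit dropped.
bucket = Time.now.strftime("%Y-%m-%d %H:%M")[0..-2] # "2018-06-01 14:3" covers :30-:39
hits = File.foreach("/var/log/nginx/access.log").count { |line| line.include?(bucket) }
puts "#{hits} hits in bucket #{bucket}*" # a real version would INSERT this row here
```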
Awesome, that's more my speed. What's the deal with Elasticsearch? I see it installed via ansible but I'm not finding any references to it in the application. Would it be useful to have nginx logs in ES? If so, a cron job could run queries directly against it.

EDIT: Looking at the ansible repo, it looks like there was some talk about installing netdata, which would work great too. Is that option still on the table? I wouldn't mind giving it a shot.
zenzora, your suggestion of Prometheus (or, I presume, some equivalent tool) is solid, and it makes pushcx's suggestion of running cron an effective means of getting the same information.
You'll find the ElasticSearch code in https://github.com/lobsters/lobsters/pull/579
Adding this middleware has some ongoing maintenance issues that I'm still sorting out. I am anxious to have it done but haven't made the time yet.
Before you go too far down this road: I have no idea what Prometheus/netdata are, and I would like to hear the case for adding a moving part to our deployment over Keystore/grepping logs.
Hey Pushcx,
Prometheus is a metrics-gathering tool with a time-series db, which might be overkill considering that your setup doesn't consist of that much infrastructure. I was only asking in case you had something similar already set up.
Netdata is a lightweight application that gives you easy monitoring of servers with integrated dashboards; it also contains a time-series DB and an API that we could poll to get the relevant info. The discussion I'm referring to is here:
https://github.com/lobsters/lobsters-ansible/issues/17
I'll hang around the #lobsters IRC channel if you want to discuss.
Took a stab at this in cc7e535. I'm going to leave this issue open a week or so to see if it feels good.