elastic / kibana

Your window into the Elastic Stack
https://www.elastic.co/products/kibana

Average _count per interval (eg minute) metric #4646

Closed: n3ziniuka5 closed this issue 7 years ago

n3ziniuka5 commented 9 years ago

I have an index in which every document represents a request. I want a metric that would show requests per minute to measure how busy the site is.

Currently I have a count metric with the query @timestamp: [now-1m TO now], which shows the number of requests received during the last minute.

However, I want to use Kibana's built-in time picker (top-right corner). So if I select Last 15 minutes, it should get a count of documents indexed during the past 15 minutes and divide that by 15. Similarly, if I select Last 1 hour, it should divide it by 60.

Is it possible to accomplish that with the current stable version of Kibana? If not, do you think it would be a good feature to add?
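
For reference, the requested number is straightforward to compute client-side. A minimal sketch with the official Python client; the index name "requests" and the hard-coded 15-minute window are stand-ins for whatever the time picker would supply:

```python
from datetime import datetime, timedelta, timezone

from elasticsearch import Elasticsearch

es = Elasticsearch()

# Emulate "Last 15 minutes" from the time picker.
end = datetime.now(timezone.utc)
start = end - timedelta(minutes=15)

resp = es.count(index="requests", body={
    "query": {
        "range": {"@timestamp": {"gte": start.isoformat(), "lt": end.isoformat()}}
    }
})

minutes = (end - start).total_seconds() / 60
print("requests per minute: %.2f" % (resp["count"] / minutes))
```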

markwalkom commented 9 years ago

You can do this with a metric visualisation and the average calculation on the field.

n3ziniuka5 commented 9 years ago

@markwalkom what field should I calculate the average on? I am not sure you understood my question. I would like a metric that shows documents indexed per minute.

My index mapping:

```json
{
  "url": "string",
  "method": "string",
  "body": "string"
}
```

So if I selected Last 15 minutes in the upper right corner and there were a total of 100 documents indexed over the past 15 minutes, the metric should show 6.67 (100 / 15).

markwalkom commented 9 years ago

I understand, and as I said there is an average metric you can use to provide this. As to what field you want to calculate that on, that is up to you.

Here's an example of it in a visualisation: [screenshot: metric visualisation using the average aggregation, 2015-08-12]

Then that average is calculated on the timespan you pick.

markwalkom commented 9 years ago

https://www.elastic.co/guide/en/kibana/current/metric-chart.html also has information.

n3ziniuka5 commented 9 years ago

But that is not what I need. I need a Count metric, divided by the amount of minutes in a selected period. That would give me the amount of documents indexed per minute. It is not an average of any field.

@spalger, @rashidkpc, @palecur or @lukasolson, please comment if you understand the issue I am trying to describe. Thanks.

rashidkpc commented 9 years ago

You want the average _count per minute for the selected time period.

I understand what you're trying to accomplish; currently there is no way to do it. This probably falls under "metric math", but you also want it pegged to the time picker, which is a bit more challenging.

rashidkpc commented 9 years ago

Now that I've renamed the title, this probably isn't that hard to implement if it were limited to the _count metric alone.

Really you'd want a "scale to X" on any _count metric, where it takes the number and divides or multiplies it such that it reflects the _count per X.
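
The "scale to X" idea is just a multiplication by the ratio of the target interval to the bucket interval. A minimal sketch of that arithmetic (the function name and intervals are illustrative, not anything Kibana exposes):

```python
def scale_count(count, bucket_interval_s, target_interval_s):
    """Rescale a bucket's document count so it reads as count per target interval.

    E.g. 600 docs in a 10-minute bucket, scaled to a 1-second target,
    reads as 1.0 doc per second.
    """
    return count * (target_interval_s / bucket_interval_s)

# The example from earlier in the thread: 100 docs over 15 minutes,
# scaled to a per-minute rate, gives 100 / 15 = 6.67.
assert scale_count(600, bucket_interval_s=600, target_interval_s=1) == 1.0
assert abs(scale_count(100, bucket_interval_s=900, target_interval_s=60) - 100 / 15) < 1e-9
```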

pemontto commented 9 years ago

:+1: I would find this very useful; on top of queries, this can offer a lot of flexibility.

GrahamHannington commented 9 years ago

:thumbsup:

My use case (in brief: "me too"):

I already have a Kibana 4 metric visualization that displays a count aggregation: that is, the number of Elasticsearch documents in the current time range (in the specified index pattern).

In my case, each Elasticsearch document represents a transaction. So the count represents the number of transactions in the time range. That's useful (thanks!). So far, so good.

Now, independent of the time range, I want metric visualizations that display "average count per second" and "average count per day": "average count per [arbitrary interval]" would be nice.

So that I can compare, for example, the average number of transactions per second (TPS) in the last 30 days with the average TPS yesterday (or any other arbitrary time ranges, of same or different duration); same dashboard, same visualization, just changing the time range.

GrahamHannington commented 9 years ago

Similar use case, different visualization: I wanted to create a line chart of transactions per second (TPS), which, as described in my previous comment, is a count of documents (in the specified index pattern) per second.

Aware that angels would be standing back, lighting cigarettes, and placing bets on my likelihood of survival, I created a visualization with the following details:

[screenshot: kibana_tps visualization configuration]

with a time range of... 30 days. Yes, understandably: kaboom (30 days of 1-second buckets is roughly 2.6 million buckets). My Firefox browser hung, then crashed.

With a much smaller time range - a manageable number of buckets - the chart displays with no problem. But I want to chart TPS across arbitrarily wide time ranges, which is where an "average count per second" would be useful. I understand that, at a time range of a few seconds or less, TPS becomes less useful, but I'm okay with that, as long as this combination of metric and time range doesn't crash my browser (or Kibana).

That was actually my second attempt. On my first attempt - with the same time range of 30 days - I specified an interval of "Second" rather than "Auto", and omitted the JSON (which I copied from #4459). On that attempt, Kibana displayed an information icon with the tooltip "This interval creates too many buckets..." and adjusted the interval; hence my second attempt, which "manually" constrained the interval to 1s.

skundrik commented 9 years ago

ES 2.0 pipeline aggregations should be able to help, but I think they are not yet supported in Kibana.
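
Since ES 2.0 this particular case can indeed be expressed as a pipeline aggregation: a date_histogram plus an avg_bucket over each bucket's _count. A hedged sketch using the low-level Python client (the index name and time range are assumptions):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch()

# Average per-minute document count over the last hour, computed by ES itself:
# bucket by minute, then average the buckets' _count values.
# Note: "interval" is the ES 2.x-era parameter name for date_histogram.
resp = es.search(index="requests", body={
    "size": 0,
    "query": {"range": {"@timestamp": {"gte": "now-1h", "lt": "now"}}},
    "aggs": {
        "per_minute": {
            "date_histogram": {"field": "@timestamp", "interval": "1m"}
        },
        "avg_per_minute": {
            "avg_bucket": {"buckets_path": "per_minute>_count"}
        }
    }
})

print(resp["aggregations"]["avg_per_minute"]["value"])
```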

SebastiaanKlippert commented 9 years ago

+1 for this, I was just trying the exact same thing and was surprised I could not find a way to do it. This would be a very useful feature.

paulfeaviour commented 9 years ago

+1 for this also.

grycuk commented 9 years ago

Yes, I agree. I have a similar use case, and this would be a very useful feature to have.

lgraf commented 8 years ago

I would love this feature +1!

boupetch commented 8 years ago

+1

routerfixer commented 8 years ago

+1. Count / time period in seconds and sum of a field / time period in seconds would be really useful for lots of the data we have. It would be even better to have it human-readable.

pagenbag commented 8 years ago

+1 (but please, not only for _count). Specifically, sum(bytes) / interval = "bandwidth usage"... I've been holding off on my request for a "bits formatter".

This is 100% essential for me, which is why I had to hack it into a Kibana fork. I simply added a "draw as ratio to interval" option, and when the option is checked, I do some calculations on the data before I plot them. [screenshot: the "draw as ratio to interval" option]

The problem I had was when I zoomed in to an interval less than my indexing interval (of 5 min), which then, although mathematically correct, destroyed the expected result. That is why I also had to add an option to define the minimum (indexing) interval.

Other than that, my biggest problem was just carrying the calculated values through to the tooltips, but that's probably because I've never used Angular before and I didn't really understand how all of the Kibana code fit together.

So if I could do it, it really shouldn't be all that difficult :smile: Having said that, it's probably better to have proper scripted/calculated metrics, as long as interval is one of the available fields to calculate with.
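
The bandwidth case is the same scaling idea applied to a sum instead of a count. A small sketch of the math behind a "draw as ratio to interval" option, assuming 5-minute buckets matching the indexing interval:

```python
BUCKET_SECONDS = 300  # assumed: 5-minute buckets, matching the indexing interval

def bits_per_second(sum_bytes, bucket_seconds=BUCKET_SECONDS):
    """Turn a per-bucket sum(bytes) into a bits/s rate."""
    return sum_bytes * 8 / bucket_seconds

# 150 MB of traffic in one 5-minute bucket reads as 4 Mbit/s.
print(bits_per_second(150_000_000))  # 4000000.0
```

This also shows why zooming below the indexing interval breaks the result: buckets shorter than 5 minutes alternate between one document's full byte count and zero, so the per-second rate swings wildly unless the bucket length is clamped to the indexing interval.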

SjonHortensius commented 8 years ago

@strahdza is your hacked fork public? I understand the code might be ugly but if it's functional others might be interested in it as well.

RobertLukan commented 8 years ago

Would love this one. I am setting up a NetFlow collector and am not able to graph bandwidth usage (bits/s).

luxifr commented 8 years ago

+1, especially with the tag "low fruit"... IMHO this is a huge regression from Kibana 3.

venkykuberan commented 8 years ago

Along similar lines, I want to build a data table with transactions per second (docs indexed per second) against transaction names. I am running a load test with different transactions, all going to ES along with the fields test_id, transactionName and responsetimes. I need to find the start and end time of the test based on the test_id and calculate TPS for each transaction. Note: I am able to plot the line chart for TPS, but I'm stuck on the Data Table.
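
One way to get those numbers without a new Kibana feature is a terms aggregation on transactionName with min/max on @timestamp, dividing the doc count by the observed duration client-side. A hedged sketch; the index name "loadtest" and the test id are hypothetical:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch()

resp = es.search(index="loadtest", body={
    "size": 0,
    "query": {"term": {"test_id": "run-42"}},  # hypothetical test id
    "aggs": {
        "by_txn": {
            "terms": {"field": "transactionName", "size": 50},
            "aggs": {
                "first": {"min": {"field": "@timestamp"}},
                "last": {"max": {"field": "@timestamp"}}
            }
        }
    }
})

# min/max on a date field come back as epoch milliseconds in "value".
for b in resp["aggregations"]["by_txn"]["buckets"]:
    duration_s = (b["last"]["value"] - b["first"]["value"]) / 1000.0
    if duration_s > 0:
        print(b["key"], b["doc_count"] / duration_s)  # TPS per transaction
```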

vijaydodla commented 8 years ago

Along the same lines, a way to add some math using other scripted fields would boost the power of Kibana. For example, for a 15-minute interval we might need to get the average count per interval and then multiply the outcome by a scripted field. Here is my math across a set of documents based on a 15-minute time interval: (minutes occupied / 15) * (parking spaces occupied / total parking spaces in the group). I'm trying this in Kibana with no success yet. Each document has minutes occupied, parking spaces occupied and total parking spaces in a group; from these details I'm trying to get occupancy per time interval.

nedmax commented 8 years ago

+1

darioatbashton commented 8 years ago

+1

venkykuberan commented 8 years ago

As a workaround, I was able to calculate the TPS in a Python script using the elasticsearch-dsl-py library and post the metric back to ES to show up in Kibana.
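
For anyone looking for the same workaround, a minimal sketch of the approach (plain elasticsearch-py rather than elasticsearch-dsl-py; the index names are illustrative):

```python
from datetime import datetime, timezone

from elasticsearch import Elasticsearch

es = Elasticsearch()

# Count the documents indexed in the last minute...
count = es.count(index="requests", body={
    "query": {"range": {"@timestamp": {"gte": "now-1m", "lt": "now"}}}
})["count"]

# ...and write the derived rate back to ES, where a plain Kibana metric
# or line chart can display it directly. doc_type matches the ES 2.x era.
es.index(index="derived-metrics", doc_type="tps", body={
    "@timestamp": datetime.now(timezone.utc).isoformat(),
    "tps": count / 60.0,
})
```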

Ry-K commented 8 years ago

+1 Would like to be able to implement the "special" _count path for buckets_path as documented here: https://goo.gl/zDzZWX

This is under pipeline aggs, so I'm not sure this is supported (yet), but hopefully it will be when that feature is worked on: https://goo.gl/9gmfdo
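
For illustration, the special path in question lets a pipeline aggregation read the parent histogram's document count directly; a hedged sketch (index name assumed):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch()

# A derivative over each minute's document count, referenced through the
# special "_count" buckets_path rather than a named metric.
resp = es.search(index="requests", body={
    "size": 0,
    "aggs": {
        "per_minute": {
            "date_histogram": {"field": "@timestamp", "interval": "1m"},
            "aggs": {
                "count_change": {"derivative": {"buckets_path": "_count"}}
            }
        }
    }
})
```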

cdahlqvist commented 8 years ago

@rashidkpc implemented this type of functionality in a function for Timelion during Elastic{ON}: https://github.com/elastic/timelion/blob/master/series_functions/scale_interval.js

It would be very useful to also have this for Kibana.

darioatbashton commented 8 years ago

I'm also doing this with Timelion, but I'd like to have a single metric value showing the overall average in Kibana.

kjelle commented 8 years ago

+1

rschmidtke commented 8 years ago

+1

vnandha commented 8 years ago

+1

samuraiii commented 8 years ago

Count me in as well, +1.

Eilyre commented 8 years ago

Hello,

Having a similar problem. I'm trying to set up an ELK stack for three different HPC clusters. Each has a different number of nodes, but the total is about 350. My idea was to track the total CPU usage of every cluster, so I have to sum the load.load1 fields of every node, grouped into buckets by hostname.

Each node has Topbeat running on it, reporting the stats back every 1 minute, so by setting the interval to per-minute, I did achieve what I wanted.

The problem is with larger intervals, as then each bucket contains several Topbeat reports per node, which Kibana happily sums up. But forcing a per-minute interval via JSON on even a daily graph makes it quite unreadable.

So +1 for the feature.
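
The double counting can be avoided server-side by averaging load.load1 per host within each time bucket and then summing the per-host averages with a sum_bucket pipeline aggregation. A hedged sketch; the index pattern, hostname field and 1-hour interval are assumptions:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch()

resp = es.search(index="topbeat-*", body={
    "size": 0,
    "aggs": {
        "per_interval": {
            "date_histogram": {"field": "@timestamp", "interval": "1h"},
            "aggs": {
                # One bucket per node; the avg collapses however many Topbeat
                # reports landed in this time bucket into a single value.
                "per_host": {
                    "terms": {"field": "hostname", "size": 400},
                    "aggs": {"avg_load": {"avg": {"field": "load.load1"}}}
                },
                # Total cluster load = sum of the per-host averages.
                "cluster_load": {"sum_bucket": {"buckets_path": "per_host>avg_load"}}
            }
        }
    }
})
```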

mayoung3 commented 8 years ago

+1

kicchy commented 8 years ago

+1

pierky commented 8 years ago

+1

job commented 8 years ago

+1

loopodoopo commented 8 years ago

+1

goya commented 8 years ago

+1

foragerr commented 8 years ago

+1

apegam-pv commented 8 years ago

+1

doingitbigdata commented 8 years ago

+1

phfora commented 8 years ago

+1

humpalum commented 8 years ago

+1

brian-springer commented 8 years ago

+1

mjtalbot commented 8 years ago

+1

mrjameshamilton commented 8 years ago

+1

ghost commented 8 years ago

+1

agodika commented 8 years ago

+1