frenchbread closed this pull request 7 years ago
Please 👀
@Nazarah Can you comment from the UX point of view?
@frenchbread can you give us some clearly defined metrics here? E.g. how many records is the dashboard loading? How much time (in milliseconds) do the following stages take (basically, what does the Chrome timeline contain)?
Also, what do the performance metrics look like with 10x records? 100x? A rough estimate is fine, so we can get an idea of how this would scale.
The server-side request to ES takes around 43-63 milliseconds. On the client side, the total page load takes around 2-2.5 seconds, and the actual rendering around 300 milliseconds.
Excellent, and how many records are you retrieving? That way we can figure out the relationship between the number of records and performance time.
As of now the dataset that goes to the charts looks like this:
So it does not really matter how many items there are, since they are not grouped "manually" (by dc.js) but are pre-aggregated. The only load I can think of would come from selecting the maximum possible date range, e.g. years or decades. In that case the list of records will grow. Right now, by default, around 30 days (depending on the month) are shown.
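As a rough illustration of that pre-aggregation, the server can ask Elasticsearch for daily buckets directly instead of fetching raw records. This is only a sketch: the index, field, and aggregation names below are assumptions for illustration, not the actual code in this PR.

```javascript
// Sketch of a pre-aggregating Elasticsearch query body.
// Field and aggregation names are illustrative assumptions.
function buildHistogramQuery(fromMs, toMs, interval = 'day') {
  return {
    size: 0, // we only need the aggregation buckets, not the raw hits
    query: {
      range: { timestamp: { gte: fromMs, lte: toMs } }
    },
    aggs: {
      connectionsOverTime: {
        date_histogram: { field: 'timestamp', interval }
      }
    }
  };
}

const body = buildHistogramQuery(1483308000000, 1488492000000);
console.log(body.aggs.connectionsOverTime.date_histogram.interval); // prints: day
```

Because `size: 0` skips the raw hits, the response size stays roughly constant regardless of how many underlying records fall into the buckets.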
I can go ahead & populate ES with more sample data, say for the last 2+ years, and compare the results.
Cool, yes. Keep in mind that the aggregation step takes time too.
@brylie
> Keep in mind that the aggregation step takes time too.
This is what "server-side" metric measures.
Here is a range for all the data in ES (1st Jan - 3rd Mar): 3069 items total. It also takes 40-60 ms to request the data.
```json
[{"key":"MQT connections over time","values":[{"x":1483308000000,"y":33},{"x":1483394400000,"y":107},{"x":1483480800000,"y":116},{"x":1483567200000,"y":46},{"x":1483653600000,"y":35},{"x":1483740000000,"y":72},{"x":1483826400000,"y":88},{"x":1483912800000,"y":68},{"x":1483999200000,"y":32},{"x":1484085600000,"y":48},{"x":1484172000000,"y":115},{"x":1484258400000,"y":47},{"x":1484344800000,"y":88},{"x":1484431200000,"y":20},{"x":1484517600000,"y":108},{"x":1484604000000,"y":41},{"x":1484690400000,"y":52},{"x":1484776800000,"y":101},{"x":1484863200000,"y":60},{"x":1484949600000,"y":96},{"x":1485036000000,"y":65},{"x":1485122400000,"y":42},{"x":1485208800000,"y":118},{"x":1485295200000,"y":106},{"x":1485381600000,"y":50},{"x":1485468000000,"y":92},{"x":1485554400000,"y":101},{"x":1485640800000,"y":93},{"x":1485727200000,"y":94},{"x":1485813600000,"y":43},{"x":1485900000000,"y":116},{"x":1485986400000,"y":24},{"x":1486072800000,"y":105},{"x":1486159200000,"y":80},{"x":1486245600000,"y":29},{"x":1486332000000,"y":72},{"x":1486418400000,"y":52},{"x":1486504800000,"y":116},{"x":1486591200000,"y":40},{"x":1486677600000,"y":67},{"x":1486764000000,"y":27},{"x":1486850400000,"y":86},{"x":1486936800000,"y":67},{"x":1487023200000,"y":0},{"x":1487109600000,"y":0},{"x":1487196000000,"y":0},{"x":1487282400000,"y":0},{"x":1487368800000,"y":0},{"x":1487455200000,"y":0},{"x":1487541600000,"y":0},{"x":1487628000000,"y":0},{"x":1487714400000,"y":0},{"x":1487800800000,"y":0},{"x":1487887200000,"y":0},{"x":1487973600000,"y":0},{"x":1488060000000,"y":0},{"x":1488146400000,"y":0},{"x":1488232800000,"y":0},{"x":1488319200000,"y":0},{"x":1488405600000,"y":0},{"x":1488492000000,"y":11}]}]
```
Here is a revised diagram:
From your understanding, what is the point I am trying to make here?
Lets conduct a couple more server-side aggregation experiments, as mentioned above:
That way we can get more insight into how the aggregation step might vary as a function of the number of records.
> From your understanding, what is the point I am trying to make here?
@brylie Isn't it "measuring the time aggregation takes"? Since aggregation is done on the Elasticsearch side, the search execution metric is exposed in the response object (3 milliseconds in this case):
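For reference, Elasticsearch reports the server-side search execution time in the `took` field of its response (in milliseconds). A minimal sketch with a stubbed response shaped like Elasticsearch's output:

```javascript
// `took` is the server-side execution time of the search, in ms.
function searchTookMs(response) {
  return response.took;
}

// Stubbed response for illustration, mimicking the ES response shape:
const response = { took: 3, timed_out: false, hits: { total: 3069, hits: [] } };
console.log(searchTookMs(response)); // 3
```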
I've modified the server code and moved the Elasticsearch client initialisation out of the Meteor method; this saves around 20 more milliseconds.
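A minimal sketch of that optimisation pattern: construct the client once and reuse it across calls, rather than inside the Meteor method on every request. Here `createClient` stands in for the actual `new elasticsearch.Client(...)` construction; all names are illustrative.

```javascript
// Build the client lazily on first use, then reuse the same instance,
// so per-call setup cost (~20 ms here) is paid only once.
function makeClientProvider(createClient) {
  let client = null;
  return () => {
    if (client === null) client = createClient(); // setup happens once
    return client;
  };
}

// Usage: every method call gets the same instance.
let constructions = 0;
const getClient = makeClientProvider(() => ({ id: ++constructions }));
getClient();
getClient();
console.log(constructions); // 1
```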
> Since aggregation is done on elasticsearch side, search execution metric is exposed in the response object
Thanks for the example. 3ms aggregation with 3,000 records is quite fast. Can we try a couple of experiments with larger data, say 10,000 or 50,000 record aggregation?
@brylie I'm on it. 🚀
Generated sample data & tested request/aggregation time; collected metrics & saved them to `docs/request-aggregation-metrics.csv`.
As seen in the CSV file, growing the amount of analytics data for a fixed number of days (in this case 66 days, ~2 months) does not slow down request/aggregation time. In fact, the numbers show the requests getting faster:
| Items | Server -> ES request (ms) | ES search execution (ms) |
|---|---|---|
| 13k | 42.9 | 3.9 |
| 20k | 30 | 3.7 |
| 50k | 28.7 | 2.3 |
I suppose the metrics provided are enough for this PR. I've created related issue #8 for testing with wider date ranges (a larger number of days).
@frenchbread: great work. I really like the simple look of the chart here. A few improvement suggestions:
A few queries
To me, the client-side monitoring visualization graph looks more attention-grabbing and interesting: https://cloud.githubusercontent.com/assets/2122679/23654223/5d310890-0338-11e7-8aec-5ba59a9ca50e.png Can we somehow incorporate this sort of visualization with EMQTT?
@Nazarah Thanks for your comments. I've updated the definition of done.
> Are we considering any hourly/daily/weekly visualization of data here?
Yes, it is on the way.
> From the EMQTT schema, which metrics would we be considering to show in the visualization as part of API usage monitoring?
That is still not decided.
> Can we somehow incorporate this sort of visualizations with EMQTT?
That's a good idea. What exactly are you referring to?
We could actually divide the current (general overview) chart that we have into multiple smaller ones, where each chart would represent usage for a specific log type (e.g. `on_client_connected`, `on_client_subscribed` etc.) and color them differently.
How does that sound? @bajiat @brylie @Nazarah
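A rough sketch of how that split could work on the data side, assuming each record carries a log-type field. The record shape and field names below are assumptions for illustration, not the actual schema.

```javascript
// Sketch: split a flat list of records into one series per log type,
// so each series can feed its own smaller, differently colored chart.
function seriesPerLogType(records) {
  const byType = new Map();
  for (const r of records) {
    if (!byType.has(r.type)) byType.set(r.type, []);
    byType.get(r.type).push({ x: r.timestamp, y: r.count });
  }
  return [...byType.entries()].map(([key, values]) => ({ key, values }));
}

const records = [
  { type: 'on_client_connected', timestamp: 1483308000000, count: 20 },
  { type: 'on_client_subscribed', timestamp: 1483308000000, count: 13 }
];
console.log(seriesPerLogType(records).map(s => s.key));
// [ 'on_client_connected', 'on_client_subscribed' ]
```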
> From the EMQTT schema, which metrics would we be considering to show in the visualization as part of API usage monitoring?
@frenchbread can you give us some examples of the types of metrics we might expect? E.g. the name of one or more metric(s) and the data type(s). That way we can at least start sketching some ideas.
Added granularity filter:
This is what I meant by
> We could actually divide the current (general overview) chart that we have into multiple smaller ones where each chart would represent usage for a specific log type (e.g. `on_client_connected`, `on_client_subscribed` etc.) and color them differently.
Are we going to keep this (& I commit changes here)?
There are some drawbacks here:

- With `hour` granularity over a large date range, the dashboard freezes.

In other cases it works fine.
Why does the dashboard freeze with large date ranges?
I assume it happens due to rendering data for a large number of days (> 600 days).
Idea: we can define the filter queries for which the dashboard freezes, hide the chart in those cases, and show a placeholder message, e.g. "Too much data to show" or "Select a smaller date range".
Another idea is to have 'default' granularity settings, e.g. when selecting a two-year span, defaulting the query to monthly (or daily) granularity.
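A minimal sketch of such defaulting: pick a bucket interval from the selected span so the chart never has to render hundreds of points. The thresholds below are illustrative assumptions and would need tuning.

```javascript
// Choose a sensible default granularity from the selected date span,
// so e.g. a two-year range falls back to monthly buckets.
const DAY_MS = 24 * 60 * 60 * 1000;

function defaultGranularity(fromMs, toMs) {
  const days = (toMs - fromMs) / DAY_MS;
  if (days <= 3) return 'hour';
  if (days <= 90) return 'day';
  if (days <= 365) return 'week';
  return 'month'; // large spans default to monthly buckets
}

console.log(defaultGranularity(0, 2 * 365 * DAY_MS)); // month
```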
Removed grid lines & changed the datepicker to the pikaday library.
Moved the `log type` filter to a drop-down menu.

Is this ready for review, or will it be under construction a bit longer?
@brylie This is not yet ready.
Moved filtering by message type to its own select-picker.
Please review.
Woohoo! 😀
@brylie Could you merge this? If you are going to test it, I can send you the ES host with sample data.
This looks really good from a UI perspective! I made some suggestions on how to improve the code style, so our code conforms to our developer expectations.
Was this checked with live ES data? If not, @frenchbread, let's do a quick check with live data. First thing tomorrow morning, unless you are occupied. We exchanged docs on the event format, but still.
Added suggested changes. Please review
After the new Elasticsearch plugin for EMQ is deployed, this will require a little refactoring to fit the updated analytics log schema. But that could be done in a separate PR.
@phanimahesh As we discussed in the chat, yes, the dashboard has been tested with live/real ES data. For testing purposes (wider date ranges & collecting loading/rendering metrics), some dummy data was added in addition to the 'real' data generated by the plugin.
Added comments
Removed the second Meteor method. Please :mag_right:
Changes
How it looks
How fast it loads
TODOs
Closes #6