Closed abhagupta closed 8 years ago
Also, is there a way I can get the test info and the page that was tested from the sitespeed results? I see that browsertime produces this data in its info section:
"info": {
"browsertime": {
"version": "1.0.0-beta.9"
},
"url": "https://www.sitespeed.io",
"timestamp": "2016-11-03T10:36:17-07:00"
},
but this data is not collected by sitespeed. Is there any way I can get hold of the URL and timestamp?
Hey @abhagupta been home with a sick kid today and need to do some work but I'll get back tonight before I go to bed, sorry for the delay.
No problem.. take it easy!! I might be able to look through sitespeed's code.
First I've updated the plugin documentation yesterday, it still misses things but it explains parts much better than before: https://www.sitespeed.io/documentation/sitespeed.io/plugins/#create-your-own-plugin ping @tobli @beenanner would love to have your input there when you have time!
Yes, the structure has changed. I know that is not optimal and means more work; the reason is that the structure in 3.x wasn't sustainable. The new one looks (at least for now) like it will hold for years.
"browser": {}, "pageinfo": {}, "timings": {}, "coach":{}
In 4.0 all these metrics are collected using browsertime. We collect the default Javascripts in Browsertime https://github.com/sitespeedio/browsertime/tree/master/browserscripts and then add the Coach javascripts on top (and you can add your own). So in this case, timings are the important ones from Browsertime (like in 3.x).
My question is whether there is a recommended way to get the browsertime data from sitespeed so that I have minimal work in transformation
Does your plugin act on different messages? Then you can collect it from browsertime. I'm not 100% sure how you want to do this, so let's try to sync.
Also, does it make more sense to use coach metrics now instead of browsertime?
The coach metrics are more like YSlow. We still run some timing metrics inside the coach, but you should use the ones in Browsertime (or rather "timings") because they are the original ones, hold stats like median (if you want to use that), and also include SpeedIndex (still experimental).
Also, is there a way I can get the test info and the page that was tested from the sitespeed results? I see that browsertime produces this data in its info section
Do you collect the info from a message? Then you have the URL in the message (message.url). If you use the dataCollection you should look at the HTML plugin to see how you can use that data. I can help you more there later.
Thanks @soulgalore for the detailed answer. Makes sense. Last question: timings has one set of data even if I run multiple tests (of one URL) in one execution of sitespeed. So are the values inside timings the median of all runs? Or do I get a choice to get the 90th percentile as well?
@abhagupta you can get whatever you want. if you collect data directly from browsertime messages, browsertime sends browsertime.run message that contains all the data for each run and then browsertime.pageSummary that is median/p90 values etc for the runs for that URL. This is from my head, so I need to verify :)
Data from Browsertime has median, mean, and percentiles. One way of checking is just to run Browsertime standalone on a given url (use the version that corresponds to the Sitespeed.io version you use). It's hard to make a general recommendation for how to process data from Sitespeed to put in a database, it all depends on your use case.
For some cases it might be most convenient to write a simple plugin to extract just what you need and process that (e.g. writing to a database without intermediate json files). To see contents of messages you can pick up in plugins, run sitespeed.io with the --debug flag and -v or -vv.
On the other hand, if you just want Browsertime data, you can run Browsertime standalone without sitespeed.
I'm closing this issue now, please open a new issue if you find things that don't seem to work, or hit any limitations. Thanks!
Thanks @tobli, you brought up a very good point that I can just send the data directly to the database without generating intermediate JSON files. The only issue with KairosDB (which we are using) is that it needs an HTTP POST call to add a metric (unlike Graphite), and sending that many POST calls for all the metrics produced by sitespeed.io is going to overwhelm the communication between the plugin and the database. So here are a few things I am trying, and I might need your suggestions on this:
./node_modules/.bin/sitespeed.io http://www.example.com -b chrome -n 2 --metrics.filter *- coach.pageSummary.advice.performance.adviceList.*.score *- coach.pageSummary.advice.timings.* *- aggregateassets.*.* *-coach.* *- domains.* --plugins.load plugin.js
but I am still seeing a lot of aggregateassets metrics. I also do not want coach right now, but *- coach.* did not remove the coach metrics. Any suggestions on what I am doing wrong?
In fact, I just need browsertime and pagexray at this time. If you have the options handy for just these two, could you let me know?
If I can reduce the data to some 100 metrics, doing POST call wouldn't be bad. Please let me know if you have suggestions/alternatives.
Hey @abhagupta, the documentation is hard to understand, we need to work on that, sorry! The *- removes all configuration, so you need to do that only once (first) and then add the rest as you want. I think we can also make the filters easier to understand.
To get only timings from Browsertime and content types (size and request) from pagexray you can run like this:
bin/sitespeed.js --metrics.filter *- browsertime.pageSummary.statistics.timings.* pagexray.pageSummary.contentTypes.* -n 1 https://www.sitespeed.io
Best Peter
A custom plugin can look at as many or as few metric types as it wants (--metrics.filter only applies to the built-in graphite and influxdb plugins).
A simple plugin for posting browsertime and pagexray data can look like this:
'use strict';

const Promise = require('bluebird');
const http = require('http');

// POST a JSON string; resolves once the response has been fully read.
function postJson(json) {
  const options = {
    hostname: 'httpbin.org',
    path: '/post',
    method: 'POST',
    headers: { 'Content-Type': 'application/json' }
  };
  return new Promise((resolve, reject) => {
    const req = http.request(options, (res) => {
      // Drain the response body; 'end' is not emitted until it is consumed.
      res.resume();
      res.once('end', () => resolve());
    });
    req.once('error', (e) => reject(e));
    req.write(json);
    req.end();
  });
}

module.exports = {
  name() {
    return 'poster';
  },
  processMessage(message) {
    switch (message.type) {
      case 'browsertime.pageSummary': {
        // Send only the statistics, plus enough context to identify the test
        const data = {
          url: message.url,
          type: message.type,
          timestamp: message.timestamp,
          data: message.data.statistics
        };
        return postJson(JSON.stringify(data));
      }
      case 'pagexray.pageSummary':
        // Send the whole message as-is
        return postJson(JSON.stringify(message));
      default:
        // Ignore everything else
    }
  }
};
Run like this (note that --plugins.load needs to be last, for now at least):
sitespeed.io -n1 http://www.sitespeed.io --plugins.load ./poster.js
The code to pick out metrics from the messages can be as simple or complex as you need it to be.
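To keep the number of POST calls down, one option is to flatten the statistics into metric names, keep only the values you care about, and send them as a single batch per message. The sketch below only builds such a batch. The function names flatten and toBatch, the 'browsertime' name prefix, and the payload shape (loosely modelled on an array of metrics with datapoints and tags) are all assumptions for illustration, so verify the format, and any restrictions on tag values, against your database's documentation.

```javascript
'use strict';

// Hypothetical helper: turn nested numeric values into
// [{ name: 'a.b.c', value: 1 }, ...] using dotted names.
function flatten(node, prefix, out = []) {
  for (const key of Object.keys(node)) {
    const value = node[key];
    const name = `${prefix}.${key}`;
    if (typeof value === 'number') {
      out.push({ name, value });
    } else if (value !== null && typeof value === 'object') {
      flatten(value, name, out);
    }
  }
  return out;
}

// Hypothetical helper: build one batch for a browsertime.pageSummary message,
// keeping only metrics whose name ends with the given suffix (e.g. '.median').
// The payload shape is an assumption; adapt it to what your database expects.
function toBatch(message, suffix) {
  const timestamp = Date.parse(message.timestamp); // milliseconds since epoch
  return flatten(message.data.statistics, 'browsertime')
    .filter((metric) => metric.name.endsWith(suffix))
    .map((metric) => ({
      name: metric.name,
      datapoints: [[timestamp, metric.value]],
      // The URL may need sanitizing before it is accepted as a tag value
      tags: { url: message.url }
    }));
}
```

The resulting array can be serialized with JSON.stringify and sent in one request, for example with the postJson function from the plugin above.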
Thanks guys!! this will help a lot.
Hi, I just wanted to confirm, before I start the task of transforming the sitespeed 4 JSON output into our database, whether the sitespeed 4 output is entirely changed from sitespeed 3. In sitespeed 3, we used to have a data structure like this
In sitespeed, I am seeing something like
I do see har but no browsertime metrics. I checked the data produced by browsertime as well, using https://www.sitespeed.io/documentation/browsertime/#a-simple-example; that data structure has also changed a bit, although it looks similar. Something along these lines:
My question is whether there is a recommended way to get the browsertime data from sitespeed so that I have minimal work in transformation. Also, does it make more sense to use coach metrics now instead of browsertime?