Closed — muttmuure closed this issue 7 months ago.
@rinej started working on this today - he'll investigate the ways we can utilise json outputs from Reassure to showcase historical data and how we can pipe this through to eg. https://grafana.com/
@mountiny I think we can assign @rinej to this one 🙂
Grafana would be ideal as we use that already!
@mountiny Hello, today I started investigating the approach; I plan to use Grafana.
📣 @rinej! 📣 Hey, it seems we don’t have your contributor details yet! You'll only have to do this once, and this is how we'll hire you on Upwork. Please follow these steps:
Contributor details
Your Expensify account email: <REPLACE EMAIL HERE>
Upwork Profile Link: <REPLACE LINK HERE>
Triggered auto assignment to @stephanieelliott (NewFeature), see https://stackoverflowteams.com/c/expensify/questions/14418#:~:text=BugZero%20process%20steps%20for%20feature%20requests for more details.
Hello mountiny,
We created an example approach using the JSON REST API Grafana plugin -> https://grafana.com/grafana/plugins/marcusolsson-json-datasource/
Here is the example dashboard:
In that scenario, we created a simple node server that exposed the endpoints with JSON data, which Grafana consumed. It has some limitations; we will have to maintain a separate instance of the server to deliver data.
We are exploring the other two approaches:
This will not require any additional server, just, for example, an S3 bucket to store the data. I will let you know when we have some insights about the approaches.
Meanwhile, do you have any preferences over what needs to be displayed and shown on the graph? I can send you the example output file from Reassure; just let me know! :)
Thank you @rinej!
> Storing just the JSON file in the S3 bucket and updating it when the user runs tests.
I think this sounds good to me; we already have solutions where we save files to an S3 bucket from GH Actions, so it should also be reasonably easy to achieve.
> Meanwhile, do you have any preferences over what needs to be displayed and shown on the graph?
I was imagining that we would output the results of main branch over time, when the results get worse we can see based on timestamp what PR caused it, but we could just keep data about the results on main for various test scenarios and see how they evolved overtime
Thanks @mountiny for your answers! We created a POC using the JSON REST API Grafana plugin.
JSON API plugin: https://grafana.com/grafana/plugins/marcusolsson-json-datasource/ — the plugin needs one endpoint which returns the data.
Data used for the POC response (3 test cases run on different dates on different branches):

```json
[
  {"name": "should render Composer with text input interactions", "type": "render", "meanDuration": 18.244273030757904, "creationDate": "2023-11-17T14:49:26.957Z", "commitHash": "137a4c761ee686bd6e094ec025b0dcc3f5f454e1", "branch": "some-branch"},
  {"name": "should press add attachemnt button", "type": "render", "meanDuration": 24.91762231886387, "creationDate": "2023-11-17T14:49:26.957Z", "commitHash": "137a4c761ee686bd6e094ec025b0dcc3f5f454e1", "branch": "some-branch"},
  {"name": "should press add emoji button", "type": "render", "meanDuration": 12.9931875705718993, "creationDate": "2023-11-17T14:49:26.957Z", "commitHash": "137a4c761ee686bd6e094ec025b0dcc3f5f454e1", "branch": "some-branch"},
  {"name": "should render Composer with text input interactions", "type": "render", "meanDuration": 13.244273030757904, "creationDate": "2023-11-15T12:45:04.146Z", "commitHash": "137a4c761ee686bd6e094ec025b0dcc3f5f454e1", "branch": "branch123"},
  {"name": "should press add attachemnt button", "type": "render", "meanDuration": 16.91762231886387, "creationDate": "2023-11-15T12:45:04.146Z", "commitHash": "137a4c761ee686bd6e094ec025b0dcc3f5f454e1", "branch": "branch123"},
  {"name": "should press add emoji button", "type": "render", "meanDuration": 2.9931875705718993, "creationDate": "2023-11-15T12:45:04.146Z", "commitHash": "137a4c761ee686bd6e094ec025b0dcc3f5f454e1", "branch": "branch123"},
  {"name": "should render Composer with text input interactions", "type": "render", "meanDuration": 14.244273030757904, "creationDate": "2023-11-16T14:49:26.957Z", "commitHash": "137a4c761ee686bd6e094ec025b0dcc3f5f454e1", "branch": "test-branch"},
  {"name": "should press add attachemnt button", "type": "render", "meanDuration": 20.91762231886387, "creationDate": "2023-11-16T14:49:26.957Z", "commitHash": "137a4c761ee686bd6e094ec025b0dcc3f5f454e1", "branch": "test-branch"},
  {"name": "should press add emoji button", "type": "render", "meanDuration": 5.9931875705718993, "creationDate": "2023-11-16T14:49:26.957Z", "commitHash": "137a4c761ee686bd6e094ec025b0dcc3f5f454e1", "branch": "test-branch"}
]
```
Example dashboards:
- A: table view, where we can choose from a dropdown which test data we want to display
- B: tests grouped by name on a timeline, where we can preview additional information about each point
- C: table view for all test data
- D: tests grouped by name on one graph
I think dashboards A and B will be the most useful, so we can see on the timeline when performance has dropped
🎯 What needs to be done:
❓ Questions for @mountiny:
@rinej Looks very promising, thanks! I will discuss with our infra team who could help us with the implementation, as I do not have access to edit S3 or add this plugin to Grafana if we do not have it yet.
Triggered auto assignment to @johnmlee101 (ring0), see https://stackoverflow.com/c/expensify/questions/6102 for more details.
@mountiny I don't have access to the above link, or the link is broken. Could you give me access?
Sorry, I cannot give you access to that one. It was Thanksgiving week, so @johnmlee101 will be back this week, and I am sure he is going to be able to help us with this.
> Upload output.json file to S3 bucket
Okay might need to figure out some of these details. I think S3 might be overkill if we just need to store and fetch. Going to check one last thing with Infra
@mountiny @rinej https://docs.github.com/en/actions/using-workflows/storing-workflow-data-as-artifacts
@justinpersaud brought up using artifacts if we don't care about the data lasting for too long. We have a max 90 day policy.
As for the REST endpoint, if we can easily find a way to list all artifacts this way, that would be ideal.
@johnmlee101 @mountiny you mentioned that you already use Grafana for some metrics — what kind of data source do you use? Do you use a service like https://prometheus.io/ as recommended by Grafana? Maybe adding an integration to the existing flow would be more efficient, instead of creating something new (like S3)?
Do you use Grafana cloud service or standalone app?
@rinej we do use S3 already, but I think in this case @johnmlee101 is suggesting using GitHub artifacts to store the results JSON for 90 days, which is fine. Then we can show those in Grafana just fine too.
@johnmlee101 With that approach, we still need to create the endpoint that retrieves all the artifacts from GitHub and prepares the data for Grafana. Do you have some infrastructure that we can use?
We can probably use this API to get artifacts: the GitHub Artifacts API.
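As a sketch of what fetching the artifact list could look like (this is not part of the proposal — the repo names and token handling here are placeholder assumptions, using the public `GET /repos/{owner}/{repo}/actions/artifacts` REST endpoint):

```javascript
// Sketch: listing workflow artifacts via the GitHub REST API.
// Requires Node 18+ for the global fetch(). Token is a placeholder.

// Build the list-artifacts URL for a repository.
function buildArtifactsUrl(owner, repo, perPage = 30) {
  return `https://api.github.com/repos/${owner}/${repo}/actions/artifacts?per_page=${perPage}`;
}

// Fetch the artifact list; each entry has id, name, created_at, etc.
async function listArtifacts(owner, repo, token) {
  const res = await fetch(buildArtifactsUrl(owner, repo), {
    headers: {
      Accept: "application/vnd.github+json",
      Authorization: `Bearer ${token}`,
    },
  });
  if (!res.ok) throw new Error(`GitHub API returned ${res.status}`);
  const body = await res.json();
  return body.artifacts;
}
```

An endpoint like this could then feed the JSON datasource plugin, though someone would still need to host the glue code.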
@mountiny, do you currently use S3 to connect to Grafana, or do you use other plugins for Grafana? I am asking because if you already have some data storage for Grafana, maybe we can use it for the dashboard as well.
I am not sure what we use for Grafana now; I would defer to @johnmlee101's opinion here, as I am not that familiar with this. Sorry for not being of better help.
Thanks for the feedback @mountiny!
@johnmlee101 are you able to check what Expensify is currently using to connect to Grafana?
We do not use S3 to connect to Grafana today, instead we use ElasticSearch & Graphite.
What could also be easier is allowing GH Actions to authenticate against our Graphite endpoint somehow, to upload this specific data without the need for formatting JSON. We can do it all from the workflow.
That would be great, so we can reuse the existing architecture instead of creating something new.
Do you know if we can store the JSON data in Graphite, so we don't need to save it as artifacts on GH?
So the flow would be: run Reassure tests in CI -> upload output to Graphite -> connect Graphite with Grafana to display graphs. What do you think?
Do you have some example account in Graphite so we can investigate how the data is stored there?
Hmm, does it have to be uploaded as JSON? I wonder if we can interact directly with graphite. https://graphiteapp.org/quick-start-guides/feeding-metrics.html this is how metrics are uploaded
Thanks, we will create a proposal using Graphite.
Hello @johnmlee101,
I created some POC and have some questions:
When using Graphite, we assume we will use the plaintext protocol to upload Reassure data to the Graphite DB -> https://graphite.readthedocs.io/en/latest/feeding-carbon.html#the-plaintext-protocol Do you know how we can authenticate that request using nc? Do we have some examples of how you make authorized calls to the Graphite DB?
Do you have any suggestions for the structure of the metric keys in Graphite? As Graphite stores data as key/value pairs, we might consider a structure like this: `reassure.{branchname}.{testcasename}: value`
Also, what do we want to track as a metric for our case? I would go with `meanDuration`:
- It is calculated by adding up all the values in perf time and then dividing the sum by the number of values.
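In other words, the arithmetic mean of the per-run durations, which can be sketched as:

```javascript
// meanDuration as described above: sum of per-run durations
// divided by the number of runs.
function meanDuration(durations) {
  if (durations.length === 0) return 0;
  const sum = durations.reduce((acc, duration) => acc + duration, 0);
  return sum / durations.length;
}
```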
Here is example data from Reassure's `current.perf`:

```json
[
  {"name": "should render Composer with text input interactions", "type": "render", "meanDuration": 18.244273030757904, "creationDate": "2023-11-17T14:49:26.957Z", "commitHash": "137a4c761ee686bd6e094ec025b0dcc3f5f454e1", "branch": "some-branch"},
  {"name": "should press add attachemnt button", "type": "render", "meanDuration": 24.91762231886387, "creationDate": "2023-11-17T14:49:26.957Z", "commitHash": "137a4c761ee686bd6e094ec025b0dcc3f5f454e1", "branch": "some-branch"},
  {"name": "should press add emoji button", "type": "render", "meanDuration": 12.9931875705718993, "creationDate": "2023-11-17T14:49:26.957Z", "commitHash": "137a4c761ee686bd6e094ec025b0dcc3f5f454e1", "branch": "some-branch"},
  {"name": "should render Composer with text input interactions", "type": "render", "meanDuration": 13.244273030757904, "creationDate": "2023-11-15T12:45:04.146Z", "commitHash": "137a4c761ee686bd6e094ec025b0dcc3f5f454e1", "branch": "branch123"},
  {"name": "should press add attachemnt button", "type": "render", "meanDuration": 16.91762231886387, "creationDate": "2023-11-15T12:45:04.146Z", "commitHash": "137a4c761ee686bd6e094ec025b0dcc3f5f454e1", "branch": "branch123"},
  {"name": "should press add emoji button", "type": "render", "meanDuration": 2.9931875705718993, "creationDate": "2023-11-15T12:45:04.146Z", "commitHash": "137a4c761ee686bd6e094ec025b0dcc3f5f454e1", "branch": "branch123"},
  {"name": "should render Composer with text input interactions", "type": "render", "meanDuration": 14.244273030757904, "creationDate": "2023-11-16T14:49:26.957Z", "commitHash": "137a4c761ee686bd6e094ec025b0dcc3f5f454e1", "branch": "test-branch"},
  {"name": "should press add attachemnt button", "type": "render", "meanDuration": 20.91762231886387, "creationDate": "2023-11-16T14:49:26.957Z", "commitHash": "137a4c761ee686bd6e094ec025b0dcc3f5f454e1", "branch": "test-branch"},
  {"name": "should press add emoji button", "type": "render", "meanDuration": 5.9931875705718993, "creationDate": "2023-11-16T14:49:26.957Z", "commitHash": "137a4c761ee686bd6e094ec025b0dcc3f5f454e1", "branch": "test-branch"}
]
```
We created a simple POC using a local Graphite; with it, we can display data from Graphite for single test cases in a timeseries graph:
So the flow for that approach would be:

1. Open the `current.perf` file
2. Parse it with grep, jq, sed, or awk to get the data we need (prepare key/value pairs, e.g. `reassure.branch1.shouldPressAddEmojiButton: 20.9176`)
3. Make a call to Graphite using nc:

```shell
echo -e "reassure.branch1.shouldPressAddEmojiButton 20.9176 $TIMESTAMP\nreassure.branch1.shouldPressAddAttachementButton 5.92 $TIMESTAMP" | nc {GRAPHITE_SERVER} {GRAPHITE_PORT}
```

4. Display the data in Grafana using the Graphite plugin
Thanks for answering the questions!
Hello @johnmlee101, have you had a chance to look at the proposal and the questions?
Waiting for @johnmlee101 to pick this up in new year
Hey @johnmlee101 - looks like the team is awaiting your input here. Is there a chance you'll be available for this in the next few days to help push this forward? :)
@johnmlee101 I know you have been busy with some other stuff — would you prefer to reassign this issue? We are very eager to get this dashboard going.
John is the deployer this week, I think, so his hands are full — hopefully we can pick this up next week.
Hey @johnmlee101 are you able to pick this back up next week, or should we reassign?
Bump @johnmlee101, do you have an idea of when you can pick this back up?
Sorry for not getting back to this. I've been staring at the overdues and I should have starred myself here.
Let me ask and see what external credentials we need to authorize these calls
Okay, if we pin the runner to a certain IP, we can whitelist it for this case 👍
Hello @mountiny @johnmlee101, thanks for the answers! Here is our update:
We created an example dashboard with a chart for two values: `meanDuration` and `meanCount`, where each line is a single test case; when hovering over a point we can see which branch it is from:
For the POC we used Graphite in the docker container and local Grafana, installed using homebrew:
Here is the proposed solution for structuring the data:

```
{BUCKET_NAME}.reassure.{METRIC_NAME}.{BRANCH}.{TEST_NAME}
```

- `BUCKET_NAME`: main global bucket
- `METRIC_NAME`: name of the tracked metric (e.g. `meanDuration` and `meanCount`)
- `BRANCH`: name of the branch
- `TEST_NAME`: name of the specific test

Here is an example path to the data:

```
bucket1.reassure.meanCount.branch9.[SearchPage]-should-interact-when-text-input-changes
```
To feed the data as recommended we use graphite plaintext protocol: https://graphite.readthedocs.io/en/latest/feeding-carbon.html#the-plaintext-protocol
Example bash script used for the POC (note: `TIMESTAMP` must be set to the current time in seconds, and one of the metric paths in the original draft had a stray double dot, fixed below):

```shell
#!/bin/bash

# Current time in seconds, as Graphite expects
TIMESTAMP=$(date +%s)

BRANCH="branch4"
TEST_NAME1="[SearchPage]-should-interact-when-text-input-changes"
TEST_NAME2="[SearchPage]-should-render-options-list"
TEST_NAME3="[SearchPage]-should-search-in-options-list"

# Metric data
METRIC_PATH1="bucket1.reassure.meanDuration.${BRANCH}.${TEST_NAME1}"
METRIC_VALUE1=55.6
METRIC_PATH2="bucket1.reassure.meanDuration.${BRANCH}.${TEST_NAME2}"
METRIC_VALUE2=68.92
METRIC_PATH3="bucket1.reassure.meanDuration.${BRANCH}.${TEST_NAME3}"
METRIC_VALUE3=46.84
METRIC_PATH4="bucket1.reassure.meanCount.${BRANCH}.${TEST_NAME1}"
METRIC_VALUE4=4
METRIC_PATH5="bucket1.reassure.meanCount.${BRANCH}.${TEST_NAME2}"
METRIC_VALUE5=6
METRIC_PATH6="bucket1.reassure.meanCount.${BRANCH}.${TEST_NAME3}"
METRIC_VALUE6=4

echo -e "$METRIC_PATH1 $METRIC_VALUE1 $TIMESTAMP\n$METRIC_PATH2 $METRIC_VALUE2 $TIMESTAMP\n$METRIC_PATH3 $METRIC_VALUE3 $TIMESTAMP\n$METRIC_PATH4 $METRIC_VALUE4 $TIMESTAMP\n$METRIC_PATH5 $METRIC_VALUE5 $TIMESTAMP\n$METRIC_PATH6 $METRIC_VALUE6 $TIMESTAMP" | nc localhost 2003
```
In the real version, the final string will be built from the Reassure output file in a script we add to the GitHub Action.
To integrate Grafana, we used the Graphite plugin (https://grafana.com/grafana/plugins/graphite/), which handles the connection to Graphite in the Docker container. To display the data grouped by test name, we used the following query:
```
groupByNode(bucket1.reassure.meanDuration.*.*, 4)
```

We grouped the data by the 4th node, which in our case is the `TEST_NAME` node.
One of the challenges was how to display not only the test name but also the branch from which the test was run. To achieve this, we can add another query and group it by the branch name; in our case, the branch name is node number 3:

```
groupByNode(bucket1.reassure.meanDuration.*.*, 3)
```

That solution is not perfect because it adds additional data to the chart, so it may be worth investigating further in the real Grafana environment. But the metadata about test name, branch, metric, and values will be stored in Graphite.
Here is a short video demonstrating how to display the branch on the chart:
https://github.com/Expensify/App/assets/2650904/723c9370-88c2-48d9-aa03-87fc19b5594f
Next steps we are working on:
@mountiny @johnmlee101 I am adding a proposal for parsing and sending Reassure data to Graphite.

In order to parse the Reassure `output.json` and send it to the Graphite server, we need to create two pieces: a `getGraphiteString.js` script and a `send-reassure-to-graphite` workflow.

`getGraphiteString.js`:
```javascript
const core = require("@actions/core");
const fs = require("fs");

const run = () => {
  // Prefix path to the graphite metric
  const GRAPHITE_PATH = "bucket1.reassure";

  const regressionOutput = JSON.parse(
    fs.readFileSync(".reassure/output.json", "utf8")
  );

  const commitHash = regressionOutput.metadata.current.commitHash;
  const creationDate = regressionOutput.metadata.current.creationDate;
  const timestampInMili = new Date(creationDate).getTime();
  // Graphite accepts timestamps in seconds
  const timestamp = Math.floor(timestampInMili / 1000);

  // We need to combine all tests from the 4 buckets
  const reassureTests = [
    ...regressionOutput.meaningless,
    ...regressionOutput.significant,
    ...regressionOutput.countChanged,
    ...regressionOutput.added,
  ];

  // Map through every test and create strings for meanDuration and meanCount
  const graphiteString = reassureTests
    .map((test) => {
      const current = test.current;
      // Graphite doesn't accept metric names with spaces, so we replace spaces with "-"
      const formattedName = current.name.split(" ").join("-");
      const meanDurationString = `${GRAPHITE_PATH}.meanDuration.hash-${commitHash}.${formattedName} ${current.meanDuration} ${timestamp}`;
      const meanCountString = `${GRAPHITE_PATH}.meanCount.hash-${commitHash}.${formattedName} ${current.meanCount} ${timestamp}`;
      return `${meanDurationString}\n${meanCountString}`;
    })
    .join("\n");

  // Set the generated graphite string as a GitHub output variable
  core.setOutput("graphiteString", graphiteString);
};

if (require.main === module) {
  run();
}

module.exports = run;
```
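To illustrate the mapping above in isolation, here is a standalone sketch of the plaintext lines the script emits for one test record (the record, hash, and timestamp below are made up):

```javascript
// Standalone sketch of the two plaintext lines produced per test.
// The record, commit hash, and timestamp are illustrative only.
const GRAPHITE_PATH = "bucket1.reassure";
const commitHash = "137a4c7";
const timestamp = 1700232566;

const test = {
  current: { name: "should press add emoji button", meanDuration: 12.99, meanCount: 4 },
};

// Replace spaces with "-" as Graphite metric names cannot contain spaces.
const formattedName = test.current.name.split(" ").join("-");
const lines = [
  `${GRAPHITE_PATH}.meanDuration.hash-${commitHash}.${formattedName} ${test.current.meanDuration} ${timestamp}`,
  `${GRAPHITE_PATH}.meanCount.hash-${commitHash}.${formattedName} ${test.current.meanCount} ${timestamp}`,
].join("\n");

console.log(lines);
```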
`send-reassure-to-graphite`:

```yaml
name: send-reassure-to-graphite

on:
  push:
    branches: [main]

jobs:
  testJob:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Set up Node.js
        uses: actions/setup-node@v4
        with:
          node-version: "20"

      # In the final version, for setting up Node we will use the existing action from the repo:
      # - name: Setup NodeJS
      #   uses: ./.github/actions/composite/setupNode

      - name: Install dependencies
        run: npm install

      - name: Run JavaScript file
        id: saveGraphiteString
        run: node ./.github/actions/javascript/getGraphiteString.js

      - name: Send graphite data
        run: echo -e "${{ steps.saveGraphiteString.outputs.graphiteString }}" | nc $GRAPHITE_SERVER $GRAPHITE_PORT
```
In the GH action `send-reassure-to-graphite`, we call the JS function `getGraphiteString`. The function is responsible for parsing `output.json` and setting the `graphiteString` variable with all the necessary data. In the next step, we make the `nc` call to the Graphite server. The `$GRAPHITE_SERVER` and `$GRAPHITE_PORT` values will need to be set in the GitHub env variables.
@mountiny In that approach we are saving data by `commitHash`; we are investigating a way to save by PR number instead. I think we might handle it in the GitHub Action.

Some concerns:

- bandwidth of the `nc` Graphite call: when we have many more Reassure tests, some data might be lost

cc @adhorodyski
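One possible mitigation for the bandwidth concern (a sketch, not part of the proposal) would be to split the generated Graphite string into smaller batches of lines and send each batch separately, rather than one large payload:

```javascript
// Sketch: split a large Graphite plaintext payload into batches of
// lines, so each socket write stays small. Batch size is arbitrary.
function batchLines(graphiteString, batchSize) {
  const lines = graphiteString.split("\n");
  const batches = [];
  for (let i = 0; i < lines.length; i += batchSize) {
    batches.push(lines.slice(i, i + batchSize).join("\n"));
  }
  return batches;
}
```

Each batch could then be piped to `nc` in its own call.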
@mountiny @johnmlee101 Here is the final version of the scripts. As you requested, with this we will have access to the PR number, so we can save it in the Graphite string. Also, we had to decide when to run the script; ideally just once, after the PR is merged to main.
Please let us know what you think; I am going to start working on a draft PR.
Assumptions:

- Reassure generates the `output.json` on a separate branch every time we create a PR or modify it
- The existing `reassurePerformanceTests` workflow already runs on the `pull_request` event

Under these assumptions, we propose that instead of creating a new GH Action triggered on `main`, we will modify the current `reassurePerformanceTests` workflow. We need to make sure that the steps responsible for making the call to Graphite are triggered only after merging to `main`.

Pros of this approach:

- We have direct access to `output.json`, because we are in the same job all the time
- We have access to the `prNumber`, which means we can save this data in Graphite
- We reuse the existing `reassurePerformanceTests.yml`

Here are the steps we will add to the end of `reassurePerformanceTest.yaml`:
```yaml
- name: Run JavaScript file
  # run only when merged to main
  if: github.event.pull_request.merged == true && github.event.pull_request.base.ref == 'main'
  id: saveGraphiteString
  run: node ./.github/actions/javascript/getGraphiteString.js

- name: Send graphite data
  # run only when merged to main
  if: github.event.pull_request.merged == true && github.event.pull_request.base.ref == 'main'
  run: echo -e "${{ steps.saveGraphiteString.outputs.graphiteString }}" | nc $GRAPHITE_SERVER $GRAPHITE_PORT
```
Here is the modified `getGraphiteString.js`:
```javascript
const core = require("@actions/core");
const github = require("@actions/github");
const fs = require("fs");

const run = () => {
  // Prefix path to the graphite metric
  const GRAPHITE_PATH = "bucket1.reassure";

  const regressionOutput = JSON.parse(
    fs.readFileSync(".reassure/output.json", "utf8")
  );

  const creationDate = regressionOutput.metadata.current.creationDate;
  const timestampInMili = new Date(creationDate).getTime();
  // Graphite accepts timestamps in seconds
  const timestamp = Math.floor(timestampInMili / 1000);

  // Get the PR number from the github context
  const prNumber = github.context.payload.pull_request.number;

  // We need to combine all tests from the 4 buckets
  const reassureTests = [
    ...regressionOutput.meaningless,
    ...regressionOutput.significant,
    ...regressionOutput.countChanged,
    ...regressionOutput.added,
  ];

  // Map through every test and create strings for meanDuration and meanCount
  const graphiteString = reassureTests
    .map((test) => {
      const current = test.current;
      // Graphite doesn't accept metric names with spaces, so we replace spaces with "-"
      const formattedName = current.name.split(" ").join("-");
      const meanDurationString = `${GRAPHITE_PATH}.meanDuration.PR-${prNumber}.${formattedName} ${current.meanDuration} ${timestamp}`;
      const meanCountString = `${GRAPHITE_PATH}.meanCount.PR-${prNumber}.${formattedName} ${current.meanCount} ${timestamp}`;
      return `${meanDurationString}\n${meanCountString}`;
    })
    .join("\n");

  // Set the generated graphite string as a GitHub output variable
  core.setOutput("graphiteString", graphiteString);
};

if (require.main === module) {
  run();
}

module.exports = run;
```
So now, instead of using the commit hash or branch name, we save the PR number in the Graphite path.
Hey @johnmlee101 @mountiny just a reminder to check out the above when you get a chance!
@johnmlee101 @mountiny
I created the PR with the GH Action which parses the `output.json` from the Reassure tests and makes a call to the Graphite server. Please have a look: https://github.com/Expensify/App/pull/36480
Also, we will need to set the environment variables `$GRAPHITE_SERVER` and `$GRAPHITE_PORT`.
Do you have some suggestions for how we can test the flow before merging to main?
thank you! posting in slack too
Latest question for @johnmlee101 is here: https://expensify.slack.com/archives/C05LX9D6E07/p1708601197525589?thread_ts=1707855242.234049&cid=C05LX9D6E07
Grabbing the latest from Slack:
Added the fix, the checks pass. Thanks @john! So I think when all the backend changes are there, we can merge the test PR and see if it works :) -> https://github.com/Expensify/App/pull/37019
The `Reviewing` label has been removed, please complete the "BugZero Checklist".
The solution for this issue has been :rocket: deployed to production :rocket: in version 1.4.47-10 and is now subject to a 7-day regression period :calendar:. Here is the list of pull requests that resolve this issue:
If no regressions arise, payment will be issued on 2024-03-13. :confetti_ball:
For reference, here are some details about the assignees on this issue:
BugZero Checklist: The PR adding this new feature has been merged! The following checklist (instructions) will need to be completed before the issue can be closed:
Skipping the payment summary for this issue since all the assignees are employees or vendors. If this is incorrect, please manually add the payment summary SO.
The `Reviewing` label has been removed, please complete the "BugZero Checklist".
The goal of the #newdot-performance project is achieving, maintaining (and protecting) these benchmarks for actions taken in app:
- TTI: 3 seconds
- Chat switching: 500ms
- Sending a message: 100ms
- Switching Active Workspace: 300ms
We have made good progress, in that the app is stable and performant on an anecdotal basis, and the wider company can focus on other areas of the roadmap. We would like to solidify this feeling with hard evidence and protect the app from performance regressions.
We would like to begin formally reporting on the performance benchmarks so we know they are being maintained. And we will use this data to protect the app’s benchmarked performance by “time-stamping” when performance regressions originate.