utopia-rise / godot-kotlin-jvm

Godot Kotlin JVM Module
MIT License

Automatically collect and track benchmark results over time #633

Open chippmann opened 1 month ago

chippmann commented 1 month ago

We have discussed this internally a few times and have already outlined the first steps. The goal is to track the performance of our binding over time in an automated and reproducible way.

As a reminder: we already have a benchmark project that tests the performance of our binding in key areas against typed and untyped GDScript. It already outputs a usable JSON file with key metrics and raw data.
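For illustration, the key metrics stored per benchmark and language (`avg`, `min`, `max`, `median`, `p05`, `p95`) could be derived from raw samples roughly as follows. This is a minimal sketch, not the benchmark project's actual code: the helper names `percentile` and `summarize` are hypothetical, and the linear-interpolation percentile method is an assumption, since the issue does not say how the percentiles are computed.

```javascript
// Hypothetical helper: percentile by linear interpolation between the
// two closest ranks. `sorted` must be a non-empty ascending array.
function percentile(sorted, p) {
  if (sorted.length === 0) {
    throw new Error('cannot compute a percentile of zero samples');
  }
  var rank = (p / 100) * (sorted.length - 1);
  var lower = Math.floor(rank);
  var upper = Math.ceil(rank);
  var weight = rank - lower;
  return sorted[lower] * (1 - weight) + sorted[upper] * weight;
}

// Hypothetical helper: reduce raw samples to the six tracked metrics.
function summarize(samples) {
  // Copy before sorting so the caller's array is left untouched.
  var sorted = samples.slice().sort(function (a, b) { return a - b; });
  var sum = sorted.reduce(function (acc, value) { return acc + value; }, 0);
  return {
    avg: sum / sorted.length,
    min: sorted[0],
    max: sorted[sorted.length - 1],
    median: percentile(sorted, 50),
    p05: percentile(sorted, 5),
    p95: percentile(sorted, 95)
  };
}
```

Keeping the raw samples alongside these summaries means the percentile method can be changed later without rerunning old benchmarks.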

While this is useful, it was never a clear indication of our performance: the benchmarks were never run on the same machine over time, and they had to be executed manually.

Hence we decided on the following, which this issue should track:

The following has already been done:

```javascript
function doPost(e) {
  var jsonData = JSON.parse(e.postData.contents);
  var timestamp = new Date().toLocaleString();
  var spreadsheet = SpreadsheetApp.openById(spreadsheetId);

  // Removing all existing charts in the Dashboard
  var dashboard = spreadsheet.getSheetByName("Dashboard");
  if (dashboard) {
    var charts = dashboard.getCharts();
    for (var i = 0; i < charts.length; i++) {
      dashboard.removeChart(charts[i]);
    }
  }

  for (var benchmark in jsonData['data']) {
    for (var language in jsonData['data'][benchmark]) {
      // Naming the sheet as 'benchmark|language'
      var sheetName = benchmark + "|" + language;
      var sheet = spreadsheet.getSheetByName(sheetName);

      // If the sheet does not exist, create a new one
      if (!sheet) {
        sheet = spreadsheet.insertSheet(sheetName);
        var header = ['Timestamp', 'avg', 'min', 'max', 'median', 'p05', 'p95'];
        sheet.appendRow(header);
      }

      // Convert JSON data to row data
      var row = [timestamp];
      row.push(jsonData['data'][benchmark][language]['avg']);
      row.push(jsonData['data'][benchmark][language]['min']);
      row.push(jsonData['data'][benchmark][language]['max']);
      row.push(jsonData['data'][benchmark][language]['median']);
      row.push(jsonData['data'][benchmark][language]['p05']);
      row.push(jsonData['data'][benchmark][language]['p95']);

      // Append language data to sheet
      sheet.appendRow(row);
    }
  }

  // Create benchmark comparison line graphs
  var allSheets = spreadsheet.getSheets();
  if (!dashboard) {
    dashboard = spreadsheet.insertSheet("Dashboard");
  }

  var benchmarkCharts = {};
  var benchmarkChartsSeries = {};

  for (var i = 0; i < allSheets.length; i++) {
    var sheetName = allSheets[i].getName();
    if (sheetName != "Dashboard") {
      var benchmarkName = sheetName.split("|")[0];
      var languageName = sheetName.split("|")[1];
      var lastRow = allSheets[i].getLastRow();

      // If it's a new benchmark, initialize a new chart builder
      if (!(benchmarkName in benchmarkCharts)) {
        benchmarkChartsSeries[benchmarkName] = [{labelInLegend: languageName}];
        benchmarkCharts[benchmarkName] = dashboard.newChart()
          .asLineChart()
          .setOption('title', 'Performance Trend for ' + benchmarkName)
          .setOption('hAxis.title', 'Time')
          .setOption('vAxis.title', 'Average Score');
      } else {
        benchmarkChartsSeries[benchmarkName].push({labelInLegend: languageName});
      }

      // Adding the range from current sheet excluding header and adding language as a series
      benchmarkCharts[benchmarkName].addRange(allSheets[i].getRange(2, 1, lastRow - 1, 2));
    }
  }

  // Build and insert all benchmark charts
  var position = 1;
  for (var benchmark in benchmarkCharts) {
    // Update position for the new chart.
    benchmarkCharts[benchmark]
      .setPosition(position * 20, 1, 0, 0)
      .setOption('series', benchmarkChartsSeries[benchmark]);
    var chart = benchmarkCharts[benchmark].build();
    dashboard.insertChart(chart);
    position++; // Update position for the next chart
  }
}
```
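The row-building part of the script above depends only on the payload shape (`data[benchmark][language]` holding the six metrics), so it can be checked outside Apps Script. The sketch below is one way to do that; `payloadToRows` and `METRICS` are hypothetical names introduced for illustration and are not part of the deployed script.

```javascript
// Column order matching the header row written by doPost.
var METRICS = ['avg', 'min', 'max', 'median', 'p05', 'p95'];

// Hypothetical pure helper: flatten a payload into one entry per
// 'benchmark|language' sheet, with the row that would be appended.
function payloadToRows(payload, timestamp) {
  var rows = [];
  Object.keys(payload.data).forEach(function (benchmark) {
    Object.keys(payload.data[benchmark]).forEach(function (language) {
      var metrics = payload.data[benchmark][language];
      var values = METRICS.map(function (name) { return metrics[name]; });
      rows.push({
        sheetName: benchmark + '|' + language,
        row: [timestamp].concat(values)
      });
    });
  });
  return rows;
}
```

Since this function does not touch `SpreadsheetApp`, it can run under Node with a sample JSON file before the script is deployed.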



The following needs to be done:
- [ ] Give RDP access to other maintainers for maintenance
- [ ] Add self-hosted runner to utopia-rise organisation
- [ ] Set up a workflow to run the benchmarks on changes
- [ ] Set up the final spreadsheet and Apps Script
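As a rough sketch of the upload step such a workflow would need, the benchmark JSON could be sent to the deployed web app with a plain HTTP POST, which Apps Script exposes to `doPost` as `e.postData.contents`. The URL below is a placeholder, the function names are hypothetical, and the global `fetch` assumes Node 18 or newer on the runner.

```javascript
// Placeholder only: the real deployment URL is not part of this issue.
var WEB_APP_URL = 'https://script.google.com/macros/s/DEPLOYMENT_ID/exec';

// Hypothetical helper: build the fetch options for one upload.
function buildRequest(results) {
  return {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    // doPost parses this raw body with JSON.parse.
    body: JSON.stringify(results)
  };
}

// Hypothetical helper: upload the results and fail loudly on HTTP errors,
// so a broken upload fails the workflow run instead of passing silently.
async function postResults(url, results) {
  var response = await fetch(url, buildRequest(results));
  if (!response.ok) {
    throw new Error('Upload failed with HTTP ' + response.status);
  }
  return response.text();
}
```

A workflow step would then read the benchmark output file and call `postResults(WEB_APP_URL, results)` once per run.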

The following points can be improved at a later stage:
- Possibly migrate from Apps Script and Google Sheets to Firebase Cloud Functions and Firestore (or Supabase equivalents)
- Run benchmarks as part of the PR pipeline to catch possible performance problems during review
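A PR check along these lines could compare a run against a stored baseline and flag large changes. The sketch below is only one possible shape: `findRegressions` is a hypothetical name, the 10% tolerance in the usage note is arbitrary, and it assumes `avg` is a duration where lower is better. If the metric is a score where higher is better, the comparison needs to be flipped.

```javascript
// Hypothetical helper: list benchmark/language pairs whose average got
// worse than `tolerance` (a fraction, e.g. 0.1 for 10%) versus the baseline.
// Assumption: `avg` is a time-like value, so a larger value is a regression.
function findRegressions(baseline, current, tolerance) {
  var regressions = [];
  Object.keys(current.data).forEach(function (benchmark) {
    Object.keys(current.data[benchmark]).forEach(function (language) {
      var before = baseline.data[benchmark] && baseline.data[benchmark][language];
      if (!before) {
        return; // New benchmark or language: nothing to compare against.
      }
      var after = current.data[benchmark][language];
      var change = (after.avg - before.avg) / before.avg;
      if (change > tolerance) {
        regressions.push({
          benchmark: benchmark,
          language: language,
          change: change
        });
      }
    });
  });
  return regressions;
}
```

A pipeline step could call `findRegressions(baseline, current, 0.1)` and fail, or just post a comment, when the returned list is not empty. This only makes sense when both runs come from the same machine, which is what the self-hosted runner is for.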