people-os / support-shift-scheduler

Project for scheduling balena engineers for customer support
Apache License 2.0
34 stars 8 forks source link

Balena support shift scheduler

At balena, we practise support-driven development - you can read more about this philosophy in our Support Driven Development blog post from a few years ago. This means that we don’t outsource our customer support; it’s handled by our own engineers, who work from a wide variety of time zones, and across flexible working hours. We offer customer support 21 hours every weekday, from 6 am to 3 am London time (UTC+1 during daylight saving time, UTC otherwise), every week of the year.

This scheduling project enables us to schedule our engineers to cover our support hours, considering multiple factors, for example avoiding scheduling agents outside of their preferred hours. Hence the goal of the support scheduler: maximising support scheduling fairness and efficiency, while minimising pain. You can find a detailed discussion of the considerations relevant to the scheduler in our blog post titled The unreasonable effectiveness of algorithms in boosting team happiness. You will notice that the soft constraints as defined in ./algo-core/src/veterans.py have been redefined since the blog post, but the underlying principles are still the same.

The current version of the solver also includes the following functionality that was not recorded in the blog post:

The core of the algorithm is a constraint solver, and we currently use the Google OR-tools CP-SAT solver, which is well suited to scheduling optimisation.

Requirements

For local development, you need to Clone or download the repository to your local machine. You will need working installations of:

Then, you need to install the prerequisite modules by executing the following on your command line, from within the project's root directory:

# Install node modules, creating a node_modules folder:
$ npm install
# Install Python modules:
$ poetry install     

For balena team members

You will also need:

For assistance, please contact @AlidaOdendaal @gantonayde, @cmfcruz, or teamOS operations.

For the public

The explanation in this section is just for clarity; if you just want to test the scheduling algorithm, you can skip to the Usage section below.

This project makes use of the Google Sheets API to download input data, and the Google Calendar API to create calendar events, and hence needs Google authentication to be set up. If you'd like to set up a similar Google Cloud Project, you have to create a .env file in the project root directory, with GAPI_SERVICE_ACCOUNT_JWT set to the path to the JSON credentials associated with your Google Service Account.

You would also need to modify the code in ./lib/ and ./helper-scripts/download-and-configure-input.ts to make sure that the correct data is being downloaded from your Google Sheets, and configured correctly for the scheduler.

Usage

In this section, <supportName> indicates the relevant support channel, with currently supported values being balenaio, devOps, productOS and supportAuditing.

1. Configure Google Sheet input

For balena team members

In the Full Team Model Google Sheet:

In the Teamwork Model Google Sheet:

Open the Google Sheet titled <supportName> Support Scheduler Logs, navigate to the sheet name <next-Monday-date>_input, and confirm that the full agent list appears there.

2. Downloading and configuring the algorithm input

For balena team members

From the project root directory, run:

$ npm run download-and-configure-input $startDate $supportName

This script will download the availability of each support agent for this cycle (compiled from working hours, time zones, time-off data, existing calendar appointments and possible opt-outs, and including e-mail addresses, teamwork balances and shift length preferences). It will create a JSON input object for the scheduling algorithm. This JSON object is validated against the json input schema, and then stored in the file ./logs/<startDate>_<supportName>/support-shift-scheduler-input.json .

For the public

Since you do not have access to our private Google Spreadsheets, an example JSON input file has already been created for you, to enable you to do a test run of the algorithm. It is located under ./logs/example/support-shift-scheduler-input.json.

Then, for everyone

The JSON input object thus created has two main properties:

For more detail regarding these options, as well as the rest of the input file structure, see the associated json input schema.

3. Creating input files for onboarding

Manually

If there will be new team members onboarding to support in the week to be scheduled, you have to create the following 2 text files in the ./logs/<startDate>_<supportName>/ folder:

  1. onboarding_agents.txt: A list of Github handles for the onboarding agents.
  2. mentors.txt: A list of Github handles for the onboarding mentors.

In each of the files above, each handle should start with @, and each handle should be on a new line.

Automatically

You can also try to generate the files automatically with the following command:

$ npm run check-for-onboarding $startDate $supportName

This script will look for a Google spreadsheet whose id is specified as the value of onboardingSheet in helper-scripts/options/$supportName.json. Next the script will try to find a tab called $startDate. If it finds one, the data after the first row will be used to generate the mentors.txt and onboarding_agents.txt files from the first and second column, respectively. Note that the handles in the spreadsheet should not contain @, the script will add the prefix automatically.

4. Running the scheduling algorithm

While in the project's root directory on your local machine, run the following on the command line to activate the virtual Python environment:

$ poetry shell

From within the relevant ./logs/<startDate>_<supportName> directory (or ./logs/example if you are using the example data), launch the solver with:

$ python3 ../../algo-core --input support-shift-scheduler-input.json

Upon completion, the algorithm will write the optimised schedule to the file support-shift-scheduler-output.json (after validating against the json output schema).

If the Solution type is OPTIMAL, it means that the solver has determined this to be the solution with the lowest possible cost ("pain") value given the defined parameter space. If the Solution type is FEASIBLE, it means that this solution is the best one the solver could find given the set optimisation timeout.

5. Beautifying the schedule

$ npm run beautify-schedule $startDate $supportName

This script writes a formatted schedule to the file beautified-schedule.txt, which is a helpful view as a sanity check that the schedule is legitimate. The script also writes message text for our internal chat to the files markdown-agents.txt, which prompts the scheduled agents to check their calendars after the Google Calendar invites have been sent, and markdown-onboarding.txt, which alerts the onboarders and mentors to the onboarding shifts.

If, for some reason, the schedule needs to be modified, it should be edited directly in support-shift-scheduler-output.json, after which the beautify-schedule script should be rerun as above to update the text files.

6. Sending the calendar invites

For balena team members

From the project root directory, run:

$ npm run send-calendar-invites $startDate $supportName

to write the finalised schedule to the relevant Google Calendar, sending invites to all the associated agents.

7. Setting the victorops schedule

For devOps team members

From the project root directory, run:

$ npm run set-victorops-schedule $startDate $supportName

to set the scheduled overrides in victorops.

8. Notify the team in Zulip

For balena team members

Built with

This project makes use of:

License

This project is licensed under the terms of the Apache License, Version 2.0.