2i2c-org / team-compass

Organizational strategy, structure, policy, and practices across 2i2c.
https://compass.2i2c.org

Define a product team and practice #570

Open choldgraf opened 1 year ago

choldgraf commented 1 year ago

Context

Currently, our engineering team is focused on Site Reliability Engineering. It deploys open source tools into production for the services we run, but it does not focus on building new tools and end-user enhancements. This is because our primary strategic priority is to build a robust, scalable, reliable, efficient service using pre-existing open source tools.

However, there are many cases where new development would be impactful for our communities, and some cases where it has been explicitly requested or where we've agreed to it in contracts. We are currently doing this work in an ad-hoc fashion with bits and pieces of the SRE team's time, but this is not sustainable or scalable.

- [ ] Decide on the major questions 2i2c should answer to decide whether to engage in product development
- [ ] Make a decision on a short-term and long-term plan for product development
- [ ] ...define next steps based on the above

Proposal

We should define what a "Product development team" looks like at 2i2c, how it relates to the Site Reliability Engineering team, and how we'll bring new development activities into our practices. This might mean defining different management / reporting structures, new kinds of team roles with different skillsets, and new sustainability models for this team.

Our goal should be to have the team structures, processes, and policies to sustainably and efficiently develop and improve technology that our SRE team can deploy into production.

A few things to think about:

Examples of our own or other open source projects that need more product management / development

damianavila commented 1 year ago

These are particularly important points. A "rotation" between the SRE and Product teams would expose our engineers to different kinds of work, so they are neither burnt out nor bored from always doing the same thing. It would also give them early exposure to Product work, so they can deploy it efficiently when they are wearing the SRE hat.

choldgraf commented 1 year ago

@damianavila I totally agree - here's a nice chapter on team overload in the Google SRE guide.

I think there are two parts to this:

  1. We need to have the capacity to constantly re-invest in our own SRE processes to reduce operational overload. Google's SRE docs have a nice page about this.
  2. We need to have team processes that intentionally move people into and out of a "high stress" aspect of the job. I think a great first step is what you describe: have a development team that can operate more at its own pace, an SRE team that is a bit more "reactive" to outages etc, and a way to move folks back and forth between them.

damianavila commented 1 year ago

From the last comment in https://github.com/2i2c-org/meta/issues/365#issuecomment-1488552566:

The biggest remaining question for 2i2c is: what role do we want to have in Executable Books, given that I am one of its PIs (and the perceived lead of the project)?