w3ctag / process


Scaling design reviews #34

Open LeaVerou opened 8 months ago

LeaVerou commented 8 months ago

This came out of a conversation I had with a couple of OpenJS folks, which started from this statement by @michaelchampion:

IMHO the TAG created its own scalability problem by switching from general architectural guidance / Findings to reviews of individual specs along with very abstract work on things like the Privacy Principles / Ethical Web Principles.

It is very clear that right now, we have a process problem. We simply cannot work through the volume of design reviews we are receiving in a timely manner and quality has been slipping.

Here are some issue closing stats for the design-reviews repo:

[chart: issue-closing times for the design-reviews repo]

Each design review takes months to finish. Months! Some even take close to a year or more (the so-called abyss).
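(As an aside, for anyone who wants to reproduce numbers like these: below is a minimal sketch that pulls closed issues from the design-reviews repo via the GitHub REST API and computes a median time-to-close. The repo name comes from this thread; the script itself and the choice of statistic are illustrative assumptions, not how the chart above was actually generated.)

```ts
// Illustrative sketch only (not the script behind the chart above):
// fetch closed issues from the design-reviews repo via the GitHub REST API
// and report how long they stayed open. Requires Node 18+ (global fetch).

const REPO = "w3ctag/design-reviews";

interface Issue {
  created_at: string;
  closed_at: string | null;
  pull_request?: unknown; // present only for pull requests, which we skip
}

async function fetchClosedIssues(): Promise<Issue[]> {
  const issues: Issue[] = [];
  for (let page = 1; ; page++) {
    const res = await fetch(
      `https://api.github.com/repos/${REPO}/issues?state=closed&per_page=100&page=${page}`,
      { headers: { Accept: "application/vnd.github+json" } },
    );
    if (!res.ok) throw new Error(`GitHub API error: ${res.status}`);
    const batch = (await res.json()) as Issue[];
    if (batch.length === 0) break;
    issues.push(...batch.filter((i) => !i.pull_request));
  }
  return issues;
}

async function main() {
  const daysToClose = (await fetchClosedIssues())
    .filter((i) => i.closed_at !== null)
    .map((i) => (Date.parse(i.closed_at!) - Date.parse(i.created_at)) / 86_400_000)
    .sort((a, b) => a - b);

  const median = daysToClose[Math.floor(daysToClose.length / 2)];
  console.log(`closed design reviews: ${daysToClose.length}`);
  console.log(`median days to close: ${median.toFixed(1)}`);
}

main();
```

(An unauthenticated run is fine for a one-off; adding an Authorization header with a personal token avoids rate limits if you re-run it often.)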

On the other hand, we don't want to revert to the pre-design-reviews work mode. It’s very hard for abstract work to be grounded in reality without getting your hands dirty with at least some of the more concrete work. It is far more natural for humans to extrapolate specific data points into the bigger picture than to come up with the bigger picture upfront.

However, at this point we have shifted all the way to the other end, and many people both within and outside the TAG see design reviews as the primary deliverable and principles work as a secondary one. IMO the primary deliverable should be the abstract work, with the primary purpose of design reviews being to inform and test it. Design reviews are transient and do not scale, while principles are meant to last and to simplify future design reviews. It's all about prioritizing short-term vs. long-term gain.

It doesn't help that right now design reviews give the TAG more legitimacy in the eyes of both major stakeholders and the public than the abstract work does. People understand what design reviews are, and having TAG review as part of the Blink shipping process gives the TAG a lot of legitimacy.

But the reality is… it doesn’t scale. And what do we do when we're spread too thin? Delegate! In theory, with a set of principles that is sufficiently comprehensive, design reviews become a matter of identifying relevant principles, which requires far less expertise than authoring the principles.

Therefore, I wonder if a more scalable plan might be for the TAG to try to outsource design reviews to a broader group, and focus on principles work. To ensure quality, these outsourced design reviews would need to be grounded in principles, and we could always step in when really needed. Also, every time the broader group is unsure about something, they would bring the question to the TAG and it would inform principles work. But even if the TAG still had to sign off on every design review, signing off is far quicker than performing the review itself.

There is a bit of a chicken-and-egg problem: having such a comprehensive set of principles is a pipe dream unless we can free up time to make principles work a priority. So I imagine this would have to be more of an iterative process, especially at first.

One current issue is that design reviews do not always get attention in all areas (API design, security, a11y, etc.). Ideally, each volunteer would have expertise in one or more areas, and at least one volunteer from each area would need to weigh in. I could be wrong, but I think there are many people in the community that would be delighted to help out.

@tobie suggested task forces, and he could probably discuss the merits of that idea better. I had something less structured and more agile in mind, more like crowdsourcing with some amount of vetting.

I know this is a lot, and what I’m proposing is a big change, but without a substantial change, I worry a lot about the future of the TAG altogether. 😕

michaelchampion commented 8 months ago

I could be wrong, but I think there are many people in the community that would be delighted to help out.... crowdsourcing with some amount of vetting.

I agree with @LeaVerou: there are a lot more people with skills in specific areas who might serve on a crowdsourced review team than there are people with broad enough skills (and the time/travel support) to be on the TAG.

The AB does something similar by crowdsourcing the detailed work on the Process document to the Process CG, but retaining "strategic" control over the direction / priorities, and approving the result before it goes to the AC for review. Since I retired, they've done something similar with the CEPC/CoC. I suspect the TAG would have more success attracting "crowds": there are more people who care about applying web architecture / API design issues to the real world than there are folks who care about the finer points of the W3C Process.

tobie commented 8 months ago

My suggestion was for the TAG to delegate technical reviews to dedicated task forces, which would be lightweight in process and tightly scoped (to review a single spec, or a group of specs). These task forces would make the assessment, collaborate with the editors/WG, and come back to the TAG with recommendations. The TAG would then formalize those recommendations.

I don't think this approach precludes a more crowd-sourced solution; quite the contrary, actually. We could imagine that the task forces could be picked from a broader roster of community experts, or that both the TF and the broader community could collaborate on reviews.

This slightly more structured approach does have benefits here, however, because:

  1. There’s legitimacy carried through delegation and formal approval of the TF’s recommendations.
  2. Being formally selected to participate in a TF carries some level of recognition which can help justify the time investment to internal stakeholders.
  3. In an ever-changing technological landscape, the TAG's principles won’t ever be explicit and comprehensive enough to be applied directly to every new case. Indeed, if that were ever the case, the whole point of a TAG review would essentially become moot. So some TAG vetting of the recommendation will almost always be required.
  4. This system provides additional flexibility by guaranteeing good representation at the TAG level through elections and domain expertise at the review assessment level through selection and crowdsourcing.

The Open Source Initiative has a similar process to certify software licenses based in part on a set of principles (the Open Source Definition).

When someone asks for a review of a license, there's an open process with some back and forth between community members, the license committee (which would be a dedicated task force in our case), and the author or submitter of the license. At some point, the committee makes a recommendation to OSI's board (this would be the TAG for us), which then decides to approve or reject the license. The license committee is composed of domain experts (copyright lawyers), and the Board is community-elected. So you get a proper balance between expertise and representativeness.

hober commented 8 months ago

One of the issues currently is that design reviews do not always get attention in all areas (API design, security, a11y, etc.).

It's almost like we need dedicated groups doing reviews in each core design area. "Horizontal" reviews, if you will.