Define the structure and location of architecture docs

sploiselle commented 7 years ago

The goal of this RFC is to come to a decision about where CockroachDB's architecture documentation should live. Thoughts on this are located in the Site Structure section below.

I've assigned folks who've expressed opinions about documentation structure. There isn't an immediate need for comment because the content isn't prepared yet, but I would like to start building this out next quarter.

Motivation

CockroachDB needs architecture documentation to answer the question, "How does it work?" This is a drastically different question than "How do I use it?" which the current documentation answers very clearly.

Given that NewSQL is a burgeoning technology, many people are unfamiliar with its ins and outs; rather than leaving them to infer how things work, it's important to provide canonical answers and educate users.

Audience

People who need architecture documentation are typically making high-impact decisions about application development, whether they're one-person development shops or working at large corporations. This means that the audience for these documents is primarily someone who needs to understand how CockroachDB works in order to:

Vet the concept of the technology against their own understanding
Understand CockroachDB's complicated set of component interactions to help them use the product themselves (and potentially troubleshoot their own issues)

This content can also serve the purpose of making sure a group of developers share the same correct understanding of how CockroachDB works if they plan to develop against it (i.e. training for adopters).

Content

I've identified the following topics that will need to be covered (this list is not exhaustive):

Phase 1

The first phase of the architecture documentation will consist of an overview of CockroachDB, as well as the following section.

Layers

SQL API
Transactional KV Layer
Replication KV Layer
Distributed KV Layer
Storage Layer

Phase 2

The next phase will cover how the architectural layers work together to provide the following benefits.

Benefits

Multi-Active Availability
Scalable SQL

Phase 3

The third phase is an incremental, iterative approach to provide individual, cross-linked pages for the following (not exhaustive) list. These pages will provide much greater detail than the layer page in which they belong, as well as provide better targets for users' search.

Components

Nodes
Postgres wire protocol
SQL Parser, Planner, Executor
DistSQL
Relational Structure
Monolithic Map
HLC
Ranges
- Replicas
Raft
Leases
Gossip
Zone Configs
Storage
- RocksDB
- MVCC

Processes

Transactions
- Contentions
Rebalance & Repair
Schema Change
Interleaved Tables
Enterprise Backup & Restore
GC

Phase 4

The fourth phase will tell the story of what CockroachDB does from an architectural perspective in the following scenarios.

Scenarios

DML
Scaling a cluster
Adding replication zones
Machine failure
Datacenter failure
Rolling Upgrades

Structure

Competitive Analysis

Most other database technologies have documentation with a similar aim, but fall all over the map in terms of execution:

Elasticsearch has an entire book on the topic (large detail/complex structure)
MemSQL has a single page
MongoDB's architecture page acts as its product page, w/o actual architecture (low detail/simple structure)
Cassandra on DataStax has its own dir (large detail/complex structure)
Cassandra on Apache site just has a section, but there is really not very much there (low detail/simple structure)
Kafka architecture/design is just a section, but again, not much there (med detail/simple structure)

Inference: If we want to have a substantial amount of detail about the architecture of CockroachDB, we need to allow for a complex structure. To meet the needs of our intended audience, we need to include a lot of detail.

Proposal

General Approach:

Create one overview page that touches on the overall design of CockroachDB, focusing primarily on describing its layers and benefits
Create a page that details the architecture of each identified element, with heavy use of cross-linking to other pages

Site Structure (Needs Decision Made)

This content should be structured in a way that separates it from the user-facing documentation because it serves a very different intent/audience. There are two ways of achieving this:

Place all of the content in a subdirectory of the existing docs directory (cockroachlabs.com/docs/architecture)
Place the content in a main directory (cockroachlabs.com/architecture)

Each approach has pros and cons:

Subdirectory
- PROS
  - Easy to place in repo
  - Benefits from versioning
- CONS
  - Complicated search logic
  - Expands number of pages that must be versioned
Directory
- PROS
  - Can have its own sidenav
  - Can easily have its own style to signify it's different from user docs (e.g. a different header color)
  - Simplified search logic
  - Decreases likelihood of users inadvertently finding themselves in architecture docs
- CONS
  - Greater maintenance overhead
  - Possible that it wouldn't share includes with docs repo
  - Requires duplicate versioning logic or doesn't benefit from versioning work

General Search Considerations

You should be able to search only the architecture documentation, or reliably promote its content to the top of a query
By default, search should strongly prefer user docs content over architecture content
Making the search feature more prominent in the design will also help ensure users find the right thing
When browsing, visual indication that the content is about the architecture (and not using CockroachDB) is important to prevent confusion/attrition.
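
As a rough illustration of these considerations, here is a minimal sketch of how a distinct URL prefix could drive both scoped search and a visual label. Everything in it is an assumption made for illustration: the `/docs/architecture/` prefix (the location is still undecided), the function names, and the "Architecture:" label text. It is not a description of how the site's search actually works.

```python
from urllib.parse import urlparse

# Hypothetical prefix; the RFC leaves the final location undecided.
ARCHITECTURE_PREFIX = "/docs/architecture/"


def is_architecture(url):
    """Return True when the URL path falls under the architecture prefix."""
    return urlparse(url).path.startswith(ARCHITECTURE_PREFIX)


def rank_results(urls, scope="all"):
    """Filter results by scope, then list user docs ahead of architecture pages.

    scope is "all", "docs" (user docs only), or "architecture".
    sorted() is stable, so the incoming relevance order is kept within each group.
    """
    if scope == "architecture":
        urls = [u for u in urls if is_architecture(u)]
    elif scope == "docs":
        urls = [u for u in urls if not is_architecture(u)]
    # False sorts before True, so user docs come first.
    return sorted(urls, key=is_architecture)


def label(url, title):
    """Prefix architecture page titles so readers can tell the two sets apart."""
    return f"Architecture: {title}" if is_architecture(url) else title
```

For example, `rank_results(results, scope="architecture")` would return only pages under the assumed prefix, while the default scope keeps every result but places user docs first.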

jseldess commented 7 years ago

Thanks for planning this out, @sploiselle! All the topics listed are great, but it's unclear to me what the basic outcome looks like. I like your idea to start with a "high-level architecture" page, where we can cover the abstraction layers and introduce physical components. Beyond that, though, I think it'd be most effective to start with a structure similar to how we describe the architecture at tech talks (e.g., the "How we do it" section of Spencer's recent talk), ideally one per page, but going much deeper than those slides:

Architecture Overview
Data Distribution
Consensus Replication (including fault tolerance)
Distributed Transactions
Simple Deployment (including rebalancing)
Online Schema Changes

We may need to break things out into more modular pages eventually, but starting with a simple topic structure feels intuitive and effective enough, especially since our Q2 aim is to get as much of this started as possible. We can iterate.

For placement, I don't have strong opinions. I think this content could work well either as a new section in the docs sidenav or as a higher-level division in the docs (like what Kubernetes does for Concepts, Tasks, Samples, etc.). We may want to do some light-weight user testing/interviews, with mocked up versions.

For search, I'm not as convinced that it's a good idea to cordon off architecture from product usage, at least not to start. Perhaps a visual indication of the type of documentation in the search results would be helpful, though.

Interested in your and others' thoughts here. Excited that we're getting close to writing the content!

bdarnell commented 7 years ago

I vote for a subdirectory of the docs.

I'd lean towards starting with a relatively linear walk through the layers (either top-down or bottom-up) before we start building out a nest of cross-linked pages. Some of the proposed sections ("processes" and "scenarios") don't really seem like "architecture" to me but we do need to document these things somewhere.

For search, I think it's important that SQL keywords pull up the docs page that talks about that statement/expression. For other queries I'd be OK with letting Google's search just do its thing until we see ranking problems (I expect the docs will naturally outrank the architecture docs).

sploiselle commented 7 years ago

@jseldess I think Ben's suggestion of starting with each layer aligns well with your suggestion; in Spencer's talk they're almost one-to-one. I'll rework the RFC to improve the communication of what these stages are.

I also didn't say that the two sets of content should be cordoned off from one another w/r/t search. Saying "cordoned off" was poor wording when I simply meant that the URL structures need to be distinct. This will provide us the ability to selectively search one set of content or the other (or both). I do think it's important that we provide some kind of visual indication as to which set of content you're looking at. This could be something as lightweight as a label like "Architecture: [Topic Name]."

I'll also rework the RFC to make this expectation more prominent, as well as remove the language around cordoning off any content from anything.

@bdarnell I like the idea of starting with each of the distinct layers. I'll start with that framework.

For the processes and scenarios, I think it's important that users understand how those work from an architectural standpoint, both for completeness' sake and because scenario-based learning helps contextualize abstract concepts with more tangible actions, making them easier for people to learn.

jseldess commented 7 years ago

Thanks, @sploiselle. I like Ben's suggestion, too. And sorry for misunderstanding the search thing. I see what you mean about the benefit of a distinct url structure.

sploiselle commented 7 years ago

@jseldess 👍 –– let me know if the new phasing sections of the Content section help clarify what the work products look like or if you've got feedback there.

jseldess commented 7 years ago

Those phases look great. Thanks for the update, @sploiselle!

jseldess commented 7 years ago

For now, we've decided to add Architecture as a new section in the main sidenav. For 1.2, we'll likely have a distinct, separate space for this and other "concept" documentation.
