hicommonwealth / commonwealth

A platform for decentralized communities
https://commonwealth.im
GNU General Public License v3.0

Develop a philosophy and flow for our internal docs #4353

Open gdjohnson opened 1 year ago

gdjohnson commented 1 year ago

Description

Separating this from the migration ticket, #3379, since I see this task as more of a slow-burning, long-term project. This ticket's comment section will allow us to keep track of our evolving thoughts on docs philosophy and processes.

What are some key areas that need to be thought through?

Process:

  1. Process for adding, editing, reviewing, and approving docs. What are our expectations for devs and chroniclers, in terms of documenting new functionality, codebase changes, etc?
  2. Cycles and routines of periodically reassessing and reauditing doc quality.
  3. Long-term self-maintenance and sustainability. This one is least pressing, but unless we always want (or are financially able) to keep an archivist around, we should begin thinking about the sorts of sustainable, developer-driven organizational structures/incentives that will prevent documentation decay.

Content:

  1. Doc entry templates, formatting, and layout.
  2. Macro organization of files and entries (directory structures, inter-linking, tagging).
  3. Balancing timely and timeless documentation styles.

Relationship with rest of codebase:

  1. Means of linking between docs and code entries, so that the two stay tethered, and either one (e.g. the pertinent docs, or code area being documented) is discoverable from the other.
  2. The role of inline comment documentation, and how inline comment documentation relates to full-length doc entries.
  3. Relationship to unit tests.
gdjohnson commented 1 year ago

Previously

Moving context over from #3379.

The old Notion docs tracked document completeness/reliability—I think having some similar tagging system would be useful. We're pretty limited with pure Markdown files, unless we want to start inserting JS snippets to parse/filter files based on their contents—but we could at least have some metadata stored at the top which is visible/searchable.

What sort of metadata might be useful?

  1. DRI for the area, so team members working off incomplete or ambiguous docs know who to contact for clarification. (Note that in many cases, this may not be the individual responsible for the git blame, e.g. if I have worked with a DRI engineer to write & submit the documentation myself.)
  2. Latest verified date that the docs are relevant to. Again, this may not be the same as timestamps obtained through git blame—someone may update a small section of an entry without auditing and verifying the entire file for accuracy/relevance.
  3. Completeness/quality indicator. In the old Notion docs, we used red/yellow/green: red meaning docs that are seriously incomplete or out of date, yellow meaning partial and useful but in need of attention, and green meaning verified (e.g. by a DRI, the Archivist, or a team lead) as up-to-date and basically complete.
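
For concreteness, here's a sketch of what such a header could look like as YAML front matter at the top of a Markdown entry (the field names are hypothetical, not a settled schema):

```markdown
---
# Hypothetical metadata header; fields mirror the list above
dri: Graham Johnson        # point of contact; not necessarily the git author
last-verified: 2023-09-03  # last date the entire entry was audited
status: yellow             # red / yellow / green, as in the old Notion docs
---

# Entry Title
...
```

Even without any tooling, fields like these stay visible at the top of the file and are grep-able, which covers the "visible/searchable" requirement.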

Why keep track of documentation status?

  1. To track which docs are in poor shape & need love. Doubles as a call to action: engineers using the docs can quickly and synoptically see which doc entries need work, which should hopefully inspire some hole-filling in our knowledge base.
  2. To give engineers reading from the docs some context as to the quality of the docs they're dealing with. An engineer might spend an hour struggling to interpret documents that have fallen out of date since time of writing, or which were put together in a slap-dash fashion and are now being (mistakenly) treated as authoritative.
  3. To give team leaders a sense of doc entry quality before they pass off work to e.g. contractors. If some project-relevant area of the codebase is underdocumented, that can be quickly noticed and addressed, rather than the manager having to personally assess doc quality in toto.

And @CowMuon writes in response:

> One problem with tracking completeness is how is that measured? We don't have that defined, so we would first have to determine what we even mean by that. A second concern is that the previous DRI approach did not result in dev docs matriculating in completeness (from red to amber to green), but mostly in a sprawling collection of stub dev docs labeled with red or amber for "nothing here" and "not much here" respectively. A third question would be around the usefulness of the legacy approach to dev docs, which AFAICT was never used by engineers to learn how the system works, either to grow knowledge generally, or even in particular cases of an engineer being blocked and

gdjohnson commented 1 year ago

Continuing the conversation from #3379.

@CowMuon I think the situation is meaningfully different now than it was when I established the Notion wiki, and different too from when you inherited it.

The color-demarcation system (red/yellow/green) was built to serve two functions. First, it was envisioned as a way for engineers to specifically flag (along with comments) stale or incomplete docs, which they came across organically in the course of using the docs, using an easy-to-tag, readable-at-a-glance color system. Second, I imagined it as a way to take an "agile" approach to docs, where I could publish documentation for some part of a component/file, and flag it as incomplete so I could come back to it later.

This system failed, IMO, because of our overwhelming push for new feature development at that time—not because of the intrinsic/formal features of the system itself. After setting up the initial docs system—on the premise that I would be able to devote 1 day of my work week to maintaining/expanding it—this maintenance/expansion time allocation never materialized, and there was no one around to freshen up stale docs, or add missing sections to incomplete entries. (This was in large part my fault, for not advocating strongly enough for its importance, but we were juggling many priorities...) The Notion docs system also failed, I think, because a large number of stubs were added (not by me) without any intent or possibility of their ever being built upon. My new presence as a dedicated docs-handler, in addition to our new PR system for approving all new docs additions (including any stubs), should markedly ameliorate both of these problems.

I'm not attached to the legacy system, and am happy to go other directions. But these are my two takeaways: (1) I don't think that formal features of earlier systems should take the blame for what was a failure of resource allocation. (2) An overwhelmingly repetitive takeaway of my research into docs systems—both best practices/failed approaches at major software companies, and also my team-specific one-on-ones this past week—has been that doc staleness is one of the biggest issues a successful documentation system needs to solve.

We could try debundling the affordances of a status tracker (e.g. R/Y/G), and solving for them independently. I could (1) set up a system for engineers to report doc problems + codebase areas in need of documentation; (2) have some entry-scoped, minimum viable staleness tracker, like a "last certified fresh" timestamp; (3) keep track of desired changes/updates to various incomplete entries, so that, per request, I can iteratively publish docs in an agile way, rather than waiting until they're perfect & fully complete.

I'll continue my research into how other software teams handle this, and can continue pitching alternative systems. But these are the problems in need of solving, as I see them.

dillchen commented 1 year ago

Preface: If this is not the place for this comment, please delete it.

I think this thread and set of thoughts are great and true. I just wanted to add two questions here:

  1. What does success look like? A few boxes / quantifiables instead of tradeoffs would be helpful for me to know (for example, "if our documentation is good, we should expect that it's able to resolve some % of tickets / questions without needing intervention," "it should require X amount of time to keep in line and maintain," "it should be maintainable / fit into existing workflows for engineers"). These are a bit more strongly stated than the Process section you've outlined above.
  2. Second and larger in scope: as research is done, I strongly desire examples that are specific to web3, and from an ecosystem that does this explicitly (i.e. Obsidian). We are already open-source, in name, and documentation is a very important surface area for other potential contributors (beginner -- tutorial level, intermediate -- writing plugins, expert -- proposing core changes). So it's worthwhile to consider those personas, as it's not "just" internal documentation.

gdjohnson commented 1 year ago

Good questions and yes, this is the place to have these sorts of conversations!

I think there are some open questions still about our inline documentation system (most likely TSDoc), but we have a firmer grasp on the wiki entries right now, so I'll focus on those.
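
(For reference, since TSDoc comes up here: an inline TSDoc comment looks roughly like the following. The function is a made-up illustration, not code from our repo.)

```typescript
/**
 * Computes a member's weighted vote value.
 *
 * @param balance - The member's token balance
 * @param multiplier - Community-configured vote weight multiplier
 * @returns The weighted vote value
 */
function getVoteWeight(balance: number, multiplier: number): number {
  return balance * multiplier;
}
```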

> What does success look like? A few boxes / quantifiables instead of tradeoffs would be helpful for me to know (for example, "if our documentation is good, we should expect that it's able to resolve some % of tickets / questions without needing intervention," "it should require X amount of time to keep in line and maintain," "it should be maintainable / fit into existing workflows for engineers"). These are a bit more strongly stated than the Process section you've outlined above.

I think quantifying "time saved" may not be possible, unfortunately. We'd have to do a really thorough audit of new engineers' onboarding processes (& of the experience of outside contributors), and I'm not sure our sample size is large enough to get an answer over a timeline quicker than 12ish months. It might not be a bad idea to start keeping track of eng question-asking and blockers (e.g. new engineers reaching out to more experienced engineers/team leads, asking open questions in Slack channels, getting stuck on assigned tickets due to lack of knowledge/code legibility). But I think that will be a slow process of figuring out how to do it properly—and there is of course the cost in eng & documentarian time to tracking this info.

As for maintenance, the hope is that engineers will only have to update docs in two situations: (1) they have, in the course of working on a ticket, realized that existing documentation is insufficient; in the process of gathering the information necessary to complete their ticket, they also store it in our Wiki, and the documentation is included in their PR. Or (2) in the course of working on a ticket, they realize that existing documentation will be made obsolete by their changes, and update the obsolete sections accordingly. I've been broadcasting an open offer to engineers that I'll happily hop on a 1-on-1 anytime they'd like help writing documentation. So, to answer your question (in, unfortunately, a qualitative and not a quantitative way): I think our documentation system will be successful if these two cases are, broadly speaking, the only times documentation requires engineer time.

> Second and larger in scope: as research is done, I strongly desire examples that are specific to web3, and from an ecosystem that does this explicitly (i.e. Obsidian). We are already open-source, in name, and documentation is a very important surface area for other potential contributors (beginner -- tutorial level, intermediate -- writing plugins, expert -- proposing core changes). So it's worthwhile to consider those personas, as it's not "just" internal documentation.

Good to know re: creating doc surface area for outside contributors. I'd be very curious to hear @CowMuon's thoughts on macro priorities. If we do decide to begin prioritizing this additional surface area, a good first step might be auditing opened issues, and looping me into any channels/conversations where outside contributors reach out to our team for help & guidance. I'm just not very exposed to that side of Common, so I'm not sure what our needs are.

dillchen commented 1 year ago

"I think quantifying "time saved" may not be possible, unfortunately. We'd have to do a really thorough audit of new engineers' onboarding processes (& of the experience of outside contributors), and I'm not sure our sample size is large enough to get an answer over a timeline quicker than 12ish months. It might not be a bad idea to start keeping track of eng question-asking and blockers (e.g. new engineers reaching out to more experienced engineers/team leads, asking open questions in Slack channels, getting stuck on assigned tickets due to lack of knowledge/code legibility). But I think that will be a slow process of figuring out how to do it properly—and there is of course the cost in eng & documentarian time to tracking this info."

That seems fair. On this, I have one item / suggestion / thought: ideally, a special-purpose bot (similar to this) link. Happy to add more on this, but very vaguely: manually editing is frictionful. Ideally the flow would be to mention @docsbot in Slack; it would respond and, if necessary, update things in the docs (submitting a PR), which could then be reviewed by you or other Code Owners. Ideally it would also keep track of repeat requests, and more.

dillchen commented 1 year ago

On this point

> Good to know re: creating doc surface area for outside contributors. I'd be very curious to hear @CowMuon's thoughts on macro priorities. If we do decide to begin prioritizing this additional surface area, a good first step might be auditing opened issues, and looping me into any channels/conversations where outside contributors reach out to our team for help & guidance. I'm just not very exposed to that side of Common, so I'm not sure what our needs are.

I think for now, it's out of scope; internal dev team documentation is an "MVP" of sorts. I would not want to add this to the current scope, but rather treat it as a second "stage" of the project, once the docs, process, and maintenance workflows are established internally.


Very related to this: Farcaster does a great job with conversations around "protocol" or "domain level" improvements. See here for example -> https://github.com/farcasterxyz/protocol/discussions/categories/fip-stage-4-finalized. Of course, Ethereum is the gold standard here; see EIPs.

As we work on the 2.0 Workstream, there are places where questions arise around interfaces, and concretely memorializing changes to the interface and domain would be desirable.


That said, this is again "out of scope," but worth keeping in mind what's beyond.

gdjohnson commented 1 year ago

> That seems fair. On this, I have one item / suggestion / thought: ideally, a special-purpose bot (similar to this) link. Happy to add more on this, but very vaguely: manually editing is frictionful. Ideally the flow would be to mention @docsbot in Slack; it would respond and, if necessary, update things in the docs (submitting a PR), which could then be reviewed by you or other Code Owners. Ideally it would also keep track of repeat requests, and more.

I've been looking into tools that link documentation files to documented code files, so that if code files change, the relevant documentation is flagged for updating. I think that could help us track things.
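
The core of such a linkage check can be quite small. Here's a rough sketch, not a specific tool (the docs-map.json mapping file, its format, and the paths are all hypothetical): a CI step reads a checked-in mapping from code paths to doc entries, and flags any change set that touches mapped code without touching the corresponding docs.

```typescript
// check-docs.ts: sketch of a CI check that flags code changes
// which don't touch their linked documentation.
// Assumes a checked-in docs-map.json like:
//   { "server/routes/": ["wiki/API.md"], "scripts/": ["wiki/Package-Scripts.md"] }
// Changed file paths are passed as CLI arguments.
import * as fs from "fs";

const docsMap: Record<string, string[]> = JSON.parse(
  fs.readFileSync("docs-map.json", "utf8")
);
const changed = process.argv.slice(2);

const flagged = new Set<string>();
for (const [codePath, docs] of Object.entries(docsMap)) {
  const codeTouched = changed.some((f) => f.startsWith(codePath));
  const docsTouched = docs.some((d) => changed.includes(d));
  if (codeTouched && !docsTouched) docs.forEach((d) => flagged.add(d));
}

if (flagged.size > 0) {
  console.warn("Code changed without updating linked docs:");
  flagged.forEach((d) => console.warn(`  - ${d}`));
  process.exitCode = 1; // could also be warn-only, if a hard gate is too strict
}
```

A CI job could invoke this with something like `ts-node check-docs.ts $(git diff --name-only master)`.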

In terms of non-manual updating, our biggest failure mode historically with docs is that they are low quality, and become more work & clutter to use & navigate than they save. On that note, my biggest priority with docs is making sure that they're consistently excellent, and that there is no fragmentary or poorly written documentation whatsoever in our wiki. I'm working on auditing & updating our existing docs under that banner.

gdjohnson commented 1 year ago

x-posting Jake's comments from #4736 discussion:

> I would like us to approach a standard for locating documentation, as well as just when to update it. My opinion is that, in the long term, docs should live alongside relevant code, rather than in a separate folder. However, I would also like to get some insight on what best practices are here. We may want to look at how GitHub internally does documentation and see what we can copy from their practices.

gdjohnson commented 1 year ago

One open question on my mind right now is how to handle our many package ReadMes.

Do we duplicate these ReadMes inside the Wiki? This violates DRY and makes our docs prone to falling out of sync. (Slash it's just plain annoying to keep two copies updated.)

Do we keep these ReadMes where they are, and add pointers from our Wiki? I don't hate this idea—we could simply add a TOC section for Package ReadMes, and link from there.
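
For instance (paths hypothetical), the TOC section could be as simple as:

```markdown
## Package ReadMes

- [commonwealth](../packages/commonwealth/README.md)
- [chain-events](../packages/chain-events/README.md)
```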

What is the boundary between "information that belongs in a ReadMe" and "information that doesn't"? Right now, our Commonwealth package ReadMe has references to installing and starting the app, but also covers basic database commands (which are also documented in our in-progress Package-Scripts.md entry), as well as a list of .env variables, linter instructions and frontend code style, configuring custom domains, using Datadog and Heroku, setting up local testnets, etc.

Some of this information isn't even documented anywhere in our main wiki yet!

This ties directly into @dillchen's comment about our two docs audiences: internal team + outside contributors. I've generally gotten the sense that our ReadMe primarily targets outside contributors, whereas our internal docs (although accessible to outside contributors) are internally oriented. But we are better off shrinking down the ReadMe, and linking out from it to more in-depth documentation on the Wiki.

gdjohnson commented 1 year ago

Going to use this space to think aloud about our "Certified fresh" system, and metadata generally, since I am not satisfied with "Certified fresh" as a solution.

What are the goals of tracking metadata?

  1. Giving engineers some sense of how likely documentation is to be complete and up-to-date. It can be frustrating to work through a set of steps only to realize midway that they've been obsolete for over a year.
  2. Letting engineers know who they should reach out to when working with docs. (Either because they are editing/extending the documentation, or are actively using it and have run into ambiguities, or because the page is over a year old and they want to ensure it's still accurate information.)
  3. Allowing the documentation team to track which docs are "most precarious" so that they can be specifically targeted for audits. (E.g. a wiki entry that was successfully reviewed or followed by an engineer last week is less in need of audit than an entry that hasn't been reviewed in over a year.)

Why is git version history insufficient?

  1. Documentation may be reviewed and certified accurate/up-to-date by a member of the engineering or documentation team, in a way that does not show up in git history.
  2. The code author is not necessarily the documentation author.
    • Documentation migrated from GitHub Wiki will show as having been authored by the migrator.
    • Documentation may have been written collaboratively between two engineers, or between an engineer and member of the documentation team.
    • We have had several codebase-wide re-architectures, e.g. JSX-ification, Reactification, and migration to a new repo, each of which has destroyed or severely damaged the integrity of the code's git history.

Why is "Certified fresh" insufficient?

  1. In theory, all newly authored documentation must be reviewed and certified accurate when added to the wiki. However, we have a backlog of nearly 100 entries which were not submitted to this process. Labeling them "Certified fresh" would be misleading as to their true status, but providing no authorship or datestamp data whatsoever is hardly a better solution.
  2. Reviewers may only be able to attest to certain sections of the documentation.
  3. Reviewers who certify a page fresh will overwrite the original authorship, and engineers who go to use the docs may be misled as to who they should contact for clarifying information.

What we don't want: DRIs

Per previous conversation (incl. upthread) with eng leads, we are trying to avoid DRIs, which would be one way to deal with these problems.

Proposal: Change Log

In this model, we would retain the "Certified fresh" syntax as a label which an engineer or documentarian can add to docs that they have recently, successfully used or verified. However, this label would be one of several standardized labels that could be provided in an entry's change log along with a custom description.

Example Change Log

## Change Log
- 230903 Certified fresh by Graham Johnson
- 230812 Edited by Daniel Martins (Updating config steps in light of recent database changes.)
- 230705 Authored by Jake Naviasky
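
A side benefit: a standardized change log like this is machine-readable, so audit targeting (goal 3 above) could be semi-automated. A rough sketch, assuming every entry ends with `- YYMMDD Label by Name` lines exactly as in the example (the script and directory layout are hypothetical):

```typescript
// stale-docs.ts: list wiki entries whose newest change-log line is old.
import * as fs from "fs";
import * as path from "path";

const STALE_DAYS = 180; // arbitrary threshold; tune to taste

// Extracts the newest date from lines like "- 230903 Certified fresh by ..."
function latestLogDate(markdown: string): Date | null {
  const dates = [...markdown.matchAll(/^- (\d{2})(\d{2})(\d{2}) /gm)].map(
    ([, yy, mm, dd]) => new Date(2000 + Number(yy), Number(mm) - 1, Number(dd))
  );
  if (dates.length === 0) return null;
  return new Date(Math.max(...dates.map((d) => d.getTime())));
}

const wikiDir = process.argv[2] ?? "wiki";
for (const file of fs.readdirSync(wikiDir).filter((f) => f.endsWith(".md"))) {
  const latest = latestLogDate(fs.readFileSync(path.join(wikiDir, file), "utf8"));
  const ageDays = latest
    ? (Date.now() - latest.getTime()) / (1000 * 60 * 60 * 24)
    : Infinity;
  if (ageDays > STALE_DAYS) {
    console.log(`${file}: last change-log entry ${latest?.toDateString() ?? "none"}`);
  }
}
```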
gdjohnson commented 1 year ago

This styleguide is one of the better sets of recommendations for dev docs I've encountered.

Much of the advice agrees with our current approach, but bears reiterating:

> A small set of fresh and accurate docs are better than a sprawling, loose assembly of “documentation” in various states of disrepair. Write short and useful documents. Cut out everything unnecessary, while also making a habit of continually massaging and improving every doc to suit your changing needs.
>
> Change your documentation in the same CL as the code change.
>
> Your mission as a Code Health-conscious engineer is to write for humans first, computers second. Documentation is an important part of this skill.

See also its guide to ReadMes.

jnaviask commented 11 months ago

@gdjohnson what's the status on this ticket?

gdjohnson commented 11 months ago

@jnaviask This is an ongoing project that I work on whenever I have breathing time between higher-priority tickets. It basically involves researching existing doc systems/best practices, and trying to think through what we're optimizing for with documentation, big-picture. I wanted to be able to "think out loud" in a way that leads & engineers could follow along with, be tagged in for Qs, and participate in.

Let me know what you think the best classification here is. Don't want to gunk up the sprint/ticketing system.

gdjohnson commented 10 months ago

Reposting a conversation with @Rotorsoft here, so that I can keep thinking through different approaches to organizing our docs:

@gdjohnson writes: I think my biggest question is: What's our goal in setting up a new framework subfolder? At present, the docs are "flat"; there are no subdirectories. The thinking, here, is that some entries arguably "belong" to several different conceptual bins, and we're prioritizing developer discovery (being able to quickly find the doc you want). This allows us to flexibly categorize a given file, in our table of contents, under several different headers (e.g. we might list "Kong.md" in our API section and our external tools section). Devs don't have to guess which categorization might be "primary" for a given doc—the entry will just be listed most places they think to look.

@Rotorsoft writes: Not sure about the flat structure though, not sustainable IMO. I prefer a chapters structure, for example, something like /onboarding to encapsulate the concept.... but I can go with the flow. Should I also remove the /framework folder in assets?
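
To make the flat-vs-nested tradeoff concrete: with a flat directory, the table of contents can list the same entry under every header a dev might check, e.g. (section names illustrative, per the Kong.md example above):

```markdown
## API

- [Kong](./Kong.md)
- ...

## External Tools

- [Kong](./Kong.md)
- ...
```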