writethedocs / www

The main website for Write the Docs.
http://www.writethedocs.org

Use Case Study: Documentation Workflow at DataStax #546

Open polandll opened 6 years ago

polandll commented 6 years ago

I wrote this at WriteTheDocs 2018 in Portland, at Jennifer's request. I believe this writeup belongs under "Documentation Culture at your Company", as a new item in the bulleted list. @Bradamant3 @plaindocs

Use Case Study: Documentation Workflow at DataStax

Author: Lorina Poland (lorina@datastax.com)

I am responsible for any errors in this content.

At DataStax, we currently have a team of 8 remote full-time writers. Each of us is primarily responsible for some aspect of our products, and the equivalent of 2 writers is predominantly responsible for "docops": the publishing and deployment infrastructure, plus the docsreview server used for internal user commenting.

We use GitHub to manage our documentation, and use Oxygen XML Editor to produce DITA source files. Writers mostly use the command line to interact with GitHub, although some do use front-end programs for git. The GitHub repository is private, but accessible to everyone in the company. We originally had a separate private account, but found that moving our repo into the company's account has enabled more feedback from internal users.

We use Oxygen XML Editor mainly, even though some specialized knowledge is required to do that. (We have had some developers who have used Oxygen, but we realize that it is not a tool developers are happy to learn.) We use Oxygen for the following reasons:

All writers also use other editors such as Sublime or Atom. Some of us even use vi on occasion (?!).

We use in-house scripts (ant, CSS processing, FO transformation) to publish our docs. Depending on script parameters, they produce HTML, PDF, and plaintext. The output goes to three places: a web-based docsreview server that we maintain for internal publishing, the main HTML documentation site (docs.datastax.com), and plaintext files that are supplied back to the developer teams for inclusion as inline help for commands used in specialized query shells. Although we tried to do away with PDF documents and supply only zipped HTML files, we discovered that both internal users and customers wanted PDFs that they could annotate with their own notes. We have recently implemented the Chemistry module, which generates PDF output from HTML5/XML, styled with CSS.

We also process developer-created documentation for our drivers. The driver engineering team built an in-house tool that produces documentation from Markdown, and we incorporate that output into our main docs at publication.

The publishing scripts allow any writer to build a document at any time, so that we can continuously improve our published docs. The built documents can be pushed to the docsreview server, or deployed (a separate step) to a docspreview server. Lastly, the docs that have been deployed can be synced to the live website that customers and the public access.
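The promotion path described above can be sketched as a small set of allowed transitions. This is only an illustration of the flow, not our actual publishing scripts: the `Stage` names and the `can_promote` helper are hypothetical, and treating docsreview as an end point (rather than a step toward the live site) is an assumption.

```python
from enum import Enum


class Stage(Enum):
    """Hypothetical names for the places a built document can be."""
    BUILT = "built"
    DOCSREVIEW = "docsreview"    # internal review server
    DOCSPREVIEW = "docspreview"  # deployed, not yet public
    LIVE = "live"                # public website


# Assumed transitions: a build can go to docsreview or docspreview,
# and only deployed (docspreview) docs can be synced to the live site.
ALLOWED = {
    Stage.BUILT: {Stage.DOCSREVIEW, Stage.DOCSPREVIEW},
    Stage.DOCSREVIEW: set(),
    Stage.DOCSPREVIEW: {Stage.LIVE},
    Stage.LIVE: set(),
}


def can_promote(current: Stage, target: Stage) -> bool:
    """Return True if a document at `current` may move to `target`."""
    return target in ALLOWED[current]


print(can_promote(Stage.BUILT, Stage.DOCSPREVIEW))
print(can_promote(Stage.BUILT, Stage.LIVE))
```

The point of the sketch is that nothing reaches the live site without first passing through the separate deploy step.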

The separate docsreview server that we maintain is accessible only to internal audiences (Support, Engineering, Marketing, Developer Relations). Two uses for this server are:

-- Reviewers can highlight and comment, and writers can access a list of commits made to make changes in the docs source. Generally, this method is used for small comments rather than extensive ones.
-- As a bonus, the docsreview server always has the next unpublished release, so everyone in the company can see the upcoming changes and get a jump on understanding what's new.

We also maintain old versions on docsreview, so that internal users can continue to send us comments on outdated, incorrect, or incomplete information and we can fix the source.

We currently have two docs GitHub repos that are private to the company. Two releases back, we did a major revision of the organization of our docs, and in the process, created a new repo. We have also maintained a separate repo to post externally accessible files, such as zip files of our example code, although this repo is not currently well-used, and we'll likely move those files to a centrally maintained CMS designed mostly for the Developer Relations (Technical Training, Evangelists) team.

We generally support about 4 major releases at a time: a future release and three existing releases. Minor patch releases are frequent, although the interval between them can vary from 2 weeks to 6 months.

We use branches to differentiate the releases. The master branch is always the future release, and each major release is a separate branch. Currently, we create a separate branch for each minor release. For instance, a major release branch is dse51 for the product DSE 5.1, and a minor release branch is dse519 for DSE 5.1.9.
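As a rough illustration of the naming convention, here is a minimal sketch that derives a branch name from a version string. The helper names are hypothetical, and the rule itself (lowercase product prefix, then the version with the dots removed) is inferred from the two examples dse51 and dse519, so treat it as an assumption.

```python
def branch_for_release(version: str, prefix: str = "dse") -> str:
    """Return the docs branch name for a release such as "5.1" or "5.1.9".

    Assumed convention: product prefix followed by the version digits
    with the dots removed.
    """
    parts = version.split(".")
    if len(parts) not in (2, 3) or not all(p.isdigit() for p in parts):
        raise ValueError(f"expected X.Y or X.Y.Z, got {version!r}")
    return prefix + "".join(parts)


def major_branch_for(version: str, prefix: str = "dse") -> str:
    """Return the major release branch that a minor release belongs to."""
    first, second = version.split(".")[:2]
    return branch_for_release(f"{first}.{second}", prefix)


print(branch_for_release("5.1"))    # dse51
print(branch_for_release("5.1.9"))  # dse519
print(major_branch_for("5.1.9"))    # dse51
```

One caveat with dropping the dots: a name like dse5110 cannot be mapped back to a single version unambiguously, so a sketch like this only works in the version-to-branch direction.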

We commit directly to the minor release branch for most changes, and then merge that branch into the corresponding major release branch at release time. So, when it's time to release DSE 5.1.9, dse519 is merged into dse51. However, recently we have had some minor releases stretch on so long that we have cherry-picked the changes into the major release branch, to reduce the number of merge conflicts. In general, the writer who handles all the release notes is the person who creates the new minor branches after the last minor release branch has been merged.
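The release-time steps can be sketched as a list of git commands. This is a hedged illustration rather than our actual procedure: the function names, the remote name `origin`, the next branch name `dse5110`, and the choice of `--ff-only` and `--no-ff` are all assumptions. The functions only build the command strings; they do not run git.

```python
from typing import List


def release_merge_commands(minor_branch: str, major_branch: str,
                           remote: str = "origin") -> List[str]:
    """Commands to merge a finished minor release branch into its major branch."""
    return [
        f"git fetch {remote}",
        f"git checkout {major_branch}",
        f"git pull --ff-only {remote} {major_branch}",
        # Merge the remote copy so the latest pushed work is included.
        f"git merge --no-ff {remote}/{minor_branch}",
        f"git push {remote} {major_branch}",
    ]


def next_minor_branch_commands(major_branch: str, new_minor_branch: str,
                               remote: str = "origin") -> List[str]:
    """Commands to start the next minor release branch after the merge."""
    return [
        f"git checkout {major_branch}",
        f"git checkout -b {new_minor_branch}",
        f"git push -u {remote} {new_minor_branch}",
    ]


for cmd in release_merge_commands("dse519", "dse51"):
    print(cmd)
for cmd in next_minor_branch_commands("dse51", "dse5110"):
    print(cmd)
```

When a minor release runs long, individual commits can instead be applied to the major branch with `git cherry-pick <commit>`, which is the approach mentioned above for reducing merge conflicts.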

One of the issues we've run into using git is missing or overwritten work. Git is designed to allow multiple users to write code or documentation simultaneously, and to deal with the merge conflicts that can occur if two people edit the same lines of source. In general, that works. However, after confirming that work had in fact been overwritten, we believe we traced the problem to interactions between Oxygen and git, plus a simple human error involving the state of an open file in Oxygen at the time of a git pull command. The underlying cause is likely that all writers push their changes directly to the GitHub repo. So, while git is not directly responsible, we have decided to move to a slightly different workflow using pull requests.

Since we use JIRA tickets to manage our documentation work, we'll use one pull request per ticket. A ticket may involve multiple git commits, to accommodate both the doc content and the release note. This will also allow us to institute both technical review and peer review for every documentation change. We may still point a reviewer to the docsreview server to make comments, since seeing a whole page in context can be helpful. Pull requests more strictly require merge conflicts to be resolved before the commits are merged, though, and we hope that this change will resolve the problems that we've seen.
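One way to keep a pull request tied to a single ticket is to check the ticket keys mentioned in its commit messages. The sketch below is an assumption-heavy illustration: it presumes that writers put the JIRA key in each commit message, the `DOC` project key is made up, and the pattern is a loose approximation of JIRA's key format that can also match strings such as "UTF-8".

```python
import re
from typing import List, Set

# Loose approximation of a JIRA issue key, for example "DOC-123".
TICKET_RE = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")


def tickets_in(messages: List[str]) -> Set[str]:
    """Collect every ticket key mentioned in a list of commit messages."""
    found: Set[str] = set()
    for message in messages:
        found.update(TICKET_RE.findall(message))
    return found


def is_single_ticket_pr(messages: List[str]) -> bool:
    """True if the commits in a pull request reference exactly one ticket."""
    return len(tickets_in(messages)) == 1


commits = [
    "DOC-123 Update the command reference",  # doc content
    "DOC-123 Add release note",              # release note for the same ticket
]
print(is_single_ticket_pr(commits))  # True
```

A check like this allows several commits per pull request, as long as they all point at the same ticket.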

plaindocs commented 5 years ago

Maybe someone wants to tackle this on Writing day in Prague.

Bradamant3 commented 5 years ago

If I'm there, I'll tackle it. If I'm not there and nobody else takes it, I'll still do it -- I'm the one who dropped the ball last year :-(

polandll commented 5 years ago

We have actually learned quite a lot about doc workflow since I wrote this up a year ago. I should probably revise it to be more current and useful to others.