Changing guidelines on Documentation

10up / Engineering-Best-Practices

10up Engineering Best Practices

https://10up.github.io/Engineering-Best-Practices/

MIT License

760 stars 204 forks source link

Changing guidelines on Documentation #195

Open nicoladj77 opened 7 years ago

nicoladj77 commented 7 years ago

I would like to start a discussion here on what are the best practice on commenting code.

In the last year and a half, I basically stopped writing comments on code unless I was really forced because the comment was absolutely needed. There are some reasons for this:

comments rots.
docblocks rots much faster than comments.
whenever source code is available ( for example, core Wordpress ) I never ever read comments or docblocks for the two reasons above. I think we always have source code available.
you can usually ( 99.9% of the times ) avoid a comment by small refactoring: extract a method or rename a variable.
I think comments should never explain "what" you are doing: that's clear by reading code. It's ok to use comments to explain "Why" you are doing something, although...
I almost always find out that, when i NEED to use a comment, my implementation could be better. A comment is a code smell.
code becomes much shorter and easier to read.
it promotes good code. Comments usually are the lazy way out.

Let me know what you think.

tlovett1 commented 7 years ago

Hey @nicoladj77, I believe there is a middle ground in the "comment all the things" vs. "if it requires a comment, it is too complicated school of thought".

Personally, I maintain a number of code bases, both for 10up and personal projects, some of those projects I update once a year. Docblocks and inline comments are indispensable for remembering what I was thinking when I originally wrote the code. An example would be Safe Redirect Manager; I rarely update the plugin; when I do, the comments are helpful for me to process some of the complex redirect logic. I'm sure the code could be improved, but the complexity of the problem being solved requires commenting.

Another project example where comments are important is ElasticPress. The ElasticPress code base is very large and reaches it's "tenticles" into most areas of WordPress (after all we are replacing the query engine). There are comments in ElasticPress explaining quirky code that were introduced years ago for a very niche reason e.g. EP not being able to support the WP_Query "fields" argument.

I disagree with you on the "rate" of documentation rot. Code should be properly tested and commented as it's written; if changes are made, documentation should be changed. Obviously, in early stages of a code base things are changing rapidly. However, once a project is more stable I don't personally see docs rotting that often.

Would love to hear the rest of 10up engineering on this topic.

Thanks for starting this thread!

lkwdwrd commented 7 years ago

I personally find well-written and maintained DocBlocks indispensable. The problem is most engineers don't update them when they refactor -- to Taylor's point when code is changed, the DocBlock should be update as well. The other thing that I see very often is a super minimal DocBlock comments. A very vague short description, param definitions with types but no description. In my opinion we should be striving to maintain very good internal documentation within our codebases. It's not easy, but it can be worth it.

Look at the WordPress code reference: https://developer.wordpress.org/reference/

I have this open all the time when I work on WordPress code. The main content of the reference all comes from parsed inline comments. I find it faster than searching a copy of core with grep like I used to -- and I do get to see the code inline on the page as well. This is only possible because of well kept inline documentation. I would like to see a world where our code is this well documented and we could parse mini code references for our projects. I'm not saying this is always practical or necessary, but it's something to strive for.

On Wed, Jul 5, 2017 at 9:26 PM, Taylor Lovett notifications@github.com wrote:

Hey @nicholasio https://github.com/nicholasio, I believe there is a middle ground in the "comment all the things" vs. "if it requires a comment, it is too complicated school of thought".

Personally, I maintain a number of code bases, both for 10up and personal projects, some of those projects I update once a year. Docblocks and inline comments are indispensable for remembering what I was thinking when I originally wrote the code. An example would be Safe Redirect Manager; I rarely update the plugin; when I do, the comments are helpful for me to process some of the complex redirect logic. I'm sure the code could be improved, but the complexity of the problem being solved requires commenting.

Another project example where comments are important is ElasticPress. The ElasticPress code base is very large and reaches it's "tenticles" into most areas of WordPress (after all we are replacing the query engine). There are comments in ElasticPress explaining quirky code that were introduced years ago for a very niche reason e.g. EP not being able to support the WP_Query "fields" argument.

I disagree with you on the "rate" of documentation rot. Code should be properly tested and commented as it's written; if changes are made, documentation should be changed. Obviously, in early stages of a code base things are changing rapidly. However, once a project is more stable I don't personally see docs rotting that often.

Would love to hear the rest of 10up engineering on this topic.

Thanks for starting this thread!

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/10up/Engineering-Best-Practices/issues/195#issuecomment-313277949, or mute the thread https://github.com/notifications/unsubscribe-auth/ABztOs0LlpcCAJ9rsty3bAnDIlEyYcz5ks5sLEXwgaJpZM4OLDFt .

moraleida commented 7 years ago

On a more practical note, it is much easier for a Code Reviewer to say "please add comments to your code" or "please adjust the docblock to reflect the new functionality" than to constantly evaluate if an evolving class/method would be easily understandable to someone taking over this project a year from now - to me this is a good enough reason to keep comments.

Maybe we need more detailed instructions on how to write these comments?

bswatson commented 7 years ago

Specifically talking about DocBlocks -

Creating detailed comments for functions and classes which include:

What is the purpose?
Why is this necessary?
Is there business logic that will be encoded here? If so, describe the logic in plain easy-to-read language.
What are the expected inputs and outputs?
Are there cases which may require special handling of returned data?

Given that data, it's much easier for someone reviewing the code to identify if that code was well thought out and handles the input, output, and logic as expected.

If your code starts suffering from documentation rot even though you've answered the above questions, then it's very possible that you've modified the function in a way that it wasn't intended to be used and another approach would be better.

dana-ross commented 7 years ago

As far as the Engineering Best Practices go, I think what we have is good. It gives minimal advice on when to comment, instead focusing on the details of what styles we prefer. It also specifies that we follow the WordPress coding standards, and has a section on SassDoc.

Even though we defer to the WordPress coding standards, I think as long as we specify SassDoc, we should also call out PHPDoc and JSDoc as our Best Practices.

In general, I don't think we should get too prescriptive about commenting in our Best Practices. I think docblocks are important, and they should continue to be part of our standards. In many cases, a linter will catch when the number & types of parameters are wrong in a docblock. Same goes for return types. The written comments may not reflect the code, but it's the responsibility of our engineers to make sure that doesn't happen, and I consider that as important a part of their job as writing the code itself.

I think formal docblocks are especially important in environments where we can't rely on type annotations. Modern linters & IDEs parse docblocks for type information and provide some level of type checking, even if it isn't perfect.

Beyond docblocks, the questions of when & how to comment are judgement calls and depend a lot on the context in which the code lives and the team that works on it. I would love it if 10up engineers used their training budgets for a copy of Code Complete and paid close attention to the chapter on "Self-Documenting Code", which is actually more about commenting. There's decades of debate & lessons learned that we could all benefit from, along with research.

nicoladj77 commented 7 years ago

@bswatson regarding your example:

What is the purpose? This should be clear from folder structure and naming. "Taxonomy/Personality.php" to me holds this information Why is this necessary ? I think this is not related to code. keeping with the previous example, "Taxonomy/Personality.php", I think it's clear why this is necessary: we need to handle a Personality tax. The reason why we need it are not code related Is there business logic that will be encoded here? If so, describe the logic in plain easy-to-read language. Move the business logic to a method with a good name or to a function, and you don't need this anymore What are the expected inputs and outputs? Typecheck inputs and outputs. Returning multiple types ( like false or an empty string should be discouraged ) Are there cases which may require special handling of returned data? Throw exceptions with meaningful names

@daveross I agree that "it's the responsibility of our engineers to make sure that doesn't happen, and I consider that as important a part of their job as writing the code itself." but if we need to quantify it, how much time do we spend checking comments and checking docs are up to date and so on?How efficient is that?Do end users ( other developers ) really use all this documentation? I used myself as an example to get the discussion started, but I always skip the documentation and look directly at code. Reading a lot of other people code, I almost never wished for more comments, i usually wish for better naming, or shorter functions. I know that this is a LONG discussion, i don't want to start a flame about this 😄 , I just wanted to share my experience and start some discussion on this.

stevegrunwell commented 7 years ago

Echoing the general sentiment here, documentation is invaluable, especially in environments where developers are constantly being swapped out mid-project.

Disclosure: It's well known that I ❤️ good documentation, and have recently been giving a talk on the subject. As a result, I'm definitely biased in favor of docs.

With regards to documentation rot, I can't recommend strongly enough making documentation review a required part of code review. Changed the function signature but didn't update the documentation? That's a good reason to kick the code back to the developer to fix. Nothing goes in without documentation, nothing significant gets changed without the docs being updated accordingly.

From personal experience, a project without documentation is far more difficult to transition onto. It can also be a major code smell: projects with little or no documentation — especially in a team environment — are often the ones that require a day or more to get ramped up on, have few or no tests, and/or completely fall apart when the team supporting it changes.

nicoladj77 commented 7 years ago

@stevegrunwell I think that bad projects are bad, with or without documentation. what I'm arguing is that the time spent on keeping documentation up to date could be better spent on enforcing clear code. This is to answer also @moraleida and as an example: whenever as a reviewer you are asking for "more comments", what are you truly asking for, in truth?Usually clearer code. When comments are added, do you really feel that whoever will need to maintain that code in one year, will understand the code better because of the comments?Will the need to review and maintain comments in the future justify them? Because i agree that comments should be treated as code ( like tests ), but that comes as a cost. I feel that whenever I add a comment, I'm adding something I will need to maintain in the future. My point is, do we really need all the bloat? Because code without comments, or simply code where only meaningful comments are included is much shorter and thus easier to maintain. @tlovett1 I probably sounded a little too harsh, I'm sure that there are good comments! A good example is what you are saying in elasticPress where you explain "Why" you are doing something and not "what" you are doing, as the "what" is clear from code. @lkwdwrd from experience, if I have source code available, I always look at source and never ever read documentation, just because there is a possibility that the world is not perfect, and docs could be out of date. I agree with you that in a perfect world docs are always updated, but in a perfect world you would not need docs because code would be so clear that docs would not be required 😄 My feeling about documentation is that, as usual, it depends on the project. Shared projects like WordPress can't be compared to client work which sees 8 developers work on a codebase in two years. Also our codebases are usually really easy, I've not seen anything which I would consider "complex". I also strongly feel that you can't compare the documentation need for projects where the source code is available with projects where source code is not available: DocBlocks are of course more important in languages where you can't see the source code. Java is not PHP, luckily 😄

nicoladj77 commented 7 years ago

Of course thank to everyone for sharing opinions! :)

ivankruchkoff commented 7 years ago

@nicoladj77 I agree with you that comments DO tend to rot. The only way to avoid code rot is to have comment quality enforced at CR, and buyin from the team that it will happen, and buyin from the team maintaining the project in 6 months, and from the team maintaining the project one year down the track.

This also includes projects (some, not all) that get to that phase towards the end where there is still a monumental engineering effort required and people need to burn the midnight oil to meet delivery.

There is a HUGE cost to documentation, and the benefits don't ALWAYS materialize. ElasticPress (per @tlovett1 suggestion) and WordPress (per @lkwdwrd ) do need good documentation, they're long running projects that need to make the barrier to entry very low, so people can figure out how they work and start contributing to the codebase. Strict code standards also make it easy for large teams of engineers to work together.

Regarding your point about refactoring vs commenting, which has a lower overall cost? Sometimes one, sometimes the other. I do agree with the point for a lot of scenarios, but I can't think of a way to codify the exception as a best practice.

Keep refactoring your codebase until it's a simple as possible and only comment the difficult parts are nice mantras in theory, but as engineers, we need this quantified: What is simple? What is complex? Who is the judge? What is the cost of a feedback loop in CR?

In summary, I think the level of commenting that a product (EP/WP) has, should not be the same level of commenting that a client's theme build has. I don't think that we should omit comments altogether, nor do I agree comment liberally for every function.

A good exercise to perform is, see how much time (HH:MM) the following steps take, do it for a task that takes more than 2 hours: 1) Write your code until it works. 2) Refactor code until it's easy to follow. 3) Add thorough documentation. 4) Put out a CR. 5) Address all feedback.

Tally up those times, multiply by the project's hourly rate, and ask yourself which parts were worth it, and which weren't. You may surprise yourself, you may not.

ivankruchkoff commented 7 years ago

I always look at source and never ever read documentation

I disagree with this part, I'll always follow the function spec rather than the implementation, because that's the only way you'll ever find documentation bugs. Also, I hate reading through code to figure out how something works when it's easier to just be told how it works. Take a look at this function, first look at the function:

https://gist.github.com/ivankruchkoff/841885495873a1fe2ec0645b079f927e see how long it takes you to figure out how it works and should be used, then take a look at it with commenting

https://github.com/Automattic/vip-mu-plugins-public/blob/master/vip-helpers/vip-utils.php#L397 and see how that compares with how long it took you to infer from code. For me, I gave up after 3 mins of looking at it (it's good to be lazy), and grokked it from comments in 20 seconds. (line 397)

morganestes commented 7 years ago

We all probably have good anecdotal evidence for removing or increasing inline docs ("clients pay for working code, not comments"; "if you can't understand the code without comments, then you're not good enough to work on it", "we write code for humans to read and computers to run"), but I'd like to reference a post written by JJJ and linked to from the WP Core Dev handbook:

Inline documentation is the nonce-check of the open-source universe; it promises what’s actually happening in gibberish-land matches the intention of that section of code. There are a million excuses not to document your code, but it all boils down to either:

Being selfish.

Being lazy.

Being in a hurry.

Not knowing any better.

Being so insanely good at reading everyone else’s code that documentation has never in any way helped you.

I'm loathe to believe that someone purposely wants to delay or derail a project because they didn't take the time to write docs that would benefit other developers (assuming they know the code so well that they don't need it documented while creating it). I think often we slip into the thinking that writing and maintaining comments isn't as important as the "real" code since they just take up space and don't get run. I posit that comments are code, and as much care needs to be taking in writing them as in the "functioning" code.

I often learn how something works by reading the source code, which includes well-written and accurate documentation. I rely on that documentation to be well-written so my IDE can parse it and provide hinting, and in the case of core, so it can be parsed into a website and shared with others as a centralized training tool.

With good comments, I can trust that what (and why, especially) a function or class is promising to return is accurate and write my own code based on that trust. With incomplete comments, I can still get a feel for the code and the IDE can fill in the blanks. With no comments, I have to take the time to mentally parse a function's internals every time I come use it (at least for the first few times).

That all said, I advocate for thinking of comments not as something to replace a detailed users's or implementer's guide, but as a way to provide hinting to your editor, make onboarding less painful, and maybe even introduce a slight speedbump to ensure you can explain why you're writing that code so that others understand it. With that in mind: always include docblocks, write comments on smaller parts when needed (if you have to explain it to someone in a code review, take the time to explain it in an inline comment), and think of documentation as a teaching tool for future developers.

tlovett1 commented 7 years ago

Great discussion all around.

@nicoladj77 I do think there could be a nice addition to the BP to the effect of "while we strive to write code that explains itself, documentation is still important..." Maybe something about updating documentation to avoid rot as well.

tlovett1 commented 7 years ago

@nicoladj77 I do think there could be a nice addition to the BP to the effect of "while we strive to write code that explains itself, documentation is still important..." Maybe something about updating documentation to avoid rot as well.

How does everyone feel about a PR on message with this?