lsegal / yard

YARD is a Ruby Documentation tool. The Y stands for "Yay!"
http://yardoc.org
MIT License
1.94k stars 397 forks source link

No warning when overriding namespace documentation #1173

Closed dylanahsmith closed 6 years ago

dylanahsmith commented 6 years ago

Problem

We have documented top-level namespaces like

# description of namespace Sales
module Sales
end

which can easily be accidentally overridden by nested classes/modules when the documentation gets mistakenly put at the top-level instead of directly preceding the nested class. E.g.

# description of class Sales::OrderQuery
module Sales
  class OrderQuery
  end
end

The second documentation of the top-level module overrides the other one without any warning, such that we can't catch this in CI by using yardoc with the --fail-on-warning option.

Steps to reproduce

  1. Save the above code snippets to sales.rb and sales/order_query.rb
  2. Run yardoc --fail-on-warning sales.rb sales/order_query.rb
  3. See that there are no warnings in the output
  4. Additionally, echo $? outputs the success status 0
  5. Open open doc/Sales.html and see that the overview has "description of class Sales::OrderQuery"

I would expect a warning letting me know that the documentation is being overridden along with file and line information for the two places where the documentation is provided so one of them can be moved/removed or so they could be merged.

What is Happening in the Code

The docstring for the YARD::CodeObjects::ModuleObject is being replaced in YARD::Handlers::Base#register_docstring when the ModuleObject is registered for the second module in YARD::Handlers::Ruby::ModuleHandler

Proposed Solution

log.warn could be placed before object.docstring = parser.to_docstring in YARD::Handlers::Base#register_docstring inside an if object.docstring && !object.docstring.empty? condition. However, that trivial change would cause warnings when using yardoc with the --use-cache option, since it just applies the old documentation to the cached code objects in the registry.

I propose that this warning would only be logged either when

  1. the --use-cache option isn't being used
  2. the object argument passed to register_docstring wasn't loaded from the cache

It doesn't look like either of this information is properly exposed to the YARD::Handlers::Base object that register_docstring is called on. In the case of option 1, I think that could be exposed through a method on the registry. In the case of option 2, this state could be stored on the YARD::CodeObjects::Base objects either before serializing them to store them in the cache or after deserializing them from the cache. This seems like the least clear choice to make before moving forward on adding the warning.

Long term, I think https://github.com/lsegal/yard/issues/1165 needs to be addressed so that this is easier to address when using the --use-cache option. At that point, it should be easier to distinguish between the object argument to register_docstring having an existing docstring because it was defined elsewhere (where we want a warning) or because it was loaded from the cache (where we don't want the warning).

lsegal commented 6 years ago

I think this would create a breaking change in expected behavior for those using --fail-on-warning (or those who use tooling to read from warnings), so I'm not sure this could be done in core.

I believe this is something you could create a plugin or extension for without needing any modification of core. Specifically, you can use YARD::DocstringParser.after_parse to determine if you've duplicated a docstring and emit a warning; you would have to keep a manual tally, but it's doable.

dylanahsmith commented 6 years ago

I would expect that those who use --fail-no-warning would want to be able to catch documentation problems like this. Are you referring to https://github.com/lsegal/yard/issues/1007 and yard-junk gem in that it came out of the desire to ignore undocumentable errors? Since warnings like these don't seem like they would fall into the category of warnings that some people might want to ignore.

However, if you are worried about backwards compatibility, then perhaps it could be disabled by default but have a configuration that could turn on these warnings. That way the default could just be switched on the next release that makes breaking changes. The configuration could also be removed entirely on that release with breaking changes, unless you see a use case for not having these warnings other than backwards compatibility.

Thanks for pointing me in the right direction working around this. I hadn't looked into that yet since I expected this to be desirable to be fixed as part of core.

lsegal commented 6 years ago

I would expect that those who use --fail-no-warning would want to be able to catch documentation problems like this.

Documentation problems, yes-- but like this is less clear. There are legitimate usages for re-documenting the same module or class across multiple files, for instance, if you separate multiple packages as individual gems for release but generate separate documentation across all sub-packages at once. Users who use YARD in this way would start seeing failures where they previously had none.

YARD historically hasn't been directly hands on about enforcing best practices or documentation style and instead left these types of linting tools up to downstream libraries like yardstick et al, which can be more focused on solving those specific problems. Warnings in YARD are typically only emitted in cases where YARD is hindered from doing its own job of parsing or generating documentation, with the arguable exception of warnings for parameter names. In this specific case, though, YARD isn't really hindered from generating docs, and it's not even a question of undocumented code, since it is in fact documented. You'll notice that YARD also doesn't warn about duplicate tags, and other dupe cases where it could, but YARD itself isn't the linter.

For that reason I believe you would have a lot more control over checking this type of use case via plugin or extension.

dylanahsmith commented 6 years ago

Ok, thanks for the added context. I'll close it sounds like it is considered out of scope for this project.

dylanahsmith commented 6 years ago

I can confirm that YARD::DocstringParser.after_parse can be used to solve this problem outside of the yard gem.

In case someone else comes across the same problem, I'm currently loading the following script in yardoc using the --load option to log these warnings.

# Prevent docstrings from being overridden
#
# This is particularly a concern for top-level namespaces,
# since a file defining a constant under the namespace could
# override the top-level namespaces documentation by putting
# that constant at the top of the file rather than preceding
# the nested constant defined by that file.
documented_objects = {}
YARD::DocstringParser.after_parse do |parser|
  next if (parser.raw_text || '').empty?
  object = parser.object
  next unless object && parser.handler
  previous_file = documented_objects[object]
  current_file = parser.handler.parser.file
  if previous_file
    YARD::Logger.instance.warn(
      "Documentation for #{object.type} #{object.path} from #{previous_file} " \
      "is being replaced with one from one from #{current_file}"
    )
  end
  documented_objects[object] = current_file
end