YARD interprets monkey-patching through modules as a documentation error

doudou commented 2 years ago

A technique to better scope monkey patches is to define the new method(s) in modules and include/prepend/extend the module in the target class (depending on what one wants to do)

Currently, YARD interprets this as an ordering issue with [warn]: Load Order / Name Resolution Problem on SomeExternalClass: and reports an unhandled exception.

I understand the drive for this behavior, but it would be nice to at least have a way to locally disable it through a comment.

[error]: Unhandled exception in YARD::Handlers::Ruby::ExtendHandler:
  in `bla.rb`:4:

        4: SomeExternalClass.extend M

[error]: ProxyMethodError: Proxy cannot call method #mixins on object 'SomeExternalClass'
[error]: Stack trace:

Steps to reproduce

Run yard on

module M
end

SomeExternalClass.extend M

Actual Output

[debug]: Parsing ["bla.rb"] with `ruby` parser
[debug]: Parsing bla.rb
[warn]: Load Order / Name Resolution Problem on SomeExternalClass:
-
Something is trying to call mixins on object SomeExternalClass before it has been recognized.
This error usually means that you need to modify the order in which you parse files
so that SomeExternalClass is parsed before methods or other objects attempt to access it.
-
YARD will recover from this error and continue to parse but you *may* have problems
with your generated documentation. You should probably fix this.
-

[error]: Unhandled exception in YARD::Handlers::Ruby::MixinHandler:
  in `bla.rb`:4:

        4: SomeExternalClass.include M

[error]: ProxyMethodError: Proxy cannot call method #mixins on object 'SomeExternalClass'
[error]: Stack trace:
        /home/doudou/.local/share/autoproj/gems/ruby/2.7.0/gems/yard-0.9.27/lib/yard/code_objects/proxy.rb:194:in `rescue in method_missing'
        /home/doudou/.local/share/autoproj/gems/ruby/2.7.0/gems/yard-0.9.27/lib/yard/code_objects/proxy.rb:191:in `method_missing'
        /home/doudou/.local/share/autoproj/gems/ruby/2.7.0/gems/yard-0.9.27/lib/yard/handlers/ruby/mixin_handler.rb:37:in `process_mixin'
        /home/doudou/.local/share/autoproj/gems/ruby/2.7.0/gems/yard-0.9.27/lib/yard/handlers/ruby/mixin_handler.rb:12:in `block (2 levels) in <class:MixinHandler>'
        /home/doudou/.local/share/autoproj/gems/ruby/2.7.0/gems/yard-0.9.27/lib/yard/handlers/ruby/mixin_handler.rb:10:in `each'
        /home/doudou/.local/share/autoproj/gems/ruby/2.7.0/gems/yard-0.9.27/lib/yard/handlers/ruby/mixin_handler.rb:10:in `block in <class:MixinHandler>'
        /home/doudou/.local/share/autoproj/gems/ruby/2.7.0/gems/yard-0.9.27/lib/yard/handlers/processor.rb:114:in `block (2 levels) in process'
        /home/doudou/.local/share/autoproj/gems/ruby/2.7.0/gems/yard-0.9.27/lib/yard/handlers/processor.rb:112:in `each'
        /home/doudou/.local/share/autoproj/gems/ruby/2.7.0/gems/yard-0.9.27/lib/yard/handlers/processor.rb:112:in `block in process'
        /home/doudou/.local/share/autoproj/gems/ruby/2.7.0/gems/yard-0.9.27/lib/yard/handlers/processor.rb:111:in `each'
        /home/doudou/.local/share/autoproj/gems/ruby/2.7.0/gems/yard-0.9.27/lib/yard/handlers/processor.rb:111:in `each_with_index'
        /home/doudou/.local/share/autoproj/gems/ruby/2.7.0/gems/yard-0.9.27/lib/yard/handlers/processor.rb:111:in `process'
        /home/doudou/.local/share/autoproj/gems/ruby/2.7.0/gems/yard-0.9.27/lib/yard/parser/source_parser.rb:497:in `post_process'
        /home/doudou/.local/share/autoproj/gems/ruby/2.7.0/gems/yard-0.9.27/lib/yard/parser/source_parser.rb:443:in `parse'
        /home/doudou/.local/share/autoproj/gems/ruby/2.7.0/gems/yard-0.9.27/lib/yard/parser/source_parser.rb:46:in `block in parse'
        /home/doudou/.local/share/autoproj/gems/ruby/2.7.0/gems/yard-0.9.27/lib/yard/logging.rb:82:in `capture'
        /home/doudou/.local/share/autoproj/gems/ruby/2.7.0/gems/yard-0.9.27/lib/yard/parser/source_parser.rb:45:in `parse'
        /home/doudou/.local/share/autoproj/gems/ruby/2.7.0/gems/yard-0.9.27/lib/yard/parser/source_parser.rb:371:in `parse_in_order'
        /home/doudou/.local/share/autoproj/gems/ruby/2.7.0/gems/yard-0.9.27/lib/yard/parser/source_parser.rb:114:in `block in parse'
        /home/doudou/.local/share/autoproj/gems/ruby/2.7.0/gems/yard-0.9.27/lib/yard/logging.rb:182:in `enter_level'
        /home/doudou/.local/share/autoproj/gems/ruby/2.7.0/gems/yard-0.9.27/lib/yard/parser/source_parser.rb:113:in `parse'
        /home/doudou/.local/share/autoproj/gems/ruby/2.7.0/gems/yard-0.9.27/lib/yard.rb:20:in `parse'
        /home/doudou/.local/share/autoproj/gems/ruby/2.7.0/gems/yard-0.9.27/lib/yard/cli/yardoc.rb:259:in `block in run'

Expected Output

Simply ignore the line.

Environment details:

OS: Ubuntu 20.04
Ruby version (ruby -v): ruby 2.7.5p203 (2021-11-24 revision f69aeb8314) [x86_64-linux-gnu]
YARD version (yard -v): yard 0.9.28

I have read the Contributing Guide.

lsegal commented 2 years ago

While this is probably a frustrating warning (and error) to see when it is out of your control, there are a few good reasons why you would actually want to see them, and a couple of issues that keep us from reliably solving this. Without going into too much detail here, the tl;dr is that we cannot guarantee that SomeExternalClass is actually external, and YARD is always optimized for providing consistent reliable docs over the path of least resistance. Any change that would cause YARD to generate a false negative and ignore a warning where docs were incorrectly generated (i.e. it missed a real mixin because file processing was legitimately out of order) would be seen as a regression in behavior.

Some kind of YARD directive comment could help here, but it would have to be fairly focused on ignoring this specific warning rather than an outright disabling of YARD processing altogether. Ideally it's something where you would identify a module by name for external access, i.e.

# @!external SomeExternalClass
SomeExternalClass.extend M

There are no plans to add this directly, and the policy on this issue tracker is to not accept Feature Requests as issues (because they tend to just sit), however, this would be accepted as a PR if someone wanted to work on it.

Alternatively it would be fairly easy to implement this externally as a YARD plugin or local extension in your specific codebase via a YARD::Tags::Directive subclass:

class ExternalDirective < YARD::Tags::Directive; def call; end end
YARD::Tags::Library.define_directive :external, ExternalDirective

🤫 don't tell anyone but if you do nothing in your directive it just ignores processing of the attached code. YARD's core version of this would have to do something more specific since we wouldn't directly support directives that disable processing of blocks.

Place the above in your codebase (yard_ext.rb) and run with yard -e yard_ext.rb

If you're not interested in custom directives, the following workaround would also work, but note that it may stop working in the future(*)-- though if it were to stop working, the above or some equivalent would (ideally) also be implemented to compensate:

SomeExternalClass.send(:extend, M)

It's certainly not ideal, but it will cause YARD to ignore the statement (again, for now).

(*) it would stop working because YARD reserves the right to start processing .send(LITERAL, args) type statements, which would allow us to catch a whole host of currently-ignored statements. There are no direct plans to implement this any time soon but there are also no guarantees that it might not happen in the very next release either.

Hope some of this helps!

doudou commented 2 years ago

Hi @lsegal. Thanks a lot for the very detailed answer. Very helpful.

Out of curiosity, could you sketch what would be a good path to implement the directive properly inside Yard ?

lsegal commented 2 years ago

There are a couple of ways to do it, one would be to register a temporary ClassObject for that given namespace, and then remove it in the SourceParser#after_parse_list callback (at the end of parsing). Since it's "external", it should be safe to do this. A directive like that might look like (untested):

class ExternalDirective < YARD::Tags::Directive
  class << self
    attr_reader :objects
  end

  @objects = []

  def call
    # we create a "class" object since it allows this object to also be used as an inherited superclass, etc
    self.class.objects << YARD::CodeObjects::ClassObject.new(:root, tag.text)
  end
end

YARD::Parser::SourceParser.after_parse_list do |files, globals|
  ExternalDirective.objects.each { |obj| YARD::Registry.delete(obj) }
  ExternalDirective.objects.clear
end

YARD::Tags::Library.define_directive(:external, ExternalDirective)

As an important note here, I was actually wrong about my "if you do nothing in the directive it ignores the block" statement. I incorrectly tested the directive. Sorry about that. This actually means the directive above needs to be separated from the block its used on due to tag parsing order, namely, the Ruby code is parsed prior to directives attached to a block, so the exception will be raised prior to the directive being seen. To fix this you would need to place the directive anywhere before the line in question:

# top of the file works
# @!external SomeExternalClass

# attached to M works
# @!external SomeExternalClass
module M
end

# even as a lone comment block (needs 2 empty lines between the include call, not just 1)
# @!external SomeExternalClass

SomeExternalClass.extend M

# you can also put it in a file that is parsed prior to this one, like your `lib/mylib.rb` file that parses before `lib/mylib/etc...` files.

Even with the above quirks, that would probably be the easiest methodology, but may generate some weird side effects if that external namespace were to be used in non-standard ways (like mixed into a non-external class) within said block.

From a dev experience POV, it would probably be easiest used in the context of actually defining the class in Ruby code, i.e.:

# mark this class as external so YARD ignores it from doc generation
# @!external SomeExternalClass
class SomeExternalClass; end

Putting the above in the same lib/mylib.rb file would register the "real" Ruby class as external. With this syntax you could also modify the directive to not require a parameter name, since it could be parsed from the code block via the #handler attribute. Note that parsing the block also lets you know whether to define the object as a class or module (or other type of object) in the registry, which may matter in certain cases. The above directive implementation in this comment is certainly naive in only defining objects as external classes, which could cause problems under certain conditions. This however comes with the cost of extra complexity in the directive, since you would have to parse that out of the handler.statement directly.

doudou commented 2 years ago

Actually, I'm thinking about doing the last bit in a separate file, that will be parsed only by yard (something like externals.rb in my doc/ folder). That would make sure it does not interfere with the actual code, and would allow me to document it.

It's essentially the directive you sketched, but without the need for a directive, and keeping the result. Not sure whether you believe the # @!external directive would be better (assuming I get the time to implement it)

I think keeping the result is also a good thing, actually. The issue with the directive is that it would need to be very flexible. In theory, extend could be called even on constants that are neither classes nor modules ...

Mustafatarakci commented 2 years ago

``

doudou commented 2 years ago

Going to close this ... I think it's all good clear.

lsegal / yard