Documentation blocks should be placed on top of the entity they describe

fsareshwala commented 1 year ago

Currently, documentation blocks are placed underneath the entities they describe. This is quite confusing and the opposite of most commenting conventions in various programming languages. I think it makes sense to have a separate symbol for documentation blocks (e.g. # for comments and -- for documentation blocks). However, the position should remain on top of the entity they describe.

I think the one exception here would be to have a documentation block within the top of a struct, enum, bits, etc, kind of like a docstring in python. However, enabling that one exception may make the implementation more difficult so we can default to the top of the entity instead, imo.

reventlov commented 1 year ago

I see your point, and I did, indeed, model Emboss's documentation blocks off of Python's docstrings.

I see four ways to possibly meet this proposal, but they all have very unfortunate drawbacks:

-- Doc blocks are now universally before the entity they describe.
-- Backwards-incompatible, which means that it is not viable.
struct Example:
  ...

[doc_position: "before"]
-- Doc block position is now determined by a pragma.
-- *   Technically backwards-compatible
-- *   Adds quite a bit of complexity to parsing
-- *   Readers need to know the `doc_position` mode for the file
struct Example:
  ...

-- Doc block position is determined by indentation.
struct Example:
  -- Unfortunately, the first doc at a particular indentation is ambiguous.
  0 [+1]  UInt  field
  ...

/// Pre-docs use a different sigil than post-docs.
/// *   Unambiguous
/// *   Adds Yet Another thing for readers to know and pay attention to
/// *   Not many good sigils are available -- maybe Javadoc triple-`/`
struct Example:
  ...

Of those, I think the last one (using a different sigil) comes closest to viable, but I'm reluctant to allow multiple semantically-identical forms. If you have other ideas, I'm happy to hear them.

fsareshwala commented 1 year ago

If we allow documentation blocks to be on both top and bottom of entities (even with different monikers), I think it will be even more confusing than just having documentation blocks on the bottom. There should be one place rather than two. I know setting documentation blocks universally to the top, before the entity they describe, seems backwards incompatible. However, I think we need to view this issue as a bug rather than a feature enhancement. As a bug, we don't need to worry about backwards incompatibility as much.

More practically, I would argue that it's okay to change this in order to prevent further confusion. Since no actual user logic will change (just documentation), it won't lead to any code breakages but may be a slight surprise. For that, we can make a bold notice in the release notes? Basically, I'm advocating that we rip the bandage off early, now, rather than let this continue to cause confusion.

Hope that all makes sense. Let me know what you think. Happy to discuss more on it in real time if that is easier :). Also happy to work on this bug too in order to increase my knowledge of the Emboss source code :D.

reventlov commented 1 year ago

In that case, my answer is no.

Any change, whether it's labelled "bug" or "feature request," needs to not break extant .emb files unless absolutely necessary. This is the policy that I have maintained, and I have explicitly promised it to multiple teams. It is about three years too late for any more "rip off the bandage" kinds of changes.

This change would break pretty much every single extant .emb file, for essentially aesthetic reasons. Doc comments are part of the Emboss syntax, and there are a bunch of positions that would no longer be valid (e.g., at the end of a file or between the first line of an entity and any [attributes]) -- and that's even if we loosened the indentation requirements.

fsareshwala commented 1 year ago

Ah, I see. I wasn't aware that there were such tight restrictions already. In that case, definitely we should go with a less nuclear option. Perhaps we use your last suggestion where we use a different moniker/sigil for predocs. The only concern I have is that we already have # for regular comments and -- for doc comments. As you mention, adding more symbols means more that readers have to be aware of.

Thinking about this a bit, I think a reasonable path forward that will maintain backwards compatibility as well as unify the commenting environment in Emboss is:

Introduce /// for doc comments
Introduce // for regular comments
Deprecate -- for doc comments
Deprecate # for regular comments

I suggest we introduce // for regular comments since it plays really nicely with the /// for doc comments. Just one extra slash and a regular comment becomes a doc comment, should users want to make such a switch. Then, with the deprecation of -- and #, we can suggest users to use /// and // for all new comments while old .emb files continue to work properly. We can mark that those commenting sigils are deprecated and new code shouldn't use them (in the documentation). At some point in the future also, we can also even drop mention of # and -- from the documentation so that the support is purely for compatibility purposes.

Would this path forward be workable? Alternatively, we can simply add the new /// sigil for predocs and ask all new predocs to be written using ///. The -- sigil will be kept for compatibility reasons for old .emb files. Going this route leaves an awkward # for regular docs and /// for predocs. While it would work, I think it would be cleaner to use // for regular docs and /// for predocs.

google / emboss

Documentation blocks should be placed on top of the entity they describe #63