Format the value after # =>

wooster0 commented 5 years ago

I think the value after # => should be formatted by the Crystal formatter. Currently there are a lot of cases in for example the Array documentation where the value after # => doesn't match Crystal's formatting rules at all. Example: Array#permutations:

Returns an Array with all possible permutations of size.

a = [1, 2, 3]
a.permutations    # => [[1,2,3],[1,3,2],[2,1,3],[2,3,1],[3,1,2],[3,2,1]]
a.permutations(1) # => [[1],[2],[3]]
a.permutations(2) # => [[1,2],[1,3],[2,1],[2,3],[3,1],[3,2]]
a.permutations(3) # => [[1,2,3],[1,3,2],[2,1,3],[2,3,1],[3,1,2],[3,2,1]]
a.permutations(0) # => [[]]
a.permutations(4) # => []

Crystal would format e.g. [[1],[2],[3]] to [[1], [2], [3]] if it were code.

straight-shoota commented 5 years ago

What comes after => should always be a valid expression, so we can definitely format it. If it's not an expression, don't use =>. This also makes it verifiable by https://github.com/maiha/crystal-examples

I'm not sure though if we should impose to strict rules as to fail with a syntax error when it doesn't parse. It's a comment after all...

wooster0 commented 5 years ago

No, I don't think there should be errors raised for comments. I think in the comment land you should feel safe and free of errors. The comments that don't parse should be ignored in my opinion.

straight-shoota commented 4 years ago

Now that an implementation has been proposed, I reckon we should continue discussing the strictness.

I prefer it to be strongly formalized and error when # => is not followed by a valid expression. The main benefit of this is that it helps catch errors. If there's a typo that results in valid code but not properly formatted, the formatter fixes the formatting. If the typo leads to invalid code however, it should even be more important to point that out.

While being very similar in many cases regarding primitive values, there is however a discrepancy between "valid code" and "output of p! (i.e. #inspect)". As evidenced in the example in the OP, p! [] of Int23 returns [] which itself is not a valid expression. We need to clarify what => is supposed to represent with respect to the expression immediately before it: Does it literally mean the result of #inspect? Or is it supposed to be an unambiguous and re-usable representation of the value? Or as an alternative route, should the result of #inspect fit the latter specification? I'm not sure what's best here. It would probably help to discuss this based on relevant examples. So I'd suggest to merge #8851 as it stands, ignoring syntax errors. After that we can adjust the implementation to fail on syntax errors and consider the affected examples.

I think in the comment land you should feel safe and free of errors.

I agree with the general idea behind this, but I'd argue that a comment delimited by # => should be considered a formalized construct. It's not a regular comment where you can write whatever you please. The formatter would never touch such a free form comment. But # => comments are already changed if they happen to be followed by a valid expression. If comment land was to be "safe and free of errors", that should already be considered a violation.

If you want your comments to be safe and free of errors, the solution is simply to not start it with # => because that has a special meaning and would already break your safe and free formatting.

RX14 commented 4 years ago

What comes after => should always be a valid expression

why should it? .inspect values don't have to be valid expressions. And to me, that's what # => means: this is the result of .inspecting the LHS.

watzon commented 3 years ago

I agree with @straight-shoota here, though I feel like it would probably be better to have a warning than an outright compilation error. Though I'd also be all for extending this concept to doc comments and allowing code blocks to be validated (provided they aren't explicitly in another language). This is something I've seen used in a couple languages and I really tend to like it.

straight-shoota commented 3 years ago

I've used the code from #8851 to print all offending instances from stdlib.

https://gist.github.com/straight-shoota/5588b2a1652382423e18dcf0c944d66e

It's really not that much (about 200 with a lot of similar ones). Some cases look like they might really be wrong (or sub-optimal) representations.

In this example the value is actually supposed to be a string but it's not placed in delimiters:

https://github.com/crystal-lang/crystal/blob/cef547b55c6568521e3b4e011b317bc2ef9486fb/src/http/client.cr#L28

There are also cases where the comment includes additional annotations:

https://github.com/crystal-lang/crystal/blob/cef547b55c6568521e3b4e011b317bc2ef9486fb/src/enumerable.cr#L612

I think this can be solved relatively easily by using an inside comment:

# ["Alice", "Bob"].index { |name| name.size < 4 } # => 1 # Bob's index)

This is a similar case, but here we could probably just use the actual encoded value:

https://github.com/crystal-lang/crystal/blob/cef547b55c6568521e3b4e011b317bc2ef9486fb/src/colorize.cr#L36

 # "foo".colorize(Colorize::ColorRGB.new(0, 255, 255)) # => "\e[38;2;0;255;255mfoo\e[0m"

I think the comments can be improved and IMO enforcing a strict formatting helps doing that, because it points out potential issues.

Obviously, not all values can be turned in an equivalent expression. Regex::MatchData for example doesn't even expose a public constructor. Instances are only created as a result of applying a regular expression. So it wouldn't make sense to use a faux constructor syntax that is formally valid but not semantically.

There are two options:

formatter errors are allowed to accomodate for non-expressable values
# => can't be used for non-expressable values (instead, we could use plain comments or a similar syntax, for example # >>)

Both options are valid. It would be nice to have a common syntax, but it would also be nice to have reliable formal validation. We can't have both.

Such examples are mostly meant for humans to illustrate the usage of the API. For that purpose there's no strict formal requirement. But it helps to use the same syntax and semantics everywhere. Additionally, these examples also serve as a kind of test system. They can be seen as documentation tests. And at the very least, a strict format helps to discover errors in the documentation itself.

Besides syntactical validation, there's also semantical validation provided by https://github.com/maiha/crystal-examples/ which compares the commented expression to the actual value of the code. And with that in mind, it might be acceptable to ignore formatter failures on the basis that if it doesn't format, it at least can be verified to match the inspect output.

crystal-lang / crystal

Format the value after # => #7272