domluna / JuliaFormatter.jl

An opinionated code formatter for Julia. Plot twist - the opinion is your own.
https://domluna.github.io/JuliaFormatter.jl/dev/
MIT License
576 stars 68 forks source link

Full YASGuide implementation #198

Open ararslan opened 4 years ago

ararslan commented 4 years ago

Here are some (hopefully complete) to-dos to make the experimental YASGuide implementation complete, for whomever wishes to work on it. This is the style guide my company uses, so we'd likely adopt JuliaFormatter if it supported it. This guide has a significant amount of overlap with BlueStyle, so adding the functionality outlined below would also increase the ability to support other popular guides. (Indeed, all but a couple of these apply to both YAS and Blue.)

domluna commented 4 years ago

@ararslan which of these is preferable

  1. fit as much on the same line
comp = [a * b + c for a = 1:10, # comment
        b = 11:20, c = 300:400]
  1. break all lines if there if a comment
comp = [a * b + c for
        a = 1:10, # comment
        b = 11:20,
        c = 300:400]
ararslan commented 4 years ago

That's a good question, I don't think I've ever seen a comment in the middle of a comprehension. The YASGuide itself definitely doesn't have anything specific to say on this. I'd probably be inclined to prefer the latter just because I would find it visually odd to separate loops that way, but also having the dangling for on the first line feels weird. @jrevels, thoughts?

I guess a third option is

comp = [a * b + c
        for a in 1:10,  # comment
            b in 11:20,
            c in 300:400]

but that indentation feels unsatisfyingly arbitrary.

jrevels commented 4 years ago

Yeah, I don't think the YASGuide mandates anything here, but in my own code, I think I tend to prefer @ararslan's third option.

For complicated multiline comprehension bodies, I'll sometimes use begin...end blocks, e.g.:

comp = [begin
            x = a * b + c
            y = x^2 + 3x # comment 1
        end
        for a in 1:10,  # comment 2
            b in 11:20,
            c in 300:400]

I haven't thought too hard about this preference though 😁

ararslan commented 4 years ago

It's no longer possible to use begin/end in comprehensions as of 1.4, since x[begin] is a thing (mirrors x[end]). You can use let/end I think though.

StefanKarpinski commented 4 years ago

It's allowed in untyped comprehensions like this one, but not in typed comprehensions.

ararslan commented 4 years ago

Ah, gotcha.

domluna commented 4 years ago

I would find it visually odd to separate loops that way, but also having the dangling for on the first line feels weird

should this be formatted differently then?

https://github.com/beacon-biosignals/Onda.jl/blob/master/examples/tour.jl#L61-L62

ararslan commented 4 years ago

I'd say so, good catch :smile:

domluna commented 4 years ago

@ararslan

Rewrite function definitions inside of other function definitions to be lambdas

can you give an example?

domluna commented 4 years ago

Rewrite import X to using X

@StefanKarpinski are there a cases where import would be preferred instead of using?

StefanKarpinski commented 4 years ago

If you want to add methods to the binding without qualification then import is required. So, suppose you have this:

module A
    export f
    function f end
end

Then this doesn't work:

julia> module B
           using ..A: f
           f() = 1
       end
ERROR: error in method definition: function A.f must be explicitly imported to be extended
Stacktrace:
 [1] top-level scope at none:0
 [2] top-level scope at REPL[2]:3

Whereas this does:

julia> module B
           import ..A: f
           f() = 1
       end
Main.B

I don't particularly like this distinction and argued against it pre-1.0, but there you have it. If, however, you use qualified names to extend imported functions then you can always use using:

julia> module B
           using ..A: A, f
           A.f() = 1
       end
WARNING: replacing module B.
Main.B

Note, however, that if you're explicitly listing the names to include, you have to list A itself or you won't be able to write A.f inside of B. So this whole thing is a bit of a mess.

domluna commented 4 years ago

hmmm ok, I was wondering whether I should make that conversion the default but it looks like it could break code in some cases

StefanKarpinski commented 4 years ago

There's definitely a transformation that could be done, but it's more annoying and non-local than one would want. The normalization that would be safe is this:

So yeah, that's a very annoying and fussy normalization process just to unify on using. It's simpler if you go the other direction and transform using into import:

I think that should never break code. And of course, you'd want to combine import statements that come from the same module. I do also think that it's good style to fully qualify extensions to imported methods just to be 100% clear that's what you're doing.

ararslan commented 4 years ago

hmmm ok, I was wondering whether I should make that conversion the default but it looks like it could break code in some cases

For the purposes of YASGuide compliance, import A could become using A: A, which is what we've been doing when we need that particular behavior. So that would be the most conservative change the formatter could make for compliance.

ararslan commented 4 years ago

Rewrite function definitions inside of other function definitions to be lambdas

can you give an example?

function f(x)
    function g(y)
        # do whatever
        return thing
    end
    return something
end

should instead be

function f(x)
    g = y -> begin
        # do whatever
        return thing
    end
    return something
end
jrevels commented 4 years ago

I do also think that it's good style to fully qualify extensions to imported methods just to be 100% clear that's what you're doing.

Yup, this is a requirement of the YASGuide anyway:

When overloading a function from another module, the function name should be qualified with its module (e.g. imported_function(...) = ... is bad, ParentModule.imported_function(...) = ... is good).

domluna commented 4 years ago

sample formatting so far https://github.com/beacon-biosignals/Onda.jl/compare/master...domluna:test-yasfmt

julia> using JuliaFormatter

julia> format(".",
           style = YASStyle(),
           always_for_in = true,
           whitespace_ops_in_indices = true,
           whitespace_typedefs = false,
           remove_extra_newlines = true,
           import_to_using = true,
           pipe_to_function_call = true,
           short_to_long_function_def = true,
           margin=92
       )
StefanKarpinski commented 4 years ago

Requiring

function f(x)
    g = y -> begin
        # do whatever
        return thing
    end
    return something
end

prevents defining inner functions that have multiple methods. Seems like an overreaction to the fact that putting method definitions in the branches of a conditional doesn't do what one expects.

jrevels commented 4 years ago

prevents defining inner functions that have multiple methods. Seems like an overreaction to the fact that putting method definitions in the branches of a conditional doesn't do what one expects.

Yeah, Eric Davies and I had a good chat about this rule in Slack a while back; looks like history has been erased by now though 🤦‍♂should've saved it.

To provide a concrete example in the vein of what you said, the rule is there in an attempt to make it less likely for people to confuse themselves by doing stuff like

julia> f(::Int) = 1
f (generic function with 1 method)

julia> function g(x::T) where {T}
           f(::T) = 2
           f(x)
       end
g (generic function with 1 method)

julia> g(1.0)
2

julia> f(1.0)
ERROR: MethodError: no method matching f(::Float64)
Closest candidates are:
  f(::Int64) at REPL[1]:1
Stacktrace:
 [1] top-level scope at REPL[4]:1

...where a lot of newcomers will get confused/frustrated by the fact that there wasn't some persistent top-level method overload. Not that that's a reasonable expectation on their part, just a common mistake I've seen folks trip up on.

IIRC the conclusion of the chat with @iamed2 was that inner method overloads were useful enough as a feature that this rule was deemed overly defensive and not worthwhile for BlueStyle. In practice, I think you can often (but maybe not always? haven't thought about it) refactor actual usages of this feature to instead use explicitly callable types instead of inner methods. My preference for that is much more subjective and less substantiated though. This isn't a super critical rule for me, and I could be convinced to change it; for example, if we ever went through the massive bikeshed required to merge YASGuide and BlueStyle, this would be a rule I'd easily give up for the sake of compromise.

ararslan commented 4 years ago

sample formatting so far beacon-biosignals/Onda.jl@master...domluna:test-yasfmt

This actually introduces a few guide violations:

  1. Should have a return: https://github.com/beacon-biosignals/Onda.jl/compare/master...domluna:test-yasfmt#diff-26bf7f3fc584320d0e91db14601713bfR42 (few other places as well)
  2. Space around = here was added, should have been untouched: https://github.com/beacon-biosignals/Onda.jl/compare/master...domluna:test-yasfmt#diff-26bf7f3fc584320d0e91db14601713bfR48

Couple of "not sures", would be good to have @jrevels chime in on them:

domluna commented 4 years ago

Seems like this probably should have stayed on the same line with the if wrapped instead of the for? beacon-biosignals/Onda.jl@master...domluna:test-yasfmtdiff-297becf209b58cadeab45c3274a77911R163

bug

Space around = here was added, should have been untouched: beacon-biosignals/Onda.jl@master...domluna:test-yasfmtdiff-26bf7f3fc584320d0e91db14601713bfR48

bug

Should have a return: beacon-biosignals/Onda.jl@master...domluna:test-yasfmtdiff-26bf7f3fc584320d0e91db14601713bfR42 (few other places as well)

not implemented yet

ararslan commented 4 years ago

Thanks for confirming :slightly_smiling_face:

domluna commented 4 years ago

@ararslan

For the following

     function foo()
           if true
               10
           else
               20
           end
       end

would you want return inserted as well?

     function foo()
           return if true
               10
           else
               20
           end
       end
ararslan commented 4 years ago

Yep 👍

domluna commented 4 years ago

Made more improvements:

https://github.com/jrevels/Cassette.jl/compare/master...domluna:yasfmt https://github.com/beacon-biosignals/Onda.jl/compare/master...domluna:test-yasfmt

domluna commented 4 years ago

I'm going to clean up the draft branch and merge it in since there's non YAS specific improvements in there as well.

Any other additions / fixes will follow. Also, I'm going to hold off on the last 2 items on this list for now - there's a couple other items I'd like to investigate in the meantime.

domluna commented 4 years ago

This is part of v0.4 now

ericphanson commented 4 years ago

I just had a go at running JuliaFormatter on a small non-open source YASStyle repo and it worked pretty well, as vetted by @ararslan! Just ran into two things, first I think the method

function p_kw(style::YASStyle, cst::CSTParser.EXPR, s::State)
    t = FST(cst, 0)
    for a in cst
        add_node!(t, pretty(style, a, s), s, join_lines = true)
    end
    return t
end

taken from the example https://domluna.github.io/JuliaFormatter.jl/dev/custom_styles/ should be added for YASStyle, since not having space around the = in keyword arguments is part of the style (point 2.2 of https://github.com/jrevels/YASGuide#linealignmentspacing-guidelines).

The second is that

function f()
    for long_variable_name in ("Fusce eu justo at purus finibus sagittis.",
                               "Nulla egestas magna vitae lacus.",
                               "Aenean finibus nisl at magna feugiat finibus.",
                               "Etiam sodales ligula a hendrerit efficitur.",
                               "Vestibulum sed lorem vel massa consequat.",
                               "Nulla ut turpis pretium, sollicitudin.")
        println(long_variable_name)
    end
    return 1
end

is YAS-compliant but gets undesirably formatted into

function f()
    for long_variable_name in
        ("Fusce eu justo at purus finibus sagittis.", "Nulla egestas magna vitae lacus.",
         "Aenean finibus nisl at magna feugiat finibus.",
         "Etiam sodales ligula a hendrerit efficitur.",
         "Vestibulum sed lorem vel massa consequat.",
         "Nulla ut turpis pretium, sollicitudin.")
        println(long_variable_name)
    end
    return 1
end

I'm not really sure what rule could be made but it's better to have the list start on the first line rather than have 2 items on the second line.

StefanKarpinski commented 4 years ago

Not sure about YAS, but it seems to me that the preferred formatting of the latter example would be:

function f()
    for long_variable_name in (
            "Fusce eu justo at purus finibus sagittis.",
            "Nulla egestas magna vitae lacus.",
            "Aenean finibus nisl at magna feugiat finibus.",
            "Etiam sodales ligula a hendrerit efficitur.",
            "Vestibulum sed lorem vel massa consequat.",
            "Nulla ut turpis pretium, sollicitudin.",
        )
        println(long_variable_name)
    end
    return 1
end
domluna commented 4 years ago

@ericphanson

for the 1st point you can set whitespace_in_kwargs to false (defaults to true). Updated the docs to show this, it was only shown in the docstring of YASStyle before.

For the 2nd point it's nesting/breaking on in. If this is undesirable for YAS, which it might be due to how the rest of YAS is formatted, the nesting rules can changed.

ericphanson commented 4 years ago

Ah, thanks for updating the docs. I actually didn’t realize that those options weren’t set automatically by choosing style=YASStyle(). Maybe that can be a feature request? (Keyword argument defaults chosen by the style).

For the second point, I don’t know if YAS is specific enough about exactly how nesting should work but based on the example, maybe the rule could be the list should start on the first line unless the length exceeds 92 chars? Or if each entry (any entry?) is longer than X amount they should each be on their own lines?

@StefanKarpinski I like that style too since it’s easy to add/remove items without touching other lines. I don’t think all users of the YASGuide necessarily agree (I think @ararslan doesn’t prefer it?) but I also don’t see anything in the guide itself one way or another about it, besides that the arguments should line up, as they do in your example.

domluna commented 4 years ago

Maybe that can be a feature request? (Keyword argument defaults chosen by the style).

this was originally how it was but then there were requests to have spaces between keyword arguments but keep yas style indentation so it was split out into an option.

What a style has governance over has changed over time but at the moment the styles handle indentation and nesting (line breaking). Of course there's nothing stopping a style from handling more than indentation and nesting but given people have different preferences (even for the small things) decoupling these things as much as possible is the way to go IMO.

ericphanson commented 4 years ago

there were requests to have spaces between keyword arguments but keep yas style indentation so it was split out into an option.

Couldn’t that happen by the style changing what the default choices are but not overriding the options altogether? Eg that could be enabled by using YASStyle + a keyword argument.

Definitely makes sense to me to have options but it also seems like it would be good if choosing style=YASStyle() was enough to have a YAS-complaint formatting and then any deviations from that could be opt-in by keyword arguments.

Not a big deal either way though!

domluna commented 4 years ago

The options part of the style I suppose, then the defaults could be different for each style

jrevels commented 4 years ago

@StefanKarpinski I like that style too since it’s easy to add/remove items without touching other lines. I don’t think all users of the YASGuide necessarily agree (I think @ararslan doesn’t prefer it?) but I also don’t see anything in the guide itself one way or another about it, besides that the arguments should line up, as they do in your example.

@ericphanson it's technically disallowed by the current rule, ref https://github.com/jrevels/YASGuide/issues/16 for discussion/bikeshed on that one

ericphanson commented 4 years ago

Ah, thanks for the clarification.

ericphanson commented 4 years ago

With style=YASStyle() and the recommended keyword arguments for YAS compliance, I find

open("table.txt", "w") do io
    println(io, table)
end

gets reformatted into

open("table.txt", "w") do io
    return println(io, table)
end

which doesn't seem right to me. I could put an explicit return nothing instead, but I wonder if do-blocks should be treated as short-form functions? Maybe needs a YASGuide ruling (cc @jrevels)

ararslan commented 4 years ago

AFAIK the transformation there is correct per the YASGuide.

domluna commented 4 years ago

the return thing doesn't always work as one might like on some edge cases right now though https://github.com/domluna/JuliaFormatter.jl/issues/233

nickrobinson251 commented 3 years ago

Require ; to separate positional and keyword arguments

FYI BlueStyle does this... so i imagine it'd be fairly easy to have YASStyle do it too :) (was added by Dom in https://github.com/domluna/JuliaFormatter.jl/pull/295, and used in p_call)

e.g. (using JuliaFormatter v0.10.9)

julia> str = "foo(1, 2, a=3)";

julia> println(format_text(str, YASStyle()))
foo(1, 2, a=3)

julia> println(format_text(str, BlueStyle()))
foo(1, 2; a=3)