rzk-lang / sHoTT

Formalisations for simplicial HoTT and synthetic ∞-categories.
https://rzk-lang.github.io/sHoTT/
44 stars 12 forks source link

Finish draft of the style guide #94

Closed fredrik-bakke closed 12 months ago

fredrik-bakke commented 1 year ago

Hey everyone!

First, let me say that I'm very excited to see this project picking up so much steam.

I'm posting this pull request to declare my intent to finish the draft of the style guide, but it also seems appropriate to open up for discussion given all the new contributors.

Why a style guide matters

I've noticed somewhat of a reversal in trend when it comes to style practices in this repository. This is very understandable given the state of the style guide and my general absence as the main driving force behind that style guide. However, I would like to see this project succeed in becoming a large and useful resource for learning and doing simplicial homotopy type theory, and I believe adherence to a good style guide is important in that endeavor.

For instance, as the library scales, it is important to keep navigability and maintainability in mind. Having a good and somewhat rigid naming convention helps in multiple regards

  1. The name will serve as a short and efficient summary of the entry.
  2. It will be easier to find definitions.
  3. It will be possible to refer to definitions by guessing their names instead of looking them up every time.
  4. The name will help elucidate the argument that is made when appearing in a proof, and do so efficiently.
  5. You won't have to go back and change the name (as often) in the future.

Objectives

My goal with this PR is to get the style guide to a point where

Especially, regarding this last point, I would like to see the community take complete ownership of its conventions. I would personally love to also make some mathematical contributions to this library, and not be stuck perpetually refactoring and consulting on coding style. Of course, I am always open to taking questions, but I'm hoping we can all come to a more common understanding of the library's style principles.

Again, if you have suggestions, disagree with some of the principles or want to change them in some way, or have any questions, I hope we can open a discussion on that here.

Some references

For new contributors joining in the discussion, let me link to some of our previous discussions developing the style guide:

I can also link to the current style guide of agda-unimath, from which the current style principles take most/all of their inspiration:

TashiWalde commented 1 year ago

I hope this is the right place to contribute my two cents on the style guide as I understand it.

1)

I have no strong opinions on the naming convention for terms and types. The current conventions seem very reasonable and I expect to have no big trouble following them after getting used to them some more.

Maybe only one comment (weakly felt): As I understand it, the naming convention heavily emphasizes the input and output of a construction/theorem. While this is appropriate in many cases, I think there are also situations where the process is the most important aspect that deserves to be emphasized. Or sometimes this process has a more recognizable/standard name that goes beyond the input and output types. For an extreme example, the canonical function (A -> (B -> C))->(A times B -> C) might (according to a strict reading of the convention) be called something like map-from-product-map-to-maps, while I think it should just be called uncurry.

2)

I feel very positively about the conventions for dealing with parentheses and indentations. I had never seen this particular style before, but after seeing it it just makes so much sense. A very pleasant discovery!

There is, however one small but significant change for which I would like to strongly advocate: Put binary operators at the beginning of the line rather than the end; especially if they consist of a single character.

Consider the following two snippets:

( moderately-long-name-of-a-function-A
  ( a-very-complicated-term-with-a-very-long-name-or-description-B ,
    another-long-and-complicated-expression-for-another-term-C))
( moderately-long-name-of-a-function-A
  ( a-very-complicated-term-with-a-very-long-name-or-description-B
  , another-long-and-complicated-expression-for-another-term-C))

Our indentation convention has the beautiful feature that one can determine the structure of the code simply by scanning the left edge of the code block. The second snippet above complements this neatly, since I can immediately see that function-A is applied to the pair (term-B, term-C) by scanning the left edge of the code. In contrast, note how in the first snippet my eye has to move to the very end of the line to see the comma; this means that I might easily misread this code as (function-A (term-B term-C)), completely changing its meaning! This is hightened by the fact that the comma "," is such a small symbol that can easily blend in and be overlooked at the end of a long expression.

Similarly, consider the difference between

( ( a-long-and-complicated-arithmetic-expression-involving-many-operations) *
  ( another-complicated-and-long-expression-with-symbols-and-numbers))

and

( ( a-long-and-complicated-arithmetic-expression-involving-many-operations)
* ( another-complicated-and-long-expression-with-symbols-and-numbers))

In the second case I can immediately see that I am multiplying (as opposed to, for example, adding) the two long expressions together. In the first one my eye has to travel all the way to the end of the line to know this. Over the course of a long code block, this means that my eye has to constantly travel left to right instead of just being able to scan the left edge of the code.

On a related note, I would advocate (but much less strongly) that we should declare function terms as

#def function
  ( parameter-1 : type-1)
  ( parameter-2 : type-2)
  : type-with-a-name-3
  → type-with-a-longer-name-4
  → short-type-5
  := undefined

rather than

#def function
  ( parameter-1 : type-1)
  ( parameter-2 : type-2)
  : type-with-a-name-3 →
    type-of-with-a-longer-name-4 →
    short-type-5
  := undefined

In this case there is not really any ambiguity, since is really the only operator that could possibly appear here. So it is mainly a cosmetic preference. I just think that having all the symbols (, :, and := be lined up on the left looks nicer than having the "dangle" at the end of the line in different positions.

fizruk commented 1 year ago

@TashiWalde Thanks a lot for your input on this!

I think your point 1 is addressed by the following point in the current styleguide:

  • Use meaningful names that accurately represent the concepts applied. For example, if a concept is known best by a special name, that name should probably be used.

Perhaps, we should provide explicit examples to make this point clearer.

Regarding point 2, personally, I completely agree and, in fact, this is the style that I'm used to in Haskell. But let's see what others have to say about this :)

fredrik-bakke commented 1 year ago

Thank you for the comment, @TashiWalde!

Nikolai is correct that the cited convention is there for instances like this. I'll make the style guide more clear on this point. Indeed, the naming conventions (and the guide as a whole) deserve much more motivation and detail, and the current phrasing is not too good. In fact, I'd like to move it closer to the referenced current proposal at agda-unimath.

Regarding your second point, I'm glad you like the conventions, and I also quite like your proposal. The one special case I can point out is when the infix binary operator/separator is multiple characters long and appears as the first token on a new line. In this case, a new line should be inserted right after the operator/separator as otherwise the code will not align well and excessive indentation may be introduced. We already use this convention for the walrus separator (:=), and it should also be used for identity type formation when the underlying type is passed explicitly (=_{...}).

TashiWalde commented 1 year ago

I think your point 1 is addressed by the following point in the current styleguide:

  • Use meaningful names that accurately represent the concepts applied. For example, if a concept is known best by a special name, that name should probably be used.

Nikolai is correct that the cited convention is there for instances like this. I'll make the style guide more clear on this point. Indeed, the naming conventions (and the guide as a whole) deserve much more motivation and detail, and the current phrasing is not too good. In fact, I'd like to move it closer to the referenced current proposal at agda-unimath.

Thanks to both of you for clarifying that point. I hadn't quite grasped the full significance of that bullet point.

Regarding your second point, I'm glad you like the conventions, and I also quite like your proposal. The one special case I can point out is when the infix binary operator/separator is multiple characters long and appears as the first token on a new line. In this case, a new line should be inserted right after the operator/separator as otherwise the code will not align well and excessive indentation may be introduced. We already use this convention for the walrus separator (:=), and it should also be used for identity type formation when the underlying type is passed explicitly (=_{...}).

Do I understand correctly that you mean something like this?

( ( a-term-of-a-type)
=_{ the-name-of-the-type}
  ( another-term-of-that-type))

If so, then I totally agree with this convention.

emilyriehl commented 1 year ago

Thanks all for this productive discussion. I strongly support @fredrik-bakke's desire to be released from the thankless task of telling us what everything should be called.

I like @TashiWalde's suggestion of moving characters like "," "=" and "->" to the start of the line rather than the end as they too convey important information about the structure of a proof. When having a character followed by a space followed by a term name fits with our tab alignment then the term name can be given on the same line. Otherwise, it should appear on the line below at the correct indentation.

@jonweinb or others do you want to weigh in as well?

emilyriehl commented 1 year ago

I have an additional request when amending the style guide.

In much of the library we write

#def theorem
  (variable : Type)
  : theorem-statement
  :=
     proof-uses-a-function
     (  first argument)
     (  second argument)
     (  etc)

but in the sample code in the style guide we have the function arguments indented one space further, so that the opening parentheses are indented from the function name.

#def theorem
  (variable : Type)
  : theorem-statement
  :=
     proof-uses-a-function
        (  first argument)
        (  second argument)
        (  etc)

I prefer to have the parentheses aligned with the function name but the arguments indented because this saves space. Could we alter these samples to reflect this?

fredrik-bakke commented 1 year ago

Do I understand correctly that you mean something like this?

( ( a-term-of-a-type)
=_{ the-name-of-the-type}
  ( another-term-of-that-type))

Yes, something like that. Actually, that turned out nicer than I was expecting! 😄

fredrik-bakke commented 1 year ago

I strongly support @fredrik-bakke's desire to be released from the thankless task of telling us what everything should be called.

Thank you for your understanding!

I have an additional request when amending the style guide.

Your suggested convention seems reasonable to me. As an extension to this change, we can also remove the convention regarding indenting code right after :=. This would save even more space without sacrificing readability.

With all the discussed changes, my understanding is that the new example in the style guide should be:

#def is-segal-is-local-horn-inclusion
  ( A : U)
  ( is-local-horn-inclusion-A : is-local-horn-inclusion A)
  : is-segal A
  :=
  \ x y z f g →
  contractible-fibers-is-equiv-projection
  ( Λ → A)
  ( \ k →
    Σ ( h : hom A (k (0₂ , 0₂)) (k (1₂ , 1₂)))
    , ( hom2 A
        ( k (0₂ , 0₂)) (k (1₂ , 0₂)) (k (1₂ , 1₂))
        ( \ t → k (t , 0₂))
        ( \ t → k (1₂ , t))
        ( h)))
  ( second
    ( equiv-comp
      ( Σ ( k : Λ → A)
        , Σ ( h : hom A (k (0₂ , 0₂)) (k (1₂ , 1₂)))
          , ( hom2 A
              ( k (0₂ , 0₂)) (k (1₂ , 0₂)) (k (1₂ , 1₂))
              ( \ t → k (t , 0₂))
              ( \ t → k (1₂ , t))
              ( h)))
      ( Δ² → A)
      ( Λ  → A)
      ( inv-equiv
        ( Δ² → A)
        ( Σ ( k : Λ → A)
          , Σ ( h : hom A (k (0₂ , 0₂)) (k (1₂ , 1₂)))
            , ( hom2 A
                ( k (0₂ , 0₂)) (k (1₂ , 0₂)) (k (1₂ , 1₂))
                ( \ t → k (t , 0₂))
                ( \ t → k (1₂ , t))
                ( h)))
        ( equiv-horn-restriction A))
      ( horn-restriction A , is-local-horn-inclusion-A)))
  ( horn A x y z f g)
emilyriehl commented 1 year ago

This seems good to me.

TashiWalde commented 1 year ago

If we are omitting the extra indentation after the walrus, one could also consider the variant where the := is all the way to the left, so that the signature and the body are visually more separated. Conceptually, this would correspond to interpreting #def _ := _ as a sort of 2-ary mixfix operator. I don't feel strongly either way.

#def is-segal-is-local-horn-inclusion
  ( A : U)
  ( is-local-horn-inclusion-A : is-local-horn-inclusion A)
  : is-segal A
:=
  \ x y z f g →
  contractible-fibers-is-equiv-projection
  ...
fizruk commented 1 year ago

If we are omitting the extra indentation after the walrus, one could also consider the variant where the := is all the way to the left ...

This does not work in the parser at the moment. The expectation is that any command #command (e.g. #define, #compute, #variables) has its content on the same line or indented (by at least one space).

fredrik-bakke commented 1 year ago

Alright, I've made a few additions to the style guide now. What is missing is a general formatting convention for extension types, and I'm not particularly happy with the naming convention section yet.

fredrik-bakke commented 1 year ago

I hope you don't mind, @TashiWalde. I paraphrased and used some of your comments in the style guide. I liked your phrasing a lot. :)

If we are omitting the extra indentation after the walrus, one could also consider the variant where the := is all the way to the left, so that the signature and the body are visually more separated. Conceptually, this would correspond to interpreting #def _ := _ as a sort of 2-ary mixfix operator. I don't feel strongly either way.

#def is-segal-is-local-horn-inclusion
  ( A : U)
  ( is-local-horn-inclusion-A : is-local-horn-inclusion A)
  : is-segal A
:=
  \ x y z f g →
  contractible-fibers-is-equiv-projection
  ...

Another solution could be to insert a double new line after the walrus separator for large definitions, for some definition of "large".

fizruk commented 1 year ago

Another solution could be to insert a double new line after the walrus separator for large definitions, for some definition of "large".

It is also possible to adjust rendering of the documentation (in mkdocs-plugin-rzk) to hide large definitions by default (using admonitions), e.g. like below.

Hidden:

Screenshot 2023-10-10 at 20 44 13

Revealed:

Screenshot 2023-10-10 at 20 44 27

Admonition used as follows (for implementation reference):

```rzk
#def is-equiv-unit-transposition uses (is-segal-A is-segal-B funext)
  : is-equiv
    ( nat-trans A (\ _ → A) (identity A) (comp A B A u f))
    ( (a : A) → (b : B) → (hom B (f a) b) → (hom A a (u b)))
    ( \ η a b k →
      comp-is-segal A is-segal-A a (u (f a)) (u b)
      ( \ t -> η t a)
      ( ap-hom B A u (f a) b k))

??? abstract "Definition of is-equiv-unit-transposition"

```rzk
:= ...
```


Note that this would only be applied when rendering, the sources would remain unchanged.
fredrik-bakke commented 1 year ago

Do we want to hide our proofs though? I know 1Lab does this, and I find it a little annoying to browse their website because of it.

fredrik-bakke commented 1 year ago

I think if a proof is so long and ugly that we want to hide it, then that suggests the proof should be refactored into smaller chunks instead.

aabounegm commented 1 year ago

I think if a proof is so long and ugly that we want to hide it, then that suggests the proof should be refactored into smaller chunks instead.

I generally agree, but how about cases when the author thinks the implementation of a particular proof is not as important as its type signature and prefers easier navigation? Perhaps we can have it as an opt-in option per codeblock rather than project-wide.

jonweinb commented 1 year ago

I think if a proof is so long and ugly that we want to hide it, then that suggests the proof should be refactored into smaller chunks instead.

I generally agree, but how about cases when the author thinks the implementation of a particular proof is not as important as its type signature and prefers easier navigation? Perhaps we can have it as an opt-in option per codeblock rather than project-wide.

I think that would be great!

TashiWalde commented 1 year ago

I thought I had seen this endorsed somewhere, but I can't find it there right now. We should add to the styleguide that it is permissible/encouraged to bundle multiple (short) arguments in a single line. So for example, I should be able (and be encouraged) to write

( function-with-a-name A B C D
  ( term-with-a-longer-name)
  ( f x) (f y)
  ( X) (Y) (Z) (W))

rather than

( function-with-a-name
  ( A)
  ( B)
  ( C)
  ( D)
  ( term-with-a-longer-name)
  ( f x)
  ( f y)
  ( X)
  ( Y)
  ( Z)
  ( W))
emilyriehl commented 1 year ago

@fredrik-bakke I see this is still marked as draft. What else were you hoping to add?

fredrik-bakke commented 1 year ago

@fredrik-bakke I see this is still marked as draft. What else were you hoping to add?

@emilyriehl Thanks for asking! I've received some comments about various wishes for additions to the style guide that I haven't added yet. Namely,

Moreover, I wish to regress a little on the naming conventions section.

If you'll allow me to omit the last two items, or currently have a good convention in mind, then I can finish this guide up right now actually.

emilyriehl commented 1 year ago

This sounds good to me. Surely this isn't the final word on the subject! But what you've done so far is great and I'd love to officially roll it out :)

fredrik-bakke commented 1 year ago

Me too!

emilyriehl commented 1 year ago

Great. Any final thoughts @TashiWalde @fizruk @jonweinb? (Or from anyone else?)

jonweinb commented 1 year ago

Great. Any final thoughts @TashiWalde @fizruk @jonweinb? (Or from anyone else?)

Looks good to me!

TashiWalde commented 1 year ago

@fredrik-bakke I see this is still marked as draft. What else were you hoping to add?

@emilyriehl Thanks for asking! I've received some comments about various wishes for additions to the style guide that I haven't added yet. Namely,

  • [x] Add convention for uses keyword
  • [ ] Add formatting convention for extension types
  • [ ] Add convention regarding when to hide proofs

Moreover, I wish to regress a little on the naming conventions section.

If you'll allow me to omit the last two items, or currently have a good convention in mind, then I can finish this guide up right now actually.

In an earlier comment, I asked to add an explanation for when and how it is permissible to bundle arguments in a single line. But rereading the guide now I didn't find it. Did you decide against adding it or did you just not get around to doing so?

EDIT: This feature does already appear in the example code. The only thing that is missing is a short explanation in the accompanying text.

In any case, it is just a very minor point, and I am otherwise happy with the state of the style guide.

fredrik-bakke commented 1 year ago

@TashiWalde I added the comment that "common sense should be applied", but this is probably not what you were looking for. Do you have a better rule in mind?

emilyriehl commented 12 months ago

Merging!