agda / agda-stdlib

The Agda standard library
https://wiki.portal.chalmers.se/agda/Libraries/StandardLibrary
Other
583 stars 236 forks source link

Standardisation of folds #278

Open MatthewDaggitt opened 6 years ago

MatthewDaggitt commented 6 years ago

At the moment naming and typing conventions for folds differ quite a lot across the various different data types. As far as I can see there are three basic types of folds present in the standard library (feel free to suggest more if I missed any!):

  1. Basic folds (applicable to everything List, Vec, Table etc.)
    foldr : ∀ {a b} {A : Set a} {B : Set b} → (A → B → B) → A → ∀ {n} → Vec A n → A
  2. Dependent folds (applicable to everything)
    foldr : ∀ {a b} {A : Set a} (B : ℕ → Set b) {m} → 
          (∀ {n} → A → B n → B (suc n)) → B zero → Vec A m → B m
  3. Non-empty folds (applicable to Vec, Table etc)
    foldr : ∀ {a} {A : Set a} → (A → A → A) → ∀ {n} → Vec A (suc n) → A

    Obviously 1. can be implemented in terms of 2. and 3. can be implemented in terms of 1.

The current state of play is:

Datatype Name for 1. Name for 2. Name for 3.
List foldr - N/A
Vec - foldr foldr₁
Table foldr - -
Product.N-ary - foldr - -

It would be great if we could add in some of the missing ones and come up with some consistent naming scheme for these across all data types for v1.0.

  1. In my opinion the name foldr should always be reserved for this "normal" fold.
  2. I don't have strong opinions or any good suggestions. Maybe something like dfoldr or foldrᵈ?
  3. I would suggest foldr₊ to represent the non-emptiness (rather than foldr⁺ which would clash for property intoduction proofs).
gallais commented 6 years ago

I'd suggest also adding:

induction : ∀ {a b} {A : Set a} (B : (n : ℕ) → Vec A n -> Set b) → 
            (∀ {n} x xs → B n xs → B (suc n) (x ∷ xs)) → B zero [] →
            ∀ {m} (xs : Vec A m) → B m xs
JacquesCarette commented 6 years ago

My instincts for naming (in Agda) goes in the opposite direction for 1 vs 2. The standard library is quite a mixed bag in this respect, but for simpler types (sums, products), the basic operations are all deeply dependent but use the 'classical' name anyways. Then there is often a non-dependent synonym.

Because the context is dependently typed, it feels that the simpler names should be attached to the dependently typed version.

As for 'non-empty', any chance of using subscript 'ne' instead? I do a lot of stuff with rigs, and in that setting using a + subscript is quite common, and would clash in meaning.

MatthewDaggitt commented 6 years ago

My instincts for naming (in Agda) goes in the opposite direction for 1 vs 2. The standard library is quite a mixed bag in this respect, but for simpler types (sums, products), the basic operations are all deeply dependent but use the 'classical' name anyways. Then there is often a non-dependent synonym.

The difference is that the dependent version of foldr can't be used interchangeably at the moment with the non-dependent version as it requires you to specify B explicitly, as well as x and xs in the function. I haven't investigated how inferable those parameters are...

As for 'non-empty', any chance of using subscript 'ne' instead? I do a lot of stuff with rigs, and in that setting using a + subscript is quite common, and would clash in meaning.

Okay, no problem, we can find a different name. Hmm foldrₙₑ looks a bit ugly to me, and a little more annoying to type. It's serviceable though. Maybe foldr+ is more palatable? Or anyone else have any other suggestions superior to both?

JacquesCarette commented 6 years ago

I agree, foldrₙₑ does look rather ugly, I withdraw it as a contender.

The rest was really 'my feeling'. Hopefully people with more experience will chyme in.

WolframKahl commented 6 years ago

I occasionally use names like foldrNE.

MatthewDaggitt commented 5 years ago

I still haven't put much thought into these namings and its quite a big backwards non-compatible change. I think this isn't going to get fixed for v1.0.

gallais commented 4 years ago

Given that we use List⁺ for non-empty lists, we could use foldr⁺ for folds on non-empty structures.

MatthewDaggitt commented 3 years ago

I'm coming back to this issue as I'm trying to add a non-empty fold over lists. @gallais your suggestion of foldr⁺ was my first thought as well, but the problem is that it clashes fairly catastrophically with the names introduction and elimination proofs in the Relation subfolders....

My current set of suggestions is as follows:

@JacquesCarette, to reiterate what I said above, while I see your point about many of the basic operations being dependent, I think one of the commonalities of those basic operations is that in the vast majority of cases they work equally well with dependent and non-dependent types and therefore the dependent-ness doesn't interfere with their ease of use. In contrast the dependent folds require the indexed type being passed in explicitly which means you'd never want to use it for non-dependent cases. Secondly, the dependent folds seem to be used much more rarely(?) (anecdotal I know but I've never come across the use of one in the wild).

gallais commented 3 years ago

Then again we have a well established (in Function.Base) naming scheme for the dependent & non-dependent version of the same operation:

As for the non-empty one, we could reuse the Haskell convention and call it foldr1.

JacquesCarette commented 3 years ago

I'm with @gallais on the naming. Though I don't feel too strongly about it.

MatthewDaggitt commented 3 years ago

Sure, I also don't feel that strongly so I'm happy to be outvoted :smile: Also good to go with Haskell's conventions. I'll update the matrix PR before merging it in, and then will get round to fixing the names for v2.0.

MatthewDaggitt commented 3 years ago

Final bikeshedding. Do people prefer foldr1 or foldr₁? We're currently using the latter...

JacquesCarette commented 3 years ago

No strong opinion on the placement of the 1.

jamesmckinna commented 2 years ago

And my last bit of (2022!) bikeshedding on this point: I had introduced foldr₀, foldl₀ for the non-dependent forms of the Vec folds, ignorant of the consensus here. I still think that there is an argument for its adoption (consistency with foldr₁ etc.) but I am happy to let this one go. I personally find primed identifiers so potentially (even desirably so, in some contexts) ambiguous, that I might find it hard to adopt/adapt to this convention for the rest of the library. But I will take that as a challenge to my own (mental; notational; mathematical) rigidity... ;-)

MatthewDaggitt commented 2 years ago

Okay so we've decided on :

@gallais will try and create a PR and we'll see how many inference problems we get.

jamesmckinna commented 2 years ago

Hmmm. Let me know how you get on with implicit {B = ...} in Data.Vec... where the library writer has lots of work to do to supply such Bs (even more so foldl, BTW)

Pro implicit: users can write foldr without thinking, in the non-dependent case

Contra/Pro explicit: they will be forced to write foldr _ or else its definitional equivalent foldr', and so 'learn' the pattern this issue is designed to standardise (and impose); and use of (dependent) foldl becomes more widespread, both in the library and its clients (wishful thinking?)

jamesmckinna commented 2 years ago

Did we get this right in the end? Can this be closed?

gallais commented 2 years ago

I haven't opened a PR for this yet

jamesmckinna commented 1 year ago

Suggest that whatever else might get done, lifting out the definitions of the types of the operators (as in Data.Vec.Base.FoldrOp, Data.Vec.Base.FoldlOp) used in dependent folding be lifted out to a sensible place (*.Core?) and shared across the various definitions, if that is indeed possible to do.

jamesmckinna commented 1 year ago

I note that Data.Maybe.Base actually defines an instance of @gallais 's induction definition above, and calls it maybe, with maybe′ as its non-dependent version.

PROPOSAL: we systematically do this for all the inductive Data types, but call the dependent version <typename>⁻ on the model of the style guide on pre- and post-conditions of functions. Not sure what to call the non-dependent version, so some additional discussion might be in order, and then (re-)implement the various folds in terms of these more fundamental constructs?