Proposal: Numerical Hierarchy

echatav commented 5 months ago

Trying to resolve our concepts of Arithmetic prime fields, Symbolic fields, our existing numerical hierarchy, Haskell's numerical hierarchy, and the mathematics of semirings leads me to this numerical hierarchy proposal sketch:

-- Examples of symbolic fields should include:
-- Prime p => Zp p
-- SymbolicField a => i -> a
-- PrimeField a => ArithmeticCircuit i Par1 a
type SymbolicField a = (Field a, FiniteChar a, Comparable a, BinaryExpansion a)

-- Prime fields should be `Prime p => Zp p` or `Zp` newtypes.
type PrimeField a = (SymbolicField a, Modular a)

class AdditiveMonoid a where
  (+) :: a -> a -> a
  zero :: a
  combineNatural :: [(Natural,a)] -> a
  scaleNatural :: Natural -> a -> a

class AdditiveMonoid a => AdditiveGroup a where
  negate :: a -> a
  (-) :: a -> a -> a
  combineInteger :: [(Integer,a)] -> a
  scaleInteger :: Integer -> a -> a

class MultiplicativeMonoid a where
  (*) :: a -> a -> a
  one :: a
  powNatural :: a -> Natural -> a

class MultiplicativeMonoid a => MultiplicativeGroup a where
  (/) :: a -> a -> a
  powInteger :: a -> Integer -> a

class From x a where from :: x -> a

type Semiring a = (AdditiveMonoid a, MultiplicativeMonoid a, From Natural a)
type Ring a = (AdditiveGroup a, MultiplicativeMonoid a, From Integer a)
type Field a = (AdditiveGroup a, MultiplicativeGroup a, From Rational a)

class Semiring a => FiniteChar a where
  characteristic :: Natural

class Semiring a => BinaryExpansion a where
  toBits :: a -> [a]
  fromBits :: [a] -> a

class Semiring a => SemiEuclidean a where
  degree :: a -> a
  divMod :: a -> a -> (a,a)
  euclidBit :: a -> a -> (a,a,a,a)

class (Ring a, SemiEuclidean a) => Euclidean a where
  quotRem :: a -> a -> (a,a)
  euclid :: a -> a -> (a,a,a)

class Ring a => Discrete a where
  dichotomy :: a -> a -> a

class Discrete a => Comparable a where
  trichotomy :: a -> a -> a

class Into y a where to :: a -> y

-- e.g. `Rational`, `Integer`, `Natural`, `Zp p`
type Real a = (Ord a, Semiring a, Into Rational a)
-- e.g. `Integer`, `Natural`, `Zp p`
type SemiIntegral a = (Real a, SemiEuclidean a, Into Integer a)
-- e.g. `Integer`, `Zp p`
type Integral a = (SemiIntegral a, Euclidean a)
-- `Zp p` and its newtypes, also `Word8`, `Word16`, `Word32` and `Word64`
type Modular a = (Integral a, Into Natural a)

TurtlePU commented 5 months ago

Nice suggestion! Some comments:

Are scaleInteger and scaleNatural supposed to be used in Scale instances? Would that cause issues with overlapping and/or coherence? (Same question for powNatural and powInteger.)
Having the order of an algebraic structure at the type level is useful for soundness constraints, e.g. (x :: a) ^ (y :: Zp p) is well-defined iff the order of a is p.
On the same note, the order of a field is different from the order of its multiplicative group, so it's best not to make MultiplicativeGroup a superclass of Field, due to soundness concerns.
Not every (AdditiveGroup a, MultiplicativeMonoid a) is a Ring a: distributivity might not hold
Also I don't know if Discrete and Trichotomy are going to be needed, as we only needed their ArithmeticCircuit instances for Symbolic.Eq and Symbolic.Ord, and those are going to be implemented differently soon.
What could be the laws for To? Currently, I described From laws as "from is a homomorphism"; evidently, to cannot be a homomorphism because $f(x + y) = f(x) + f(y)$ does not hold for $f : \mathbb{Z}_p \to \mathbb{N}$. Maybe the law is that from . to = id? This way, Real has to be a Field then.
On the note of fields and rings: I think that Discrete should have a Field as a superclass because for a ring it is not guaranteed that $0 \ne 1$. Or introduce a "nontrivial ring" class/constraint

echatav commented 5 months ago

Yes, they should be equal to the method values of the Scale instances. Maybe they're unneeded for that reason but I left out the Scale class for ease of thinking here. I'm not sure I love Scale and FromConstant and ToConstant and Exponent because they end up having bad inference properties and leave a long leash to introduce overlapping instances. I recapitulated the latter two as From and To but we'd still want a fromInteger = from @Integer function so that we can overload numeric literals using Haskell's rebindable syntax.
At least in Haskell's version of (^), (x :: a) ^ (y :: b) is well defined for (Num a, Integral b) and a similar (more correct) thing holds in my proposal:

-- (^) :: (Ring a, Modular b) => a -> b -> a
(^) :: (MultiplicativeMonoid a, To Natural b) => a -> b -> a
a ^ b = powNatural a (to b)

-- (^^) :: (Field a, Integral b) => a -> b -> a
(^^) :: (MultiplicativeGroup a, To Integer b) => a -> b -> a
a ^^ b = powInteger a (to b)

I think that ends up making sense and no Exponent class may be needed.
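To make the claim above concrete, here is a minimal, self-contained sketch of `(^)` defined through `powNatural` and a `To Natural` conversion. The method name `times`, the square-and-multiply default, and the `Integer`/`Word` instances are assumptions made for illustration; they are not part of the proposal.

```haskell
{-# LANGUAGE MultiParamTypeClasses #-}
module Main where

import Prelude hiding ((^))
import Numeric.Natural (Natural)

-- Cut-down MultiplicativeMonoid; `times` stands in for the proposal's (*)
-- so that Prelude's (*) stays available (hypothetical name).
class MultiplicativeMonoid a where
  times :: a -> a -> a
  one :: a
  -- Default: square-and-multiply (an assumption, not from the proposal).
  powNatural :: a -> Natural -> a
  powNatural x n
    | n == 0    = one
    | even n    = sq
    | otherwise = x `times` sq
    where
      half = powNatural x (n `div` 2)
      sq = half `times` half

class To y a where
  to :: a -> y

instance MultiplicativeMonoid Integer where
  times = (*)
  one = 1

-- Word is non-negative, so this conversion is total.
instance To Natural Word where
  to = fromIntegral

infixr 8 ^
(^) :: (MultiplicativeMonoid a, To Natural b) => a -> b -> a
a ^ b = powNatural a (to b)

main :: IO ()
main = print ((3 :: Integer) ^ (5 :: Word)) -- prints 243
```

No `Exponent` class appears anywhere: the exponent type only has to supply a `To Natural` instance.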

I'm using FiniteOrder to describe any Semiring with a minimum, positive order :: Natural such that:

scaleNatural order one = zero

Rather than defining Finite to describe a finite type. This way it's natural to consider non-finite semirings with finite order such as FiniteOrder a => i -> a.
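As a sanity check of the law `scaleNatural order one = zero`, the following sketch tests it on a six-element modular type and on an infinite function type over it. The cut-down `Semiring` class, the method name `plus`, the naive `scaleNatural`, and the type `Z6` are all assumptions made for illustration.

```haskell
{-# LANGUAGE AllowAmbiguousTypes, ScopedTypeVariables, TypeApplications #-}
module Main where

import Numeric.Natural (Natural)

-- Cut-down stand-in for the proposal's Semiring: only what the law needs.
class Semiring a where
  zero, one :: a
  plus :: a -> a -> a

-- Naive repeated addition; a real implementation would double-and-add.
scaleNatural :: Semiring a => Natural -> a -> a
scaleNatural n x = foldr plus zero (replicate (fromIntegral n) x)

class Semiring a => FiniteOrder a where
  order :: Natural

-- Hypothetical six-element semiring of integers modulo 6.
newtype Z6 = Z6 Natural deriving (Eq, Show)

instance Semiring Z6 where
  zero = Z6 0
  one = Z6 1
  plus (Z6 a) (Z6 b) = Z6 ((a + b) `mod` 6)

instance FiniteOrder Z6 where
  order = 6

-- Pointwise semiring on functions: an infinite type when i is infinite.
instance Semiring a => Semiring (i -> a) where
  zero = const zero
  one = const one
  plus f g = \i -> plus (f i) (g i)

instance FiniteOrder a => FiniteOrder (i -> a) where
  order = order @a

main :: IO ()
main = do
  -- The law holds for Z6 ...
  print (scaleNatural (order @Z6) one == (zero :: Z6))
  -- ... and 6 is the minimum positive such number.
  print (all (\k -> scaleNatural k one /= (zero :: Z6)) [1 .. 5])
  -- The law also holds pointwise for Integer -> Z6 (sampled at three points).
  let f = scaleNatural (order @(Integer -> Z6)) one :: Integer -> Z6
  print (map f [0, 1, 2] == replicate 3 zero)
```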

True enough, but Haskell doesn't deal with laws so this comes down to convention. We could define Semiring/Ring/Field as classes with no methods to force one to write down an instance declaration as a sign that you think the laws hold. Or we could drop the From superconstraints and make fromNatural, fromInteger, fromRational be methods. I think the way I wrote it is economical and we can instead make the assumption that if one makes instances for all of the superconstraints, then one is signaling they think the laws hold.
I don't know how they'll be differently implemented so I can't comment.
Indeed, the law for To is that it's a partial inverse of From, from . to = id. Additionally, to should have compatibility laws with Prelude.== and Prelude.compare, e.g. a == b = to a == to b. Also to should be idempotent, to = to . to. Again, to be more explicit we could drop From/To and instead make fromNatural, fromInteger, fromRational, toRational, toInteger, and toNatural into methods. That matters mainly if you want to be more careful and explicit about lawfulness, but I think it's fine as in the proposal. 🤷
0 /= 1 is a law for Discrete (and 0 /= 1 /= -1 /= 0 is a law for Trichotomy) and holds in non-fields too such as Zp 4.
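The From/To laws discussed above can be checked on a small example. This is a sketch under assumptions: the five-element type `Z5`, its instances, and the helper `add` are hypothetical and only serve to exercise `from . to = id`, the compatibility with `==`, and the observation that `to` is not a homomorphism.

```haskell
{-# LANGUAGE MultiParamTypeClasses #-}
module Main where

import Numeric.Natural (Natural)

class From x a where from :: x -> a
class To y a where to :: a -> y

-- Hypothetical five-element field; invariant: the payload is below 5.
newtype Z5 = Z5 Natural deriving (Eq, Show)

instance From Natural Z5 where from n = Z5 (n `mod` 5)
instance To Natural Z5 where to (Z5 n) = n

-- Addition in Z5, defined through the homomorphism `from`.
add :: Z5 -> Z5 -> Z5
add (Z5 a) (Z5 b) = from (a + b)

elems :: [Z5]
elems = map Z5 [0 .. 4]

main :: IO ()
main = do
  -- Law: from . to = id
  print (all (\x -> from (to x :: Natural) == x) elems)
  -- Law: a == b  iff  to a == to b
  print (and [(a == b) == ((to a :: Natural) == to b) | a <- elems, b <- elems])
  -- `to` is not additive: to (3 + 4) is 2, while to 3 + to 4 is 7.
  print (to (add (Z5 3) (Z5 4)) :: Natural, to (Z5 3) + to (Z5 4) :: Natural)
```

The last line prints `(2,7)`, which is the failure of $f(x + y) = f(x) + f(y)$ for $f : \mathbb{Z}_p \to \mathbb{N}$ raised earlier in the thread.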

TurtlePU commented 5 months ago

Completely agree about the bad inference and overlapping properties; the current version was the best I could come up with, but I'm still not completely satisfied. On rebindable syntax: is it possible to have separate fromNatural and fromInteger? As far as I can see from the link you provided (and from what I remember), there is only fromInteger, and natural literals are handled by a lint rule in linters (which is a little sad IMO). And, while we're still on this topic: I guess you too think that we need some kind of Prelude of our own (or maybe we need to take a look at numeric-prelude); just sayin' that we should discuss this some time in the future.
With this definition, x ^ (y + z) = (x ^ y) * (x ^ z) does not hold for y, z :: Zp p if an order of x does not divide p; I would expect it to hold, but maybe it is okay... (though this property is expected to hold in pairing which we already use)
I think you are describing a ring characteristic. Note that even in finite fields, this might not equal the order of the structure, for example $\mathrm{char}(\mathbb{F}_{p^n}) = p$.
There are near-semirings which are not fully distributive, but honestly I had to search deliberately for this kind of structure just now, and I don't know if we're interested in differentiating such structures from (semi)rings. Maybe keeping (semi)rings as a type alias is okay.
Yeah, me too, we'll see later.
Yeah, typeclass declarations are fine as they are! Or we can add a superclass constraint From y a => To y a if we want to. Talking about laws:
- a == b = to a == to b follows from from . to = id if == is a Leibniz equality (and I think this is good to assume)
- to . to = to codifies the discipline we currently have about these conversion functions and I like it. I wonder if this together with from . to = id makes a complete description.
Ok!
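A small numeric instance of the exponent-law concern raised in this comment, written with plain `Integer` arithmetic modulo 7 rather than the proposal's classes. The helper `powMod` and the chosen values are assumptions for illustration.

```haskell
module Main where

p :: Integer
p = 7

-- x ^ e in Z_7, with the exponent given by its canonical representative.
powMod :: Integer -> Integer -> Integer
powMod x e = (x ^ e) `mod` p

main :: IO ()
main = do
  let x = 3
      y = 4
      z = 5
      -- Exponents added in Zp 7: (4 + 5) mod 7 = 2, so the left side is 3 ^ 2.
      lhs = powMod x ((y + z) `mod` p)
      -- Right side: 3 ^ 4 * 3 ^ 5 computed in Z_7.
      rhs = (powMod x y * powMod x z) `mod` p
      -- Exponents added modulo 6, the order of 3 in the multiplicative group.
      fixed = powMod x ((y + z) `mod` (p - 1))
  print (lhs, rhs, fixed) -- prints (2,6,6)
```

Here 3 has multiplicative order 6, which does not divide 7, so `x ^ (y + z)` and `(x ^ y) * (x ^ z)` disagree when the exponents live in `Zp 7`; reducing exponents modulo 6 restores the law.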

echatav commented 5 months ago

@TurtlePU I edited my response and original post ^ many times and ended up responding to your response! For instance, yeah, just fromInteger which should be totes fine for Rings anyway.

echatav commented 4 months ago

newtype Mod (int :: Type) (n :: Natural) = Mod int
type Zmod = Mod Integer

>>> 8 :: Zmod 6
2

(SemiIntegral int, KnownNat n) => Mod int n is Modular and (SemiIntegral int, Prime p) => Mod int p is a PrimeField, allowing fixed size fields (with convenient Bits and Binary instances) such as:

type Word48 = LargeKey Word16 Word32
type Fq = Word48 `Mod` BLS12_381_Base
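A minimal sketch of how such a `Mod` newtype could behave, using Prelude's `Integral` and `Num` as stand-ins for the proposal's `SemiIntegral` and its own classes (an assumption). Arithmetic is routed through `Integer` so that fixed-width representations do not overflow before reduction; that design choice is also an assumption, not taken from the proof of concept.

```haskell
{-# LANGUAGE DataKinds, KindSignatures, ScopedTypeVariables #-}
module Main where

import Data.Kind (Type)
import Data.Proxy (Proxy (..))
import Data.Word (Word8)
import GHC.TypeNats (KnownNat, Nat, natVal)

newtype Mod (int :: Type) (n :: Nat) = Mod int deriving (Eq)

type Zmod = Mod Integer

instance Show int => Show (Mod int n) where
  show (Mod x) = show x

-- Lift an Integer operation, reducing modulo n before narrowing to int.
lift2 :: (Integral int, KnownNat n)
      => (Integer -> Integer -> Integer) -> Mod int n -> Mod int n -> Mod int n
lift2 f (Mod a) (Mod b) = fromInteger (f (toInteger a) (toInteger b))

instance (Integral int, KnownNat n) => Num (Mod int n) where
  -- Reduce first, so the narrowing conversion never wraps
  -- (assuming n fits in the representation type).
  fromInteger i = Mod (fromInteger (i `mod` toInteger (natVal (Proxy :: Proxy n))))
  (+) = lift2 (+)
  (*) = lift2 (*)
  (-) = lift2 (-)
  abs = id
  signum (Mod a) = fromInteger (signum (toInteger a))

main :: IO ()
main = do
  print (8 :: Zmod 6)                      -- prints 2
  print ((250 :: Mod Word8 251) + 7)       -- prints 6
  print ((3 :: Mod Word8 251) - 5)         -- prints 249
```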

TurtlePU commented 3 months ago

My 5 cents

It is often the case when working with Symbolic code that we need a to be Arithmetic. GHC reports this in a roundabout way, like: "it is not known whether Bits a23 ~ [a23]". In general, keeping type aliases during error reporting is hard in GHC, so maybe we just declare class (PrimeField a, Eq a, Ord a, BinaryExpansion a, Bits a ~ [a]) => Arithmetic a with no methods inside, just as a marker. It would also provide better documentation, as it can easily be seen which fields we consider Arithmetic.
Some superclass constraints (introduced by me, sorry >_<) are used only for laws and complicate some instance declarations, so it is better to remove them, namely superclass constraints in Scale and Exponent.
Having exponent as a last type parameter in Exponent a b prohibits deriving, see BLS12_381_GT instances for example. Maybe Exponent b a is better in the end?
The default implementation in Scale causes GHC to error when someone adds a stub instance of Scale, and it is actually not the most common one: currently, the most common situation is the pattern instance Scale a b => Scale a c, where b is some type related to c. Maybe it is better to change the default implementation to the one for scaling by Natural and Integer?
Same for FromConstant: FromConstant a b => FromConstant a c pattern is quite common. However, in certain cases, the instance FromConstant a a prohibits such a definition. Also this definition sometimes causes issues with coherence because of possible overlap. In addition, we have to provide it in generic contexts anyway, like (FromConstant a a, Scale a a). Maybe it is better to remove instance FromConstant a a and turn it into a default implementation instead?
Also, looking at how we use FromConstant and fromConstant practically everywhere, I suggest we rename it to class From a b where { from :: a -> b }, as @echatav proposed too.
Suggestion for ToConstant: move the "constant" type to the type parameter, like this:
```
class ToConst a where
  type Const a :: Type
  toConst :: a -> Const a
```
The intended meaning is that Const a is the simplest type into which elements of a can be turned. The added benefit is that we can define the following:
```
cast :: (ToConst a, From (Const a) b) => a -> b
cast = from . toConst
```
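A self-contained sketch of the `ToConst`/`cast` idea above. The type `Z5` and the two instances are hypothetical, chosen only to show `cast` going from a modular type to `Integer` through its "simplest" type `Natural`.

```haskell
{-# LANGUAGE TypeFamilies, MultiParamTypeClasses, FlexibleContexts #-}
module Main where

import Data.Kind (Type)
import Numeric.Natural (Natural)

class From a b where
  from :: a -> b

class ToConst a where
  type Const a :: Type
  toConst :: a -> Const a

cast :: (ToConst a, From (Const a) b) => a -> b
cast = from . toConst

-- Hypothetical modular type whose constant type is Natural.
newtype Z5 = Z5 Natural deriving (Eq, Show)

instance ToConst Z5 where
  type Const Z5 = Natural
  toConst (Z5 n) = n

instance From Natural Integer where
  from = toInteger

main :: IO ()
main = print (cast (Z5 3) :: Integer) -- prints 3
```

Because `Const a` is determined by `a`, only the target type needs an annotation at the call site.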
A trick to avoid specifying From a a and Scale a a constraints:
```
class Scale a b where
  (*) :: a -> b -> b

class From a a => AdditiveSemigroup a where {...}
class (AdditiveSemigroup a, Scale Natural a) => AdditiveMonoid a where {...}
class (AdditiveMonoid a, Scale Integer a) => AdditiveGroup a where {...}

type MultiplicativeSemigroup a = (Scale a a, From a a)
class (MultiplicativeSemigroup a, Exponent a Natural) => MultiplicativeMonoid a where {...}
class (MultiplicativeMonoid a, Exponent a Integer) => MultiplicativeGroup a where {...}

type Semiring a = (AdditiveMonoid a, MultiplicativeMonoid a)
type Ring a = (Semiring a, AdditiveGroup a)
class (Ring a, Exponent a Integer) => Field a where {...}
```

* A suggestion to remove `DiscreteField` (both versions) and `TrichotomyField` as they are not used anywhere.
* A suggestion to remove `VectorSpace` module as the definitions inside are superseded by `SymbolicData` and `Representable` in our use cases.
* A suggestion to remove `BinaryExpansion` class and replace its usages either with direct `toConstant :: a -> Natural` where it is used with `padBits` or with more general `EuclideanDomain` class.
* A suggestion to add `isZero` to `Field` class definition to make more instances lawful:
```haskell
-- | Class of (almost) fields. Laws:
--
-- [Idempotent] @isZero x * isZero x == isZero x@
-- [Agreement] @x // y == x * recip y@
-- [Inverse] @x * recip x = one - isZero x@
-- ... laws for roots of unity ...
class (Ring a, Exponent a Integer) => Field a where
  {-# MINIMAL isZero, (recip | (//)) #-}

  isZero :: a -> a
  default isZero :: Eq a => a -> a
  isZero = bool zero one . (== zero)

  recip :: a -> a
  recip = (one //)

  (//) :: a -> a -> a
  x // y = x * recip y

  rootOfUnity :: Natural -> Maybe a
  rootOfUnity = ...
```

TurtlePU commented 3 months ago

From #177:

Maybe we do not need AdditiveSemigroup and MultiplicativeSemigroup after all, and start with AdditiveMonoid and MultiplicativeMonoid
Find a place for optimized Scale c a => [(c, a)] -> a functions. Maybe a separate MultiScale class would do to make a trick I proposed possible? (Exponent, on the other hand, can contain the multi-exponent function defined inside).
So what does Finite a define for symbolic datatypes a? It is not order in a strict sense, as cardinalities of function types (and circuits can be viewed as functions of special kind) are not the cardinalities of codomains. Is this field characteristic? Or something else?
Separation of SemiEuclidean and Euclidean looks interesting, maybe we will actually need it
I think we need to make the polynomials more comfortable to use, maybe sacrificing the generality. Do we need to be generic over polynomial exponent type and collection types in the end? If we do, maybe it is better to define a class Polynomial and have proper separate polynomial datatypes instead of one type with a bunch of proxy type parameters? Also the type signature of evalPolynomial can be improved.

TurtlePU commented 3 months ago

Another radical idea is to write all groups in additive notation, so the hierarchy becomes

class From a b where
  from :: a -> b

class Scale b a where
  (*) :: b -> a -> a
  multiScale :: Foldable f => f (b, a) -> a

class From a a => Semigroup a where (+) :: a -> a -> a
class (Semigroup a, Scale Natural a) => Additive a where zero :: a
class (Additive a, Scale Integer a) => Group a where
  negate :: a -> a
  (-) :: a -> a -> a

class Exponent b a where
  (^) :: a -> b -> a
  multiExp :: Foldable f => f (a, b) -> a

class (Additive a, Scale a a, From Natural a, Exponent Natural a) => Semiring a where one :: a
type Ring a = (Group a, Semiring a, From Integer a)
class (Ring a, Exponent Integer a) => Field a where
  recip :: a -> a
  (/) :: a -> a -> a
  isZero :: a -> a
  rootOfUnity :: Natural -> Maybe a

echatav commented 3 months ago

The latest version of my hierarchy proposal is this implementation proof of concept.

In this reference, we have

separation of SemiEuclidean from Euclidean, which is separated from Integral
separation of SemiIntegral from Integral
A Mod int n type with (SemiIntegral int, Prime p) => Field (Mod int p)

This would let us directly use fixed-width, unsigned integers as a base for fields like type Fr = Mod Word48 BLS12_381_Scalar, which accords better with the specification for the curve.

TurtlePU commented 2 months ago

An idea on common interface to various polynomial types.

Polynomials with fixed coefficient and variable type (including univariate):

-- | Universal property of polynomials \(a[v]\) is that it is an initial algebra over @a@.
-- Properties:
--
-- 1. @'var'@ should be an injection
-- 2. @'evalPoly' f@ should be an algebra homomorphism for all @f@
class Algebra a p => Poly a v p | p -> a, p -> v where
  -- | Injection
  var :: v -> p
  -- | Initial map
  evalPoly :: Algebra a x => (v -> x) -> p -> x

type UPoly a p = Poly a () p

evalUPoly :: (UPoly a p, Algebra a x) => p -> x -> x
evalUPoly p x = evalPoly (const x) p

poly :: Poly a v p => (forall x. Algebra a x => (v -> x) -> x) -> p
poly f = f var

uPoly :: UPoly a p => (forall x. Algebra a x => x -> x) -> p
uPoly f = f (var ())

castPoly :: (Poly a v p, Poly a v q) => p -> q
-- | I guess it should follow from properties of @'Poly'@ that @'castPoly'@ is an isomorphism
castPoly = evalPoly var

castPoly' :: (Poly a v p, Poly a v' p') => (v -> v') -> p -> p'
castPoly' f = evalPoly (var . f)

embedUPoly :: (UPoly a p, Poly a v q) => v -> p -> q
embedUPoly x = castPoly' (const x)

Univariate polynomials are basically a collection of coefficients. Common API of various collections needs further investigation, but at least we can state that

type UPoly1 p = (forall a. UPoly a (p a), Functor p, Foldable p)

Multivariate polynomials on the other hand are more complex. We can start with

class (forall a v. (Ring a, c v) => Poly a v (p a v)) => MPoly2 c p where
  bimapPoly :: c v' => (a -> a') -> (v -> v') -> p a v -> p a' v'

subPoly :: (MPoly2 c p, Ring a, c v') => p a v -> (v -> p a v') -> p a v'
subPoly p f = evalPoly f p

P.S. Actually, univariate polynomials are also MPoly2 c p with c v = (v ~ ()).
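To illustrate the interface, here is a sketch of one possible instance: a list-backed univariate polynomial. The thread does not define `Algebra`, so the version below (a `Num` superclass plus a structure map `embed`) is an assumption, as are the type `UP`, the helper `zipLong`, and the example `quadratic`.

```haskell
{-# LANGUAGE FunctionalDependencies, FlexibleInstances, FlexibleContexts #-}
{-# LANGUAGE RankNTypes, ConstraintKinds #-}
module Main where

-- Assumed stand-in for Algebra: a ring x with a structure map from a.
class Num x => Algebra a x where
  embed :: a -> x

class Algebra a p => Poly a v p | p -> a, p -> v where
  var :: v -> p
  evalPoly :: Algebra a x => (v -> x) -> p -> x

type UPoly a p = Poly a () p

-- Hypothetical dense representation: coefficients, lowest degree first.
newtype UP a = UP [a] deriving (Show)

-- Add coefficient lists of possibly different lengths.
zipLong :: Num a => [a] -> [a] -> [a]
zipLong (x : xs) (y : ys) = x + y : zipLong xs ys
zipLong xs [] = xs
zipLong [] ys = ys

instance Num a => Num (UP a) where
  UP xs + UP ys = UP (zipLong xs ys)
  -- (x0 + X * rest) * ys = x0 * ys + X * (rest * ys)
  UP xs * UP ys = UP (foldr (\x acc -> zipLong (map (x *) ys) (0 : acc)) [] xs)
  negate (UP xs) = UP (map negate xs)
  fromInteger n = UP [fromInteger n]
  abs = id
  signum = id

instance Num a => Algebra a (UP a) where embed c = UP [c]
instance Algebra Integer Integer where embed = id

instance Num a => Poly a () (UP a) where
  var () = UP [0, 1]
  -- Horner evaluation, mapping the single variable through f.
  evalPoly f (UP cs) = foldr (\c acc -> embed c + f () * acc) 0 cs

evalUPoly :: (UPoly a p, Algebra a x) => p -> x -> x
evalUPoly p x = evalPoly (const x) p

uPoly :: UPoly a p => (forall x. Algebra a x => x -> x) -> p
uPoly f = f (var ())

quadratic :: UP Integer
quadratic = uPoly (\x -> x * x + 2 * x + 3)

main :: IO ()
main = print (evalUPoly quadratic (10 :: Integer)) -- prints 123
```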

echatav commented 2 months ago

If we could unify (multivariate & univariate) types then we shouldn't need a typeclass. In #177 I rewrote multivariate monomials and polynomials without extra classes, the biggest difference being I made polynomials a map-of-maps as well as some optimizations for linear combination & monomial evaluation.
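One possible reading of "map-of-maps", sketched with `Data.Map` from the `containers` package that ships with GHC: a polynomial maps monomials to coefficients, and a monomial maps variables to exponents. The type and function names below are assumptions for illustration and are not taken from #177.

```haskell
module Main where

import Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map
import Numeric.Natural (Natural)

-- Variable -> exponent; the empty map is the constant monomial.
type Monomial v = Map v Natural

-- Monomial -> coefficient.
type Polynomial v a = Map (Monomial v) a

evalMonomial :: Num a => (v -> a) -> Monomial v -> a
evalMonomial env m = product [env v ^ e | (v, e) <- Map.toList m]

evalPolynomial :: Num a => (v -> a) -> Polynomial v a -> a
evalPolynomial env p = sum [c * evalMonomial env m | (m, c) <- Map.toList p]

-- Addition merges coefficients of equal monomials and drops zeros.
addPolynomial :: (Ord v, Num a, Eq a) => Polynomial v a -> Polynomial v a -> Polynomial v a
addPolynomial p q = Map.filter (/= 0) (Map.unionWith (+) p q)

-- 3 * x^2 * y + 2 * y
example :: Polynomial String Integer
example =
  Map.fromList
    [ (Map.fromList [("x", 2), ("y", 1)], 3)
    , (Map.fromList [("y", 1)], 2)
    ]

-- Assignment x = 2, every other variable = 5.
env :: String -> Integer
env "x" = 2
env _ = 5

main :: IO ()
main = do
  print (evalPolynomial env example)                         -- prints 70
  print (evalPolynomial env (addPolynomial example example)) -- prints 140
```

With this shape a univariate polynomial is just the case where every monomial has at most one key, which is why a single type could cover both cases; it does not by itself give the dense layout that FFT-style multiplication wants.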

vlasin commented 2 months ago

On the other hand, univariate polynomials must be optimized for performance, and they have their specific versions for some operations like multiplication and division. We have FFT-based multiplication for univariate polynomials.

TurtlePU commented 2 months ago

@echatav, having a single type would surely be ideal; however, univariate polynomials enjoy more optimizations (e.g. multiplication via FFT and via the Karatsuba algorithm) which multivariate ones do not. If this can be incorporated into a single datatype cleanly, then sure, why not.

zkFold / zkfold-base

Proposal: Numerical Hierarchy #154