zkFold / zkfold-base

ZkFold's Base library
https://zkfold.io
MIT License
17 stars 7 forks source link

Proposal: Numerical Hierarchy #154

Open echatav opened 5 months ago

echatav commented 5 months ago

Trying to resolve our concepts of Arithmetic prime fields, Symbolic fields, our existing numerical hierarchy, Haskell's numerical hierachy, and the mathematics of semirings leads me to this numerical hierarchy proposal sketch:

-- Examples of a symbolic fields should include:
-- Prime p => Zp p
-- SymbolicField a => i -> a
-- PrimeField a => ArithmeticCircuit i Par1 a
type SymbolicField a = (Field a, FiniteChar a, Comparable a, BinaryExpansion a)

-- Prime fields should be `Prime p => Zp p` or `Zp` newtypes.
type PrimeField a = (SymbolicField a, Modular a)

class AdditiveMonoid a where
  (+) :: a -> a -> a
  zero :: a
  combineNatural :: [(Natural,a)] -> a
  scaleNatural :: Natural -> a -> a

class AdditiveMonoid a => AdditiveGroup a where
  negate :: a -> a
  (-) :: a -> a -> a
  combineInteger :: [(Integer,a)] -> a
  scaleInteger :: Integer -> a -> a

class MultiplicativeMonoid a where
  (*) :: a -> a -> a
  one :: a -> a -> a
  powNatural :: a -> Natural -> a

class MultiplicativeMonoid a => MultiplicativeGroup a where
  (/) :: a -> a -> a
  powInteger :: a -> Integer -> a

class From x a where from :: x -> a

type Semiring a = (AdditiveMonoid a, MultiplicativeMonoid a, From Natural a)
type Ring a = (AdditiveGroup a, MultiplicativeMonoid a, From Integer a)
type Field a = (AdditiveGroup a, MultiplicativeGroup a, From Rational a)

class Semiring a => FiniteChar a where
  characteristic :: Natural

class Semiring a => BinaryExpansion a where
  toBits :: a -> [a]
  fromBits :: [a] -> a

class Semiring a => SemiEuclidean a where
  degree :: a -> a
  divMod :: a -> a -> (a,a)
  euclidBit :: a -> a -> (a,a,a,a)

class (Ring a, SemiEuclidean a) => Euclidean a where
  quotRem :: a -> a -> (a,a)
  euclid :: a -> a -> (a,a,a)

class Ring a => Discrete a where
  dichotomy :: a -> a -> a

class Discrete a => Comparable a where
  trichotomy :: a -> a -> a

class Into y a where to :: a -> y

-- e.g. `Rational`, `Integer`, `Natural`, `Zp p`
type Real a = (Ord a, Semiring a, Into Rational a)
-- e.g. `Integer`, `Natural`, `Zp p`
type SemiIntegral a = (Real a, SemiEuclidean a, Into Integer a)
-- e.g. `Integer`, `Zp p`
type Integral a = (SemiIntegral a, Euclidean a)
-- `Zp p` and its newtypes, also `Word8`, `Word16`, `Word32` and `Word64`
type Modular a = (Integral a, Into Natural a)
TurtlePU commented 5 months ago

Nice suggestion! Some comments:

  1. Are scaleInteger and scaleNatural supposed to be used in Scale instances? Would it have issues with overlapping and/or coherence? (Same question for powNatural and powInteger)
  2. Having order of an algebraic structure be on a type level is useful for soundness constraints, e.g. (x :: a) ^ (y :: Zp p) is well-defined iff an order of a is p.
  3. On the same note, an order of a field is different from an order of its multiplicative subgroup, so it's best to not have MultiplicativeGroup a superclass of Field due to soundness concerns.
  4. Not every (AdditiveGroup a, MultiplicativeMonoid a) is a Ring a: distributivity might not hold
  5. Also I don't know if Discrete and Trichotomy are going to be needed as we needed only its ArithmeticCircuit instances for Symbolic.Eq and Symbolic.Ord and they're going to be implemented differently soon.
  6. What could be the laws for To? Currently, I described From laws as "from is a homomorphism"; evidently, to cannot be a homomorphism because $f(x + y) = f(x) + f(y)$ does not hold for $f : \mathbb{Z}_p \to \mathbb{N}$. Maybe the law is that from . to = id? This way, Real has to be a Field then.
  7. On the note of fields and rings: I think that Discrete should have a Field as a superclass because for a ring it is not guaranteed that $0 \ne 1$. Or introduce a "nontrivial ring" class/constraint
echatav commented 5 months ago
  1. Yes, they should be equal to the method values of the Scale instances. Maybe they're unneeded for that reason but I left out the Scale class for ease of thinking here. I'm not sure I love Scale and FromConstant and ToConstant and Exponent because they end up having bad inference properties and leave a long leash to introduce overlapping instances. I recapitulated the latter two as From and To but we'd still want a fromInteger = from @Integer function so that we can overload numeric literals using Haskell's rebindable syntax.

  2. At least in Haskell's version of (^), (x :: a) ^ (y :: b) is well defined for (Num a, Integral b) and a similar (more correct) thing holds in my proposal:

-- (^) :: (Ring a, Modular b) => a -> b -> a
(^) :: (MultiplicativeMonoid a, To Natural b) => a -> b -> a
a ^ b = powNatural a (to b)

-- (^^) :: (Field a, Integral b) => a -> b -> a
(^^) :: (MultiplicativeGroup a, To Integer b) => a -> b -> a
a ^^ b = powInteger a (to b)

I think that ends up making sense and no Exponent class may be needed.

  1. I'm using FiniteOrder to describe any Semiring with a minimum, positive order :: Natural such that:
scaleNatural order one = zero

Rather than defining Finite to describe a finite type. This way it's natural to consider non-finite semirings with finite order such as FiniteOrder a => i -> a.

  1. True enough, but Haskell doesn't deal with laws so this comes down to convention. We could define Semiring/Ring/Field as classes with no methods to force one to write down an instance declaration as a sign that you think the laws hold. Or we could drop the From superconstraints and make fromNatural, fromInteger, fromRational be methods. I think the way I wrote it is economical and we can instead make the assumption that if one makes instances for all of the superconstraints, then one is signaling they think the laws hold.

  2. I don't know how they'll be differently implemented so I can't comment.

  3. Indeed, the law for To is that it's a partial inverse of From, from . to = id. Additionally, to should have compatibility laws with Prelude.== and Prelude.compare, e.g. a == b = to a == to b. Also to should be idempotent to = to . to. Again, maybe to be more explicit we don't use From/To and instead make fromNatural, fromInteger, fromRational, toRational, toInteger, and toNatural into methods. This is particularly if you want to be more careful and explicit about lawfulness, but I think it's fine as in the proposal. 🤷

  4. 0 /= 1 is a law for Discrete (and 0 /= 1 /= -1 /= 0 is a law for Trichotomy) and holds in non-fields too such as Zp 4.

TurtlePU commented 5 months ago
  1. Completely agree about bad inference and overlapping properties, current version was the best I could come up with but still am not completely satisfied. On rebindable syntax: is it possible to have a separate fromNatural and fromInteger? As I see from the link you provided (and from what I remember) it's only fromInteger and literals for naturals have a lint rule in linters (which is a little sad IMO). And, while we're still on this topic: I guess you too think that we need some kind of our own Prelude (or maybe we need to take a look at numeric-prelude), just sayin' that we should discuss this some time in the future.
  2. With this definition, x ^ (y + z) = (x ^ y) * (x ^ z) does not hold for y, z :: Zp p if an order of x does not divide p; I would expect it to hold, but maybe it is okay... (though this property is expected to hold in pairing which we already use)
  3. I think you are describing a ring characteristic. Note that even in finite fields, this might not equal to an order of a structure, for example $\mathrm{char}(\mathbb{F}_{p^n}) = p$.
  4. There are near semirings which are not fully distributive, but I honestly was deliberately searching for this kind of structure right now and I don't know if we're interested in differentiating these kinds of structures from (semi)rings. Maybe keeping (semi)rings as a type alias is okay.
  5. Yeah, me too, we'll see later.
  6. Yeah, typeclass declarations are fine as they are! Or we can add a superclass constraint From y a => To y a if we want to. Talking about laws:
    • a == b = to a == to b follows from from . to = id if == is a Leibniz equality (and I think this is good to assume)
    • to . to = to codifies the discipline we currently have about these conversion functions and I like it. I wonder if this together with from . to = id makes a complete description.
  7. Ok!
echatav commented 5 months ago

@TurtlePU I edited my response and original post ^ many times and ended up responding to your response! For instance, yeah, just fromInteger which should be totes fine for Rings anyway.

echatav commented 4 months ago
newtype Mod (int :: Type) (n :: Natural) = Mod int
type Zmod = Mod Integer

>>> 8 :: Zmod 6
2

(SemiIntegral int, KnownNat n) => Mod int n is Modular and (SemiIntegral int, Prime p) => Mod int p is a PrimeField, allowing fixed size fields (with convenient Bits and Binary instances) such as:

type Word48 = LargeKey Word16 Word32
type Fq = Word48 `Mod` BLS12_381_Base
TurtlePU commented 3 months ago

My 5 cents

class From a a => AdditiveSemigroup a where {...} class (AdditiveSemigroup a, Scale Natural a) => AdditiveMonoid a where {...} class (AdditiveMonoid a, Scale Integer a) => AdditiveGroup a where {...}

type MultiplicativeSemigroup a = (Scale a a, From a a) class (MultiplicativeSemigroup a, Exponent a Natural) => MultiplicativeMonoid a where {...} class (MultiplicativeMonoid a, Exponent a Integer) => MultiplicativeGroup a where {...}

type Semiring a = (AdditiveMonoid a, MultiplicativeMonoid a) type Ring a = (Semiring a, AdditiveGroup a) class (Ring a, Exponent a Integer) => Field a where {...}

* A suggestion to remove `DiscreteField` (both versions) and `TrichotomyField` as they are not used anywhere.
* A suggestion to remove `VectorSpace` module as the definitions inside are superceded by `SymbolicData` and `Representable` in our usecases.
* A suggestion to remove `BinaryExpansion` class and replace its usages either with direct `toConstant :: a -> Natural` where it is used with `padBits` or with more general `EuclideanDomain` class.
* A suggestion to add `isZero` to `Field` class definition to make more instances lawful:
```haskell
-- | Class of (almost) fields. Laws:
--
-- [Idempotent] @isZero x * isZero x == isZero x@
-- [Agreement] @x // y == x * recip y@
-- [Inverse] @x * recip x = one - isZero x@
-- ... laws for roots of unity ...
class (Ring a, Exponent a Integer) => Field a where
  {-# MINIMAL isZero, (recip | (//)) #-}

  isZero :: a -> a
  default isZero :: Eq a => a -> a
  isZero = bool zero one . (== zero)

  recip :: a -> a
  recip = (one //)

  (//) :: a -> a -> a
  x // y = x * recip y

  rootOfUnity :: Natural -> Maybe a
  rootOfUnity = ...
TurtlePU commented 3 months ago

From #177:

TurtlePU commented 3 months ago

Another radical idea is to write all groups in additive notation, so the hierarchy becomes

class From a b where
  from :: a -> b

class Scale b a where
  (*) :: b -> a -> a
  multiScale :: Foldable f => f (b, a) -> a

class From a a => Semigroup a where (+) :: a -> a -> a
class (Semigroup a, Scale Natural a) => Additive a where zero :: a
class (Additive a, Scale Integer a) => Group a where
  negate :: a -> a
  (-) :: a -> a -> a

class Exponent b a where
  (^) :: a -> b -> a
  multiExp :: Foldable f => f (a, b) -> a

class (Additive a, Scale a a, From Natural a, Exponent Natural a) => Semiring a where one :: a
type Ring a = (Group a, Semiring a, From Integer a)
class (Ring a, Exponent Integer a) => Field a where
  recip :: a -> a
  (/) :: a -> a -> a
  isZero :: a -> a
  rootOfUnity :: Natural -> Maybe a
echatav commented 3 months ago

The latest version of my hierarchy proposal is this implementation proof of concept.

In this reference, we have

This would let us directly use fixed-width, unsigned integers as a base for fields like type Fr = Mod Word48 BLS12_381_Scalar, which accords better with the specification for the curve.

TurtlePU commented 2 months ago

An idea on common interface to various polynomial types.

Polynomials with fixed coefficient and variable type (including univariate):

-- | Universal property of polynomials \(a[v]\) is that it is an initial algebra over @a@.
-- Properties:
--
-- 1. @'var'@ should be an injection
-- 2. @'evalPoly' f@ should be an algebra homomorphism for all @f@
class Algebra a p => Poly a v p | p -> a, p -> v where
  -- | Injection
  var :: v -> p
  -- | Initial map
  evalPoly :: Algebra a x => (v -> x) -> p -> x

type UPoly a p = Poly a () p

evalUPoly :: (UPoly a p, Algebra a x) => p -> x -> x
evalUPoly p x = evalPoly (const x) p

poly :: Poly a v p => (forall x. Algebra a x => (v -> x) -> x) -> p
poly f = f var

uPoly :: UPoly a p => (forall x. Algebra a x => x -> x) -> p
uPoly f = f (var ())

castPoly :: (Poly a v p, Poly a v q) => p -> q
-- | I guess it should follow from properties of @'Poly'@ that @'castPoly'@ is an isomorphism
castPoly = evalPoly var

castPoly' :: (Poly a v p, Poly a v' p') => (v -> v') -> p -> p'
castPoly' f = evalPoly (var . f)

embedUPoly :: (UPoly a p, Poly a v q) => v -> p -> q
embedUPoly x = castPoly' (const x)

Univariate polynomials are basically a collection of coefficients. Common API of various collections needs further investigation, but at least we can state that

type UPoly1 p = (forall a. UPoly a (p a), Functor p, Foldable p)

Multivariate polynomials on the other hand are more complex. We can start with

class (forall a v. (Ring a, c v) => Poly a v (p a v)) => MPoly2 c p where
  bimapPoly :: c v' => (a -> a') -> (v -> v') -> p a v -> p a' v'

subPoly :: (MPoly2 c p, Ring a, c v') => p a v -> (v -> p a v') -> p a v'
subPoly p f = evalPoly f p

P.S. Actually, univariate polynomials are also MPoly2 c p with c v = (v ~ ()).

echatav commented 2 months ago

If we could unify (multivariate & univariate) types then we shouldn't need a typeclass. In #177 I rewrote multivariate monomials and polynomials without extra classes, the biggest difference being I made polynomials a map-of-maps as well as some optimizations for linear combination & monomial evaluation.

vlasin commented 2 months ago

On the other hand, univariate polynomials must be optimized for performance, and they have their specific versions for some operations like multiplication and division. We have FFT-based multiplication for univariate polynomials.

TurtlePU commented 2 months ago

@echatav, having a single type would surely be most optimal, however univariate polynomials enjoy more optimizations (e.g. multiplication via FFT and via Karatsuba algorithm) which multivariate do not. If this can be incorporated into a single datatype cleanly, then sure, why not