+++
title = "A primer on monads with Python and Rust"
date = "2021-1-8"
cover = ""
tags = ["Python", "Rust", "functional programming"]
description = "Comparison of defining shared behaviors by OOP and algebraic data types"
showFullContent = false
+++
Python can be seen as a dialect of Lisp with "traditional" syntax - Python for Lisp Programmers, Peter Norvig
All told, a monad in X is just a monoid in the category of endofunctors of X, with product × replaced by composition of endofunctors and unit set by the identity endofunctor. - Categories for the Working Mathematician
Overview
"Monad" is a fancy word that lots of mathematicians and programmers like to throw around, for education or for entertainment. You may have read plenty of articles and blog posts and still not grasped the essence. That's totally fine because I'm also...
Definitions
The second quote above contains four terms: monad, monoid, category and endofunctor. To understand monads, let's take the last three first:
monoid
category
endofunctor
Monoid == semigroup + identity element
So what is a "semigroup"? It is a set together with an associative binary operation. Unlike a group, a semigroup does not need an identity element ε, nor an inverse for every element in the set.
Let's get started with the simplest set and operation: positive integers and addition +.
Associativity: (a + b) + c == a + (b + c)
No identity element: a + b != a and a + b != b
If we add 0 to the set, then the semigroup becomes a monoid.
Identity: a + 0 == a and 0 + a == a
No invertibility: if a != 0, you cannot find b to satisfy a + b == 0
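To make the two lists above concrete, here is a minimal Python sketch that spot-checks the laws on a small sample. The helper names `is_associative` and `is_identity` are made up for this post, and a finite sample can only illustrate the laws, not prove them.

```python
from itertools import product
from operator import add


def is_associative(op, samples):
    """Check (a op b) op c == a op (b op c) for every triple of samples."""
    return all(
        op(op(a, b), c) == op(a, op(b, c))
        for a, b, c in product(samples, repeat=3)
    )


def is_identity(e, op, samples):
    """Check e op a == a and a op e == a for every sample."""
    return all(op(e, a) == a and op(a, e) == a for a in samples)


positives = range(1, 6)   # a finite sample of the positive integers
with_zero = range(0, 6)   # the same sample with 0 added

# Semigroup: addition is associative, but no positive integer is an identity.
assert is_associative(add, positives)
assert not any(is_identity(e, add, positives) for e in positives)

# Monoid: once 0 is in the set, it acts as the identity element.
assert is_identity(0, add, with_zero)

# Strings under concatenation, with "" as identity, pass the same checks.
words = ["ab", "ac", "ad"]
assert is_associative(add, words)
assert is_identity("", add, words)
```

The last two assertions hint at why the notion is useful: the same two laws describe numbers under addition and strings under concatenation.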
In category theory, the "set" in a monoid could be any algebraic structure (e.g. set, group, semigroup etc.), not just a set of numbers. The monoid we discussed above is the monoid of abstract algebra; category theory extends and generalizes this notion. We'll come back to this later.
Category == "objects" and "arrows"
A category is a collection of "objects" that are linked by "arrows". This is a rather abstract definition, so for now we can just think of the "objects" as sets and the "arrows" as functions.
Arrows are also called "morphisms". If f is an arrow from object A to object B, we write: f: A -> B
Note that the sets and functions here include all sets and functions you can imagine: they are not just numbers but could also be strings in Python.
```python
set_a = {"ab", "ac", "ad"}

def func(s: str) -> str:
    if s == "ab":
        return "ac"
    elif s == "ac":
        return "ab"
    else:
        return "ad"
```
Whichever Python string we pass to func, the result is always in set_a, so func is a mapping from the set of all str values to set_a.
If a function is defined for every element of a set A, it is called a total function on A. The func above (on str) and lambda x: x + 1 (on int) are two simple total functions.
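For contrast, a small sketch of a function that is not total may help. The names `successor`, `reciprocal` and `is_defined_at` are invented for this example; the only assumption is that "undefined" shows up in Python as a raised ZeroDivisionError.

```python
def successor(x: int) -> int:
    """Total on int: every integer has a successor."""
    return x + 1


def reciprocal(x: int) -> float:
    """Partial on int: there is no result for 0."""
    return 1 / x


def is_defined_at(fn, x) -> bool:
    """Report whether fn returns a value for x instead of raising."""
    try:
        fn(x)
        return True
    except ZeroDivisionError:
        return False


sample = range(-3, 4)

# successor is defined on the whole sample.
assert all(is_defined_at(successor, x) for x in sample)

# reciprocal is undefined at exactly one point of the sample.
assert [x for x in sample if not is_defined_at(reciprocal, x)] == [0]
```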
Again, the "objects" in a category are not necessarily sets. Formally, a category consists of three things:
objects
morphisms
e.g. f: A -> B denotes the morphism from object A to B
composition of morphisms
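e.g. f: A -> B and g: B -> C can be composed as f ∘ g: A -> C (this post writes composition in diagrammatic order, f first and then g; many texts write the same composite as g ∘ f)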
And the two axioms hold:
Associativity
If f: A -> B, g: B -> C and h: C -> D, then (f ∘ g) ∘ h == f ∘ (g ∘ h)
Identity
For every object X, there exists a morphism id_X: X -> X such that every morphism f: A -> X satisfies f ∘ id_X == f and every morphism g: X -> B satisfies id_X ∘ g == g.
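If we take Python functions as morphisms, the two axioms can be spot-checked in a few lines. This is only a sketch: `compose` and `identity` are helpers written for this post, `compose(f, g)` follows the order used in the text (f first, then g), and comparing outputs on a handful of inputs illustrates the laws rather than proving them.

```python
def compose(f, g):
    """Composition in the order used in the text: apply f first, then g."""
    return lambda x: g(f(x))


def identity(x):
    """The identity morphism: hands its argument back unchanged."""
    return x


def f(s: str) -> int:   # f: A -> B with A = str, B = int
    return len(s)


def g(n: int) -> int:   # g: B -> C with C = int
    return n * 2


def h(n: int) -> str:   # h: C -> D with D = str
    return str(n)


samples = ["", "ab", "monad"]

# Associativity: the grouping of the composition does not matter.
left = compose(compose(f, g), h)
right = compose(f, compose(g, h))
assert all(left(s) == right(s) for s in samples)

# Identity: composing with the identity on either side changes nothing.
assert all(compose(f, identity)(s) == f(s) for s in samples)
assert all(compose(identity, f)(s) == f(s) for s in samples)
```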
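Hmm... doesn't this look a lot like a monoid? Indeed, a category with a single object is exactly a monoid: the morphisms are the elements and composition is the binary operation.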
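Let's take a simple example: the objects are A and B, and the morphisms are f: A -> B and g: B -> A, plus the identities id_A and id_B. If these are the only morphisms, then the composite f ∘ g: A -> A has nothing to be except id_A, and likewise g ∘ f: B -> B has to be id_B. So we get id_A == f ∘ g and id_B == g ∘ f.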
In Python, we can use type hints to model this category:
```python
from typing import Callable, TypeVar

A = TypeVar("A")
B = TypeVar("B")

# Each alias describes the type of one morphism, not a concrete function.
f = Callable[[A], B]
g = Callable[[B], A]
id_A = Callable[[A], A]
id_B = Callable[[B], B]
```
A and B can be any type, such as int or str.
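The aliases only describe shapes, so a concrete instance may help. The sketch below picks A = bool and B = the integers 0 and 1; that choice, and the `compose` helper, are assumptions made for illustration. Here the names f and g are rebound to actual functions that are inverse to each other, and both composites behave as identities.

```python
from typing import Callable


def compose(p, q):
    """Apply p first, then q (the order used in the text)."""
    return lambda x: q(p(x))


# Concrete choice for illustration: A = bool, B = the ints 0 and 1.
f: Callable[[bool], int] = int    # False -> 0, True -> 1
g: Callable[[int], bool] = bool   # 0 -> False, 1 -> True

id_A = compose(f, g)   # bool -> bool
id_B = compose(g, f)   # int -> int

# Both round trips hand every element back unchanged.
assert all(id_A(a) is a for a in (False, True))
assert all(id_B(b) == b and type(id_B(b)) is int for b in (0, 1))
```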
Endofunctor == a functor that maps category C to itself
Functor
Before we get into endofunctors, let's take a look at functors. A functor is a mapping from a category C to a category D, written F: C -> D:
Maps any object A in C to F(A) in D
Maps any morphism f: A -> B in C to F(f): F(A) -> F(B) in D
If the morphism id_A is an identity morphism in C, then F(id_A) must also be an identity morphism in D
The functor should distribute over compositions: F(f ∘ g) == F(f) ∘ F(g)
The key point is that a functor maps not only an object A to F(A) but also a morphism f to F(f). Let's take another category: a single object u8 and morphisms f: u8 -> u8 in Rust.
u8 means uint8: an unsigned integer from 0 to 255
The signature f: u8 -> u8 can be instantiated by many different functions. Let's take the following two as examples:
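```rust
// Panics in debug builds when x == 255, because the sum overflows u8.
fn add1(x: u8) -> u8 {
    x + 1
}

// Panics in debug builds when x > 127, for the same reason.
fn mul2(x: u8) -> u8 {
    x * 2
}
```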
Rust has a type called Option<T>, where T is a generic type parameter. It has a method map, which takes a function of type F: FnOnce(T) -> U.
F: FnOnce(T) -> U is a function type that takes an argument of type T and returns a U. T and U can be the same type or not, and FnOnce means the function is only guaranteed to be callable once, so map calls it at most once. Again, U and T are generic types.
If we ignore the FnOnce, it can be regarded as a morphism F: T -> U. Back to the u8 category above, let's see what Option<T> can do with it.
For the object u8 (which is the only object), the type constructor Option maps it to Option<u8>
For the morphism f: u8 -> u8, can we map it to g: Option<u8> -> Option<u8>?
Well, Option<T>::map applies a function to one Option value right away. There is no method on Option<T> that hands back the lifted function itself, so we have to implement it on our own.
Say we have a function F: Fn(T) -> U; the goal is to convert it to G: Fn(Option<T>) -> Option<U>:
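```rust
// Lift f: T -> U into a function Option<T> -> Option<U>.
fn fmap<T, U>(f: impl Fn(T) -> U) -> impl Fn(Option<T>) -> Option<U> {
    // Passing `&f` lets the lifted closure be called more than once.
    move |a| a.map(&f)
}

fn main() {
    let add1 = |x: u8| x + 1;
    let mul2 = |x: u8| x * 2;
    assert!(fmap(add1)(Some(1)) == Some(2));
    assert!(fmap(add1)(None) == None);
    assert!(fmap(mul2)(Some(2)) == Some(4));
}
```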
It works as expected. Does fmap satisfy the axioms?
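Since this post mixes both languages, here is a Python sketch of that check. It is a model, not the Rust code: Optional[T] stands in for Option<T>, Python's None stands in for Rust's None, and Python integers never overflow the way u8 does. The `compose` helper and the sample values are assumptions made for the example.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")
U = TypeVar("U")


def fmap(f: Callable[[T], U]) -> Callable[[Optional[T]], Optional[U]]:
    """Lift f so that None stays None and any other value goes through f."""
    def lifted(a: Optional[T]) -> Optional[U]:
        return None if a is None else f(a)
    return lifted


def compose(f, g):
    """Apply f first, then g (the order used in the text)."""
    return lambda x: g(f(x))


def add1(x: int) -> int:
    return x + 1


def mul2(x: int) -> int:
    return x * 2


samples = [None, 0, 1, 2, 100]

# Identity axiom: lifting the identity gives the identity on Optional values.
assert all(fmap(lambda x: x)(a) == a for a in samples)

# Composition axiom: F(f ∘ g) == F(f) ∘ F(g).
lhs = fmap(compose(add1, mul2))
rhs = compose(fmap(add1), fmap(mul2))
assert all(lhs(a) == rhs(a) for a in samples)
```

Both axioms hold on the sample. The same argument carries over to Rust's Option, because map leaves None alone and applies the function inside Some.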
With Option and fmap (which is built on map), we can map every object and morphism of the u8 category into a category of Option types. More generally, we can map any Rust type T to Option<T> and any function T -> U to a function Option<T> -> Option<U>, as long as T and U are legitimate Rust types (they could be the same). If we combined these two mappings into a trait, any type that implements the trait could be called a "functor".
Unfortunately, Rust has no higher-kinded types, so it is very difficult, or even impossible, to implement such a Functor trait in full generality.
Endofunctor
Now we understand what a functor is. As mentioned before, an endofunctor is a functor that maps a category C to itself. There are plenty of endofunctors: Option, for one, sends Rust types to Rust types and Rust functions to Rust functions.
I hope you feel comfortable with this. Among all these endofunctors, a "monad" is a special endofunctor M: C -> C that comes equipped with two families of morphisms (natural transformations). These morphisms are:
Unit: X -> M(X)
Join: M(M(X)) -> M(X)
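Here is a minimal Python sketch of these two morphisms for an Option-like type. Everything in it is an assumption made for illustration: the `Some` class wraps a value, Python's None plays the empty case, and `bind` is the usual name for "fmap, then join", which Rust's Option exposes as and_then.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Some:
    """Hypothetical stand-in for a present value; None is the empty case."""
    value: Any


def unit(x):
    """Unit: X -> M(X), wrap a plain value."""
    return Some(x)


def join(mm):
    """Join: M(M(X)) -> M(X), strip one layer of wrapping."""
    return None if mm is None else mm.value


def fmap(f):
    """Lift f: X -> Y to M(X) -> M(Y)."""
    def lifted(m):
        return None if m is None else Some(f(m.value))
    return lifted


def bind(m, f):
    """Apply f: X -> M(Y) inside m, then flatten the nested result."""
    return join(fmap(f)(m))


def safe_reciprocal(x):
    """A function X -> M(Y): it has no answer for 0."""
    return None if x == 0 else Some(1 / x)


# Join collapses two layers into one, whichever layer is empty.
assert join(unit(unit(1))) == unit(1)
assert join(Some(None)) is None
assert join(None) is None

# bind chains a step that may fail without producing nested wrappers.
assert bind(Some(4), safe_reciprocal) == Some(0.25)
assert bind(Some(0), safe_reciprocal) is None
assert bind(None, safe_reciprocal) is None
```

Without join, mapping safe_reciprocal over Some(4) would leave a value wrapped twice; join is what keeps the chain flat.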
Remember that X can be any object in the category C. In fact, Option already has a method and_then: