AlgebraicJulia / GraphicalLinearAlgebra.jl

Theories of linear algebra and graphical linear algebra
MIT License

Computing functorially with linear relations #3

Open jpfairbanks opened 4 years ago

jpfairbanks commented 4 years ago

@epatters @mehalter, I came up with this approach for computing with linear relations. It relies on the idea that a LinRel f: R^n -> R^m can be stored as a basis of a subspace of R^{n+m}. If this is right, we can use the @instance macro and the functor function to start computing some semantics for circuits, Markov chains, chemical systems, etc.

module LinRel
using LinearAlgebra

import Base: size

struct Basis{T}
    columns::Matrix{T}  # the columns span a subspace
end

size(b::Basis, i::Int) = size(b.columns, i)

# A linear relation R^n -> R^m, spanned by the paired columns
# (dom.columns[:, i], codom.columns[:, i]) of R^{n+m}.
struct RelBasis{T}
    dom::Basis{T}
    codom::Basis{T}
end

RelBasis(a::Matrix, b::Matrix) = RelBasis(Basis(a), Basis(b))
RelBasis(a::Basis, b::Matrix) = RelBasis(a, Basis(b))
RelBasis(a::Matrix, b::Basis) = RelBasis(Basis(a), b)

function compose(v::RelBasis, u::RelBasis)
    A = v.dom.columns
    B = v.codom.columns
    C = u.dom.columns
    D = u.codom.columns
    # Least-squares solve B*X ≈ C; column i of C lies in the range of B
    # exactly when the residual of column i vanishes.
    X = B \ C
    @show X
    @show B*X
    @show C - B*X
    @show matches = sum(abs.(C - B*X), dims=1) .< 1e-12
    # Keep the (domain, codomain) pairs whose middle legs match up.
    vecs = []
    for (i, b) in enumerate(matches)
        if b
            push!(vecs, (A*X[:, i], D[:, i]))
        end
    end
    @show vecs
    A₀ = hcat(map(first, vecs)...)
    D₀ = hcat(map(last, vecs)...)
    return RelBasis(A₀, D₀)
end

# Monoidal product: direct sum of subspaces, as a block-diagonal matrix.
function otimes(v::Basis, u::Basis)
    a = v.columns
    b = u.columns
    T = eltype(a)
    n, m = size(a)
    k, l = size(b)
    return vcat(hcat(a, zeros(T, n, l)), hcat(zeros(T, k, m), b))
end

otimes(v::RelBasis, u::RelBasis) = RelBasis(otimes(v.dom, u.dom), otimes(v.codom, u.codom))

#tests

u = [1.0  1;
     1    0]
v = [1.0  1;
     1    0;
     1   -1]
w = [1.0  1 2;
     1    3 1;
     1   -1 0]
y = [1.0  1 1;
     3    5 1]

h = compose(RelBasis(u, v), RelBasis(w,y))

println(h.dom.columns)
println(h.codom.columns)

h = otimes(RelBasis(u, v), RelBasis(w,y))
println(h.dom.columns)
println(h.codom.columns)

end
jpfairbanks commented 4 years ago

@philzook58 pointed out some relevant ideas:

  1. http://www.philipzucker.com/linear-relation-algebra-of-circuits-with-hmatrix/
  2. https://github.com/JuliaPolyhedra/Polyhedra.jl
  3. Lecture notes on polyhedral algorithms https://inf.ethz.ch/personal/fukudak/lect/pclect/notes2015/PolyComp2015.pdf
  4. Sketch of Julia Implementation https://gist.github.com/philzook58/51289c6b016b30d7a6ec75bdb247b945
jpfairbanks commented 4 years ago

I think this is a better way to do the linear algebra.

Let

f: R^n -> R^m
g: R^m -> R^k

(x, y) in f iff Ax = By
(y, z) in g iff Cy = Dz

that is,

[A -B] [x] = 0   and   [C -D] [y] = 0
       [y]                    [z]

We can compute the composition as below:

[ A -B  0 ] [x]   [0]
[ 0  C -D ] [y] = [0]
            [z]

Compute the nullspace of this stacked matrix, then drop the rows corresponding to y (the columns occupied by B and C), keeping the x and z blocks.
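
A minimal Julia sketch of that recipe (the name compose_rel and the (A, B) pair representation are just for illustration here; nullspace is the LinearAlgebra routine):

using LinearAlgebra

# f is given by (A, B) with (x, y) in f iff A*x == B*y; likewise g by (C, D).
function compose_rel(A, B, C, D)
    n, m, k = size(A, 2), size(B, 2), size(D, 2)
    p, q = size(A, 1), size(C, 1)
    # Stack [A -B 0; 0 C -D] and compute its nullspace.
    M = vcat(hcat(A, -B, zeros(p, k)),
             hcat(zeros(q, n), C, -D))
    N = nullspace(M)
    # Keep the x and z blocks, dropping the middle y block.
    X = N[1:n, :]
    Z = N[n+m+1:end, :]
    return X, Z  # paired columns spanning {(x, z) : exists y with Ax = By and Cy = Dz}
end

Note the output (X, Z) is the same paired-columns representation as RelBasis above.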

philzook58 commented 4 years ago

Another comment:

If one can get polynomial matrices to work, a la https://github.com/neveritt/PolynomialMatrices.jl, it opens up a lot of interesting applications such as control systems, signal processing, and inductor/capacitor circuits. Getting this kind of overloading to work is the sort of thing that excites me about Julia.

philzook58 commented 4 years ago

I'm trying things out over here: https://github.com/philzook58/Rel.jl/blob/master/linrel_julia.ipynb. I tried disentangling the subspace representation from the relational parts and renamed the representations to NullSpace and RangeSpace.

I like James' version where you keep the "input" and "output" of the relation separate. It does add some clarity to some constructions.

I'm struggling a bit to use Polyhedra.jl. Also, people may want to check out the related package LazySets.jl (https://github.com/JuliaReach/LazySets.jl), a reasonable backend for some subcategory of Rel (Convex?).

epatters commented 4 years ago

Cool, thanks for sharing this. If I understand correctly, meet and join on LinSpace should both return a LinSpace, but they currently return a NullSpace and a RangeSpace, respectively. I guess you're still working out the underlying representation.

One question is whether to always maintain a basis for the space or simply a set of spanning vectors.
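
For concreteness, here is one way meet and join could go on subspaces given by spanning columns (a sketch; join_span and meet_span are names made up here, not the package's API). Note that join_span returns spanning vectors rather than a basis, which is exactly the trade-off in question:

using LinearAlgebra

# Join: the sum U + V is spanned by the concatenated columns.
join_span(U, V) = hcat(U, V)

# Meet: any x in the intersection satisfies x = U*a = V*b, so read the
# coefficients a off the nullspace of [U -V] and map back through U.
function meet_span(U, V)
    N = nullspace(hcat(U, -V))
    a = N[1:size(U, 2), :]
    return U * a
end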

jpfairbanks commented 4 years ago

@epatters @mehalter, what if we took the GLA implementation we have for LinearMaps and constructed LinRel out of LinMap using Spans?
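
Something like this, perhaps (a hypothetical sketch; the Span type and field names below are not existing GLA code):

# A linear relation R^n -> R^m as a span of linear maps out of a common apex.
struct Span{T}
    left::Matrix{T}   # apex -> R^n
    right::Matrix{T}  # apex -> R^m
end

# The relation is the image of x -> (left*x, right*x);
# composition would be computed by pullback, as discussed below.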

epatters commented 4 years ago

We had a productive conversation about this on Slack today. It might be helpful to record the highlights.

jpfairbanks commented 4 years ago

Highlights from my perspective; @philzook58, let me know if I missed anything. I changed the notation to put it in the context of pullbacks for composition of relations as spans.

  1. When constructing linear relations, we build spans of linear maps, just as Rel is Span(Set).
  2. In order to compose linear relations, we need to form pullbacks of cospans of linear maps.

(attached image: pullback diagram for composing linear relations as spans)

jpfairbanks commented 4 years ago

I think this allows you to compute K = B⁺range(C) == C⁺range(B), which you can get from rank-revealing QR and the SVD. If we can reduce big matrix inversions (taking the converse relation of a linear map) to a network of tiny SVDs and matrix-matrix products, that is really cool.

jpfairbanks commented 4 years ago

Evan pointed out that I just used the inverse instead of the pseudoinverse. So the above only works when the matrices are full rank, i.e., for relations that are also linear maps.

philzook58 commented 4 years ago

I've been circling back around to this idea of spans and pullbacks in FinVect. It's very helpful to my understanding, thank you.

I think there are layers to disentangle here. Asking how to construct pullbacks in FinVect, in the absence of any discussion of linear relations or spans, is interesting in and of itself.

I agree with the nullspace characterization of the pullback (equation 2). I don't see the necessity of or desire for pseudoinverses at that point, however. The stacked matrix can be constructed, the nullspace routine can be called, and then the appropriate projections F and G can be read directly out of slices of the returned matrix. The apex object of the pullback is the vector space whose dimension is the number of columns of this matrix.

I have a prototype I've been fiddling with in (and here is where you all cover me in gas and set me aflame) Python.

I'm representing the objects as dimensions of the vector spaces and the morphisms of FinVect as numpy matrices. In fact, I'm subclassing ndarray with FinVect properties.

    import numpy as np
    from scipy.linalg import null_space  # note: numpy itself has no null_space routine

    def pullback(f, g):
        assert f.cod == g.cod  # must have a common codomain
        # Horizontally stack the two maps; the pullback apex is null([f -g]).
        null = null_space(np.hstack([f, -g]))
        p1 = FinVect(null[:f.dom, :])  # projection onto the domain of f
        p2 = FinVect(null[f.dom:, :])  # projection onto the domain of g

        # These projections form a commutative square.
        # This should be true by construction, assuming I didn't goof up.
        assert np.allclose(f @ p1, g @ p2)

        def univ(u, v):
            assert u.dom == v.dom
            assert np.allclose(f @ u, g @ v)  # given a new square; this means u, v have to inject into the nullspace

            assert np.allclose(p1.T @ u, p2.T @ v)  # this should follow? I don't super see why.
            return p1.T @ u
            '''
            u == p1 @ e and v == p2 @ e: are the triangle conditions fulfilled?

            f @ p1 == g @ p2: the pullback square commutes.

            question: does e == p1.T @ u imply e == p2.T @ v?

            p1 @ p1.T @ u; is p1 @ p1.T ~ I, or a projection?

            v == p2 @ p1.T @ u ?
            g @ v == f @ u == g @ p2 @ p1.T @ u == f @ p1 @ p1.T @ u
            f @ u == f @ p1 @ p1.T @ u  -- we're never gonna be able to delete f from here?
            '''

        return null.shape[1], p1, p2, univ

What I feel very uncomfortable about is my construction of the universal property. I think it might make sense, but it isn't clear why exactly.

epatters commented 4 years ago

Right, I think we are all in agreement that pseudoinverses aren't much help here and that we have to work with the nullspaces directly. Julia, like SciPy, has a nullspace routine that computes a basis of the nullspace. That said, I remember that @jpfairbanks asked on the Julia Slack about computing nullspaces and was told that it is generally tricky due to numerical instability. Maybe he can elaborate.

Anyway, let's try to understand the universal property. Using James' notation above, the universal property of the span X <(F)- K -(G)> Y (pardon my ASCII art) is that whenever you have another span X <(M)- L -(N)> Y giving a commutative square w.r.t. the cospan X -(B)> W <(C)- Y, there exists a unique map H: L -> K such that F H = M and G H = N. In your code, you construct a basis of the nullspace of [B -C] and identify K with the standard vector space whose dimension is the nullity of [B -C]. This forces the maps F and G to be projections of those basis vectors, which is exactly what you have. Now, if you have maps M and N as above, their stacked columns belong to the nullspace and hence can be uniquely expressed in terms of the basis you computed. The matrix of coefficients of this basis expansion is the matrix H satisfying the universal property.

I think that makes sense. Does it make sense to you?
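
Here is a Julia transliteration of that argument, as a sketch (nullspace returns an orthonormal basis, which is why a transpose recovers the coefficient matrix H):

using LinearAlgebra

# Pullback of the cospan X -(B)> W <(C)- Y.
function pullback(B, C)
    V = nullspace([B -C])  # orthonormal basis; the apex K has dimension size(V, 2)
    nB = size(B, 2)
    F = V[1:nB, :]         # leg K -> X
    G = V[nB+1:end, :]     # leg K -> Y
    # Given another cone (M, N) with B*M == C*N, the unique H with
    # F*H == M and G*H == N is the coefficient matrix of [M; N] in the basis V.
    univ(M, N) = V' * vcat(M, N)
    return F, G, univ
end

As a sanity check: for any commuting cone (M, N), F * univ(M, N) ≈ M and G * univ(M, N) ≈ N.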

jpfairbanks commented 4 years ago

From the numerical perspective, that nullspace routine is based on the SVD, which is stable enough for this kind of thing and allows for somewhat smooth approximations by truncating the SVD. So you can compute an "approximate nullspace".
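
Julia's built-in nullspace already accepts atol/rtol keywords, but the truncation idea is easy to spell out directly; a sketch:

using LinearAlgebra

function approx_nullspace(A; atol=1e-8)
    F = svd(A; full=true)
    r = count(>(atol), F.S)  # numerical rank: singular values above the cutoff
    return F.V[:, r+1:end]   # right singular vectors spanning the approximate nullspace
end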

philzook58 commented 4 years ago

@jpfairbanks I agree that the SVD is the right choice here and makes the most sense from a numerical perspective.

@epatters That does help; the way you state it makes a lot more sense, I think. What I wrote above for the universal property was a conjecture, and it is incorrect.


def univ(u, v):
    assert u.dom == v.dom
    assert np.allclose(f @ u, g @ v)  # given a new square; the columns of [u; v] lie in the nullspace

    return null.T @ np.vstack([u, v])
    # == p1.T @ u + p2.T @ v

Here's some maybe-helpful expansion on the reasoning:

Because null_space outputs an orthonormal basis, P = null @ null.T is a projection operator and null.T @ null is the identity. If np.vstack([u, v]) is in the range of null (which is the square-commuting condition), then P @ np.vstack([u, v]) == np.vstack([u, v]).

Expanding P into blocks, this is

p1 @ p1.T @ u + p1 @ p2.T @ v == u
p2 @ p1.T @ u + p2 @ p2.T @ v == v

The universal morphism is e = null.T @ np.vstack([u, v]) == p1.T @ u + p2.T @ v.

The triangle-commuting conditions for the universal morphism then hold:

u == p1 @ e == p1 @ (p1.T @ u + p2.T @ v) == u
v == p2 @ e == p2 @ (p1.T @ u + p2.T @ v) == v
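
A quick numerical check of those identities (a sketch with random data; V plays the role of null, the orthonormal nullspace basis):

using LinearAlgebra

B, C = randn(3, 5), randn(3, 4)
V = nullspace([B -C])            # orthonormal columns
p1, p2 = V[1:5, :], V[6:end, :]
@assert V' * V ≈ Matrix(I, size(V, 2), size(V, 2))  # null.T @ null is the identity

H = randn(size(V, 2), 2)         # a hypothetical mediating map
u, v = p1 * H, p2 * H            # a commuting square, by construction
e = p1' * u + p2' * v            # == null.T @ vstack([u, v])
@assert e ≈ H
@assert p1 * e ≈ u && p2 * e ≈ v # triangle conditions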

epatters commented 2 years ago

I transferred this issue here because this repo is the new home for GLA and I think the discussion is worth preserving.