Closed fuglede closed 9 years ago
Description changed:
---
+++
@@ -2,7 +2,7 @@
As per [this discussion on sage-devel](https://groups.google.com/forum/#!topic/sage-devel/ByJURRvRgLg), these commits contain methods for Braid and BraidGroup_class to allow for the calculation of the Temperley--Lieb--Jones representations of braid groups. While interesting in their own right, this also allows for a straightforward implementation of the (uncoloured) Jones polynomial of trace closures of braids, also included, which for a fixed number of braid strands is polynomial in the number of crossings. The algorithm uses Kauffman's diagrammatic description of the Temperley--Lieb algebra.
-It should be noted that there is some overlap here with [Ticket #17030](https://github.com/sagemath/sage-prod/issues/17030) but as discussed on sage-devel, this is not necessarily a problem (although of course any code that is duplicated exactly should be coordinated between that ticket and this one).
+It should be noted that there is some overlap here with #17030 but as discussed on sage-devel, this is not necessarily a problem (although of course any code that is duplicated exactly should be coordinated between that ticket and this one).
One silly todo is to add my own name (Søren Fuglede Jørgensen) to the list of authors. No matter how I've tried to do this, the build fails; probably due to encoding issues, I would imagine (but simply translating the string to unicode does not solve the issue).
Branch pushed to git repo; I updated commit sha1. New commits:
3e9d744 | Added a few comments on runtime in practice |
Branch pushed to git repo; I updated commit sha1. New commits:
679f5c4 | Assume sparsity of Jones representations |
In order to build with utf-8 characters in the source you need to declare the encoding at the very top of the file. Just add as the very first line
# -*- coding: utf-8 -*-
More information on https://www.python.org/dev/peps/pep-0263/
In case nobody has said it yet, welcome to Sage.
I know this isn't yet set to needs review, but some quick comments from a look-over (from my curiosity):

- Could you remove the changes to `diagram_algebras.py`? This will be taken care of in #18720 (which does some large refactoring of the classes).
- For latex, use single backticks, e.g., `A + B \neq 5`, instead of dollar signs.
- Make the first sentence for method definitions short and succinct. For example:

    def foo(x):
        r"""
        Return bar.

        This is where you put a longer explanation. This is a python convention and
        makes it clear what the method does and then has this information for the
        interested user who wants more details.
        """

- Also note the `r"""` for the beginning of the docstring; this makes it into "raw" format, where the string gets interpreted as written, whereas when you don't have it, backslash `\` gets interpreted as a special character (e.g. `\n` is a newline). For what you have here, this does not appear to be necessary, but I always include it anytime I have latex code for extra safety/my paranoia.
- For `markov_trace`, I would write it like this:

    def markov_trace(self, variab='A', ring=IntegerRing()):
        R = LaurentPolynomialRing(ring, variab)
        A = R.gens()[0]
        delta = -A**2 - A**(-2)
        n = self.strands()
        def weighted_trace(b, d):
            return (A**(2*d) - A**(-2*d)) / (A**2 - A**(-2)) * b.TL_matrix(d, variab, ring).trace()
        trace = sum(weighted_trace(self, d) for d in range(n+1) if (n+d) % 2 == 0)
        return trace / (-delta)**n

- You should try to avoid abbreviations in method names, as abbreviations make them slightly harder to discover with tab completion, and full names can be expanded quickly using tab completion anyway.
- For the caching of polynomials, I would have a private method which does the computation in a specific ring with a specific variable, cache that, and have the public method take that output and do a `subs` on the result from the private method. This way you don't cache the "same" polynomial over and over again (and your private methods can call each other knowing what their output will be). For an optimization, the defaults for the public method could be `None`, and if they are both `None`, then return immediately the result of the private method.

Branch pushed to git repo; I updated commit sha1. New commits:
14c6129 | Add myself to list of authors |
79ce5f4 | Various fixes to TL-representation related docs |
a481855 | Simplify and de-generalize Markov trace method |
ff49dd1 | Revert "Clean-up; loops changed to list comprehensions" |
7861a72 | Change $ to ` in LaTeX expressions |
Branch pushed to git repo; I updated commit sha1. New commits:
182b298 | Add example to TL representation calculation |
Replying to @videlec:
In order to build with utf-8 characters in the source you need to declare the encoding at the very top of the file. Just add as the very first line
# -*- coding: utf-8 -*-
More information on https://www.python.org/dev/peps/pep-0263/
Thanks, done!
Replying to @tscrim:
In case nobody has said it yet, welcome to Sage.
Thanks!
I know this isn't yet set to needs review, but some quick comments from a look-over (from my curiosity):
Thanks for the comments! I'll go through them individually.
- Could you remove the changes to `diagram_algebras.py`? This will be taken care of in #18720 (which does some large refactoring of the classes).
Done.
- For latex, use single backticks, e.g., `A + B \neq 5`, instead of dollar signs.
Done. This is also an issue in all existing methods in this module though. Moreover, the developer guide currently allows the use of dollar signs.
- Make the first sentence for method definitions short and succinct. For example:

    def foo(x):
        r"""
        Return bar.

        This is where you put a longer explanation. This is a python convention and
        makes it clear what the method does and then has this information for the
        interested user who wants more details.
        """
I made several of them shorter.
- Also note the `r"""` for the beginning of the docstring; this makes it into "raw" format, where the string gets interpreted as written, whereas when you don't have it, backslash `\` gets interpreted as a special character (e.g. `\n` is a newline). For what you have here, this does not appear to be necessary, but I always include it anytime I have latex code for extra safety/my paranoia.
Yep, I checked those before committing. Indeed, `r"""` is not used when not necessary in the existing module documentation either.
- For `markov_trace`, I would write it like this:

    def markov_trace(self, variab='A', ring=IntegerRing()):
        R = LaurentPolynomialRing(ring, variab)
        A = R.gens()[0]
        delta = -A**2 - A**(-2)
        n = self.strands()
        def weighted_trace(b, d):
            return (A**(2*d) - A**(-2*d)) / (A**2 - A**(-2)) * b.TL_matrix(d, variab, ring).trace()
        trace = sum(weighted_trace(self, d) for d in range(n+1) if (n+d) % 2 == 0)
        return trace / (-delta)**n
- You should try to avoid abbreviations in method names, as abbreviations make them slightly harder to discover with tab completion, and full names can be expanded quickly using tab completion anyway.
Makes sense: the logic behind including `qint` as a separate function was to make the expression look both more familiar to those who know the formula and a bit less clumsy; the abbreviation was mainly there to ensure that the expression would stay within the 79 columns of PEP 8. Anyway, I changed it to something more or less identical to what you suggest. Possibly worth noting is that if more quantum topology makes its way into Sage, this particular method would probably be refactored as a result.
- For the caching of polynomials, I would have a private method which does the computation in a specific ring with a specific variable, cache that, and have the public method take that output and do a `subs` on the result from the private method. This way you don't cache the "same" polynomial over and over again (and your private methods can call each other knowing what their output will be). For an optimization, the defaults for the public method could be `None`, and if they are both `None`, then return immediately the result of the private method.
I'm not sure that I see for which parts of the added code you want to do this: for `Braid.jones_polynomial` in particular, or more generally? Indeed it would seem to me that no matter the answer, the same would have to be done for all existing methods in the module (and as such, doing it through decoration might prove beneficial?). Also, even though I certainly understand the point of what you're saying, I don't think I can come up with a use case where it makes a difference (but my imagination might be largely limited to what I've had to do with the classes myself already).
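For what it's worth, the suggested pattern can be sketched in plain Python (hypothetical names; actual Sage code would use `@cached_method` and substitute into a Laurent polynomial rather than a coefficient tuple):

```python
from functools import lru_cache

# Minimal sketch of the pattern: cache the expensive computation once in a
# fixed "default" form, and let the public method substitute into the cached
# result on demand.

@lru_cache(maxsize=None)
def _jones_default(n):
    """Stand-in for the expensive computation in a fixed ring/variable:
    here just the coefficient tuple of a toy polynomial depending on n."""
    return tuple(range(n + 1))

def jones(n, x=None):
    coeffs = _jones_default(n)
    if x is None:
        # Optimization: default argument -> return the cached result directly.
        return coeffs
    # The "subs" step: evaluate the cached coefficients at the requested value.
    return sum(c * x**i for i, c in enumerate(coeffs))
```

With this layout, repeated calls for different variables all reuse the single cached default computation instead of caching one copy per variable.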
Branch pushed to git repo; I updated commit sha1. New commits:
ff004ad | Fix typo in documentation |
The biggest thing is `create_TL_rep` (which I would perhaps call `TL_representation` if this is a common abbreviation; otherwise I'd spell it out as `temperley_lieb_representation`). Suppose you create it with the default, but then you want it in a different variable name: now it has to do the computation all over again (and caches both results, creating redundancy). This could come up if someone wants to look at this polynomial over a large range of finite fields.
Yeah, looking at it for a collection of different rings certainly makes sense. I don't think I've seen anyone actually consider that, though.
One thing that is common, though, is to not view the entries as elements of the Laurent polynomial ring but rather a quotient thereof (since this is what generalizes to higher genus surfaces in TQFT), so there might be a point in allowing for some freedom in that direction.
It's worth noting, also, that the calculation of the representation on the generators is reasonably fast compared to actually manipulating the resulting matrices: I know that the point is to save both memory and time, but in my testing, it is significantly faster (roughly a factor of ~20) to calculate the representation in the desired ring to begin with than it is to calculate it for `IntegerRing` and then use `subs` entry-wise to obtain a matrix with entries in the desired ring.
Finally, I should note that an earlier version didn't actually use the caching decorator. It's included now since it can play a role for braid groups on a large number of strings. When the number of strings is large, it is the combinatorial part, i.e. the first loop in `create_TL_rep`, that takes up most of the time (in my testing, for the braid group on 14 strings, roughly 75% of the time is spent there). Putting together these observations, to accommodate the use case where one wants to save memory while working with a large number of base rings, it would be sensible to refactor and cache only the first part of the method, reconstructing the actual matrices on every execution. This will, though, slow down what I think is the more common use case of manipulating only the representation over `IntegerRing`, so I'm not a big fan of doing that.
Branch pushed to git repo; I updated commit sha1. New commits:
57255ad | Add more comments to TL representation algorithm |
Branch pushed to git repo; I updated commit sha1. New commits:
83b247a | Rename create_TL_rep to TL_representation |
I ran a test matching the Jones polynomial values against known values: they agree for all 2961 knots with up to 12 crossings from KnotInfo, computed in 69 seconds (no mismatches were found).
Branch pushed to git repo; I updated commit sha1. New commits:
8202cf5 | Fix typo in documentation |
That's why I suggested doing the optimization of just returning the polynomial for the default values (rather than running a useless `subs`). Sometimes taking a slowdown in smaller examples is worth it for the significant speedup in larger examples. Without seeing your timings, I can't tell (I don't have access to Sage right now). However, it sounds like refactoring the code and caching only the result of the first loop is best.
For variables/rings, from my experience it is more common to take as input the variable, rather than creating the ring itself in the method (unless there is a default input). See, e.g., `sage.combinat.q_analogues`.
Also a micro-optimization: `if len(forest) == 0:` is slower than `if not forest:`.
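For illustration, the two checks can be compared directly in plain Python (exact timings vary by machine; for any list the two checks agree on the result):

```python
import timeit

# The two emptiness checks being compared: `not forest` skips the len() call
# and the integer comparison, so it is typically a bit faster.
forest = []
t_len = timeit.timeit('len(forest) == 0', globals={'forest': forest}, number=10**6)
t_not = timeit.timeit('not forest', globals={'forest': forest}, number=10**6)

# Semantically they are equivalent for lists, empty or not:
assert (len(forest) == 0) == (not forest)
assert (len([1, 2]) == 0) == (not [1, 2])
```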
What I meant was that if we were to cache only the first loop, we would see some slowdown for the default values as well, which we do not if we cache the entire thing.
However I could very easily cause things to be doubly/triply cached very easily by having a division somewhere, which typically pushes things into fraction fields. Somewhere I think we have to take a slowdown for more robust code.
Side note: doing a direct call of the polynomial should be faster than `subs`, from looking at the code. However, I think we need a better/faster way to simply convert a polynomial from one ring to another.
Some timings for conversion between univariate laurent polynomial rings:
sage: R.<t> = LaurentPolynomialRing(ZZ)
sage: f = (t^-2+t^2) ^2 + 5*t -1
sage: f
t^-4 + 1 + 5*t + t^4
sage: S.<a> = LaurentPolynomialRing(IntegerModRing(7))
sage: d = f.dict()
sage: d
{-4: 1, 0: 1, 1: 5, 4: 1}
sage: %time f.subs(t=a)
CPU times: user 9 ms, sys: 1 ms, total: 10 ms
Wall time: 9.29 ms
a^-4 + 1 + 5*a + a^4
sage: %time S(f)
CPU times: user 1 ms, sys: 0 ns, total: 1 ms
Wall time: 1.31 ms
a^-4 + 1 + 5*a + a^4
sage: %time S(d)
CPU times: user 1 ms, sys: 0 ns, total: 1 ms
Wall time: 1.02 ms
a^-4 + 1 + 5*a + a^4
sage: %time S(d)
CPU times: user 1 ms, sys: 0 ns, total: 1 ms
Wall time: 293 µs
a^-4 + 1 + 5*a + a^4
sage: d1 = (f+1).dict()
sage: %time S(d1)
CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 211 µs
a^-4 + 2 + 5*a + a^4
sage: f1 = f+1
sage: %time S(f1)
CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 147 µs
a^-4 + 2 + 5*a + a^4
Definitely `.subs` is not a good idea. Direct conversion or construction from the underlying dictionary are much faster.
Of course, this might depend a lot on the rings involved, and the dictionary approach might not work if we talk about fraction fields or multivariate polynomial rings.
BTW, even if some complicated conversions need to be done, `.subs` is still slow compared to a direct call:
sage: %time f(1/(a+1))
CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 450 µs
sage: %time f(a)
CPU times: user 1e+03 µs, sys: 0 ns, total: 1e+03 µs
Wall time: 476 µs
Note that the first one gives as result an element of the fraction field of S.
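The dictionary route favoured by the timings above can be mimicked in plain Python: a univariate Laurent polynomial is just an exponent-to-coefficient map, so changing the coefficient ring is a single dictionary comprehension (a sketch mirroring `S(f.dict())` above; the mod-7 reduction stands in for coercion into `IntegerModRing(7)`):

```python
# f = t^-4 + 1 + 5*t + t^4 over ZZ, represented as exponent -> coefficient.
f = {-4: 1, 0: 1, 1: 5, 4: 1}

# "Convert" to coefficients in Z/7Z by mapping each coefficient; the exponents
# are reused as-is, which is why this path avoids the generic subs machinery
# (no term-by-term substitution and re-expansion is performed).
g = {e: c % 7 for e, c in f.items()}
```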
Note that there actually aren't any fractions involved a priori here, although it does seem like the code is currently producing silly "fractions" like `-1/A^2` (rather than `-A^-2`) that actually belong to the ring itself.
So how's the following: we always cache the first part of the algorithm which is independent of the ring. For the second part, we cache the result if and only if no ring is specified. Also, determining the base ring from the parent of the variable seems like a reasonable idea to me.
As a general principle, I think that would be the best choice.
But maybe for some particular cases it would make sense to cache the final result. That depends, in this case, on what the most usual scenario would be.
If the usual scenario is that the user computes most Jones polynomials in the same variable, then caching the final result produces no double caching, and we would win speed (since we don't have to convert from the cached value). Besides, if these computations will usually be called a lot of times (i.e. they will be put in a loop), that small speed gain can actually become important.
If, on the other hand, the expected situation is a user computing Jones polynomials in many different rings, then the double caching (with its consequent cache misses) will actually become a big problem.
In this case, since it seems that computing the Jones polynomial is much slower than converting from a cached value, I think the risk of cache misses outweighs the advantage of not needing to convert from the cached value. So I would vote to cache the result in a fixed ring, and then convert to whichever ring the user wants.
Branch pushed to git repo; I updated commit sha1. New commits:
465e404 | Introduce partial caching for Jones representation calculations |
Branch pushed to git repo; I updated commit sha1. New commits:
41ab2f3 | More consistent variable names |
The above commits introduce caching in a number of ways, building on your comments. For the TL representation, the actual TL-part of the algorithm (which is independent of choices of variables) is cached in all cases. The actual (memory heavy) result is stored for the typical case where no variable is specified. The Markov traces and Jones polynomials have been set up so that the combinatorial part of the calculation is done over the general ring; the result of this calculation is cached, and when the user specifies a variable, substitution is carried out on this cached result. I think this strikes the perfect balance.
Also, as suggested by tscrim, the paradigm of supplying the variable as a generator of some polynomial ring, rather than specifying a string and the ring, has been adopted.
One thing that I would still like to fix is the fact that the methods occasionally return elements of fraction fields when they really shouldn't. Fixing this boils down to supplying an answer to this ask sagemath question.
Branch pushed to git repo; I updated commit sha1. New commits:
722cdad | Avoid fraction fields whenever possible |
Fixed the issue in my previous comment with some refactoring to force Sage to stay out of the field of fractions of the polynomial ring whenever possible. Even though this appears to require some silly manipulations, it even resulted in a speed gain: the 2961 knots mentioned earlier can now be checked against known data in 38 seconds (down from 69 seconds).
So, what's the process: should I flag as needs review?
As soon as you're ready for us to do a formal review, set it to needs_review.
I bet a good chunk of your speedup comes from the fact that you're not doing division in a polynomial ring (and, in fact, you could remove one additional polynomial multiplication by being smarter about how you do your q-int). There is also floor division `//`, which does not go to the fraction field. However, that is quite an impressive speedup.
Thanks for checking it out and for pointing those out. Using `//` provides a significant speed-up for the q-int calculation as well. Also, it's a complete life-saver in `jones_polynomial`, where I had to work around what appears to be an unrelated bug. The benchmark still runs in roughly the same time, but this is a lot more elegant!
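The "smarter q-int" remark can be illustrated in plain Python (hypothetical `quantum_int`, not the ticket's actual code): the quantum integer `[n]_q = (q^n - q^-n)/(q - q^-1)` equals the geometric sum `q^(n-1) + q^(n-3) + ... + q^(1-n)`, so it can be built with no division at all and hence never leaves the (Laurent) polynomial ring. In the ticket's variables, `q` would play the role of `A^2`.

```python
def quantum_int(n, q):
    """Quantum integer [n]_q as an explicit sum of monomials,
    avoiding the quotient (q**n - q**-n) / (q - q**-1) entirely."""
    return sum(q ** (n - 1 - 2 * k) for k in range(n))

# Sanity checks: [n]_1 = n, and the sum agrees with the quotient at generic q.
assert quantum_int(5, 1) == 5
q = 3.0
assert abs(quantum_int(4, q) - (q**4 - q**-4) / (q - q**-1)) < 1e-9
```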
Branch pushed to git repo; I updated commit sha1. New commits:
67523a3 | Fixes to how quotients of polynomials are handled |
Branch pushed to git repo; I updated commit sha1. New commits:
cd979e1 | Simplify logic for Jones polynomial output |
Could you please add an `EXAMPLES::` section to `_TL_action`?
First off, this is my first ticket here, so please let me know about process related stuff that I missed.
As per this discussion on sage-devel, these commits contain methods for Braid and BraidGroup_class to allow for the calculation of the Temperley--Lieb--Jones representations of braid groups. While interesting in their own right, this also allows for a straightforward implementation of the (uncoloured) Jones polynomial of trace closures of braids, also included, which for a fixed number of braid strands is polynomial in the number of crossings. The algorithm uses Kauffman's diagrammatic description of the Temperley--Lieb algebra.
It should be noted that there is some overlap here with #17030 but as discussed on sage-devel, this is not necessarily a problem (although of course any code that is duplicated exactly should be coordinated between that ticket and this one).
One silly todo is to add my own name (Søren Fuglede Jørgensen) to the list of authors. No matter how I've tried to do this, the build fails; probably due to encoding issues, I would imagine (but simply translating the string to unicode does not solve the issue).
The branch also contains a small bit of code clean-up cf. GH PR #47.
CC: @miguelmarco @amitjamadagni
Component: algebraic topology
Author: Søren Fuglede Jørgensen
Branch/Commit:
37ca1e8
Reviewer: Travis Scrimshaw
Issue created by migration from https://trac.sagemath.org/ticket/19011