bottler / iisignature

Iterated integral signature calculations
MIT License
94 stars 18 forks source link

possible bug in logsig? #6

Closed patrick-kidger closed 5 years ago

patrick-kidger commented 5 years ago

I believe the coefficients that are returned in iisignature.logsig as supposed to be the coefficients in the Lyndon basis? In which case clearly the coefficient of e.g. [1, [1, 2]] is just the coefficient of 112 when the logsignature is written expanded in tensor space; in particular we see that every value returned by iisignature.logsig(..., iisignature.prepare(...)) should also appear somewhere in iisignature.logsig(..., iisignature.prepare(...), 'x').

However this does not appear to be the case. To reproduce:

import numpy as np
import iisignature as ii
x = np.random.uniform(size=(3, 3))
v = ii.logsig(x, ii.prepare(3, 3))
vx = ii.logsig(x, ii.prepare(3, 3), 'x')

for i, e in enumerate(v):
    if not any(np.isclose(vx, e)):
        print("Failed on {}".format(i))
bottler commented 5 years ago

Thanks for the question!

The expanded log signature is the sum over hall words w of (coefficient * P_w) where P_w is the basis element of the Free Lie Algebra corresponding to w. You can see section 1.4.2 of my thesis (on page 13) at https://github.com/bottler/phd-docs/blob/master/thesis.pdf for an example. Your intuition is looking at my equation 1.11 and saying "every letter appears on its own somewhere". Alternatively, you are claiming that P_w always contains coefficient 1 in some word (which is true in the Lyndon basis for w itself in fact), and that there is such a word which does not appear in P_u for another Hall word u. This happens not to be the case in general.

In 3 dimensions at level 3, you are observing it to not be the case. Take this program based on yours:

import numpy as np
import iisignature as ii

x = np.random.uniform(size=(3, 3))
v = ii.logsig(x, ii.prepare(3, 3))
vx = ii.logsig(x, ii.prepare(3, 3), 'x')
labels = ii.basis(ii.prepare(3,3))

for i, (e,lab) in enumerate(zip(v,labels)):
    if not any(np.isclose(vx, e)):
        print("Failed on {}".format((i,e,lab)))

I see that it fails only on one basis element, namely [[1,3],2], which corresponds to the Lyndon word 132. Let's investigate. We find P(132). It is [[1,3],2]=[13-31,2]=132-312-213+231. There's another Lyndon word with the same letters in fact. It is 123. P(123) is [1,[2,3]]=[1,23-32]=123-132-231+321. So we see that the only words with coefficient 1 in P(132), namely 132 and 231, also appear in P(123). So there is no word where the coefficient of 132 stands alone.

Your intuition is true for 112 because 112 is a Lyndon word with no anagram Lyndon words.

patrick-kidger commented 5 years ago

Ah, I see! You're quite correct; thanks for the explanation. :)