Add I, IC, SEQ, PSET, MSET, CYC transforms

bawolk commented 8 years ago

This adds the basic transforms from the first part of Sedgewick & Flajolet (see p. 27). The I and IC are helper transforms that are used in restricting the sums or products to selected integers. They represent a "restricted" SEQ transform, i.e., SEQ(f)= 1+f+f^2+f^3+... but I({2*n+1})@f=f+f^3+f^5.... Also,SEQ(f) = I(f) + IC(f). The choice of the names I and IC is mine. Feel free to choose others, although I like the fact that they are short. For example, the Motzkin numbers are given by f=x*(I({0,1,2})@f). (See p, 68 of S&F). It also allows the removal of a separate partition function since PARTITION(f)=MSET(I(f)).

I thought about trying allow 2-ary functions, say RSEQ (I is actually implemented by a function I name rseq), so that RSEQ(f,{2*n+1})=f+f^3+f^5... but I'm not sure it would be any better, nor am I sure we could easily modify the parser to allow it.

bawolk commented 8 years ago

The identity above should have read: SEQ(f) = I(g)@f + IC(g)@f, where f is any series and g is a series of non-negative integers.

akc commented 8 years ago

Would it be possible, and would it make sense, for I and IC to respect Indet? E.g., with prec=10, we would then have I([1,2]) = [0,1,1,0,0,0,0,0,0,0], but I({1,2}) = [0,1,1]. If so, it would probably make sense to revisit (?) as well. Currently, rseq and (?) has a common part, namely

dropTrailing 0 [ x | x <- intPrefix f, x >= 0, x < precision f ]

I guess that's the part that would need be factored out and generalized/modified.

A few other minor things:

What does rseq stand for?
In rseqComp: \k -> if k == 0 then 1 else 0 can be simplified to (1-).
Could you add the nice identity SEQ(f) = I(g)@f + IC(g)@f to test/Properties.hs?

bawolk commented 8 years ago

I must admit that in proposing these transforms I did not consider the Indet issue. It might make sense to respect it for I and IC, but you would be the best judge of that. I see no harm in implementing it. I don't think (?) needs revisiting though. That function simply pulls out coefficients, based on a list of integers.

rseq stand for "restricted sequence". SEQ(f)=1+f+f^2+f^3+.... On the other hand, rseq([2*n+1])@f=f+f^3+f^5+..., i.e., it is a lot like SEQ, but "restricted" to certain powers.
Your rseqComp is indeed a lot simpler! But we will have to revise it to hand Indet.
I can add the identity, but if we respect Indet it will only hold if g has full precision. Should I revise the pull request accordingly?

akc commented 8 years ago

I've been thinking about this, but cannot quite see a clean solution yet.

I'm no longer sure if I, IC or (?) needs changing any more, but what I would like to see is that SEQ, PSET, MSET and CYC respect Indet.

E.g. if we have two objects of size 1 and one object of size 2 and no other objects, then PSET counts sets of such objects:

PSET([0,2,1]) = [1,2,2,2,1,0,0,0,0,0,0,0,0,0,0]

On the other hand, if we have two objects of size 1 and one object of size 2 and an unspecified number of larger objects then we would like

PSET({0,2,1}) = [1,2,2]

as we don't know how many objects of combined size 3 or bigger we can build.

It seems to me that the part that needs to be generalized is

powerset :: KnownNat n => Transform n
...
  where
    cs = intPrefix f
    gs = zipWith (\k c -> (1 + xpow k) ^ c) [1::Int ..] (tail cs)

I.e. we need to use all coefficients rather than the intPrefix.

If you can see how to solve this then please do update the pull request accordingly.

bawolk commented 8 years ago

SEQ and CYC already respect Indet since they operate directly on the input series. As for PSET and MSET, by the nature of the calculation, an Indet term in the input series can only affect the result beginning with that term's degree. Thus all we have to do is "mask" the result by setting to Indet any term with degree greater than or equal to the degree of the first Indet term. So now we do have:

$ hops 'PSET([0,2,1])'
{"hops":"PSET([0,2,1])","seq":[1,2,2,2,1,0,0,0,0,0,0,0,0,0,0]}
$ hops 'PSET({0,2,1})'
{"hops":"PSET({0,2,1})","seq":[1,2,2]}

I did add a test for the identity SEQ(f) = I(g)@f + IC(g)@f but I'm not very adept at QuickCheck so I wasn't sure how to to create random g's where all the terms are positive integers, so I tested with g = {2*n+1}. Also, SEQ, CYC, MSET, and PSET only make sense for an input series with the constant term = 0. To guarantee this for the QuickCheck tests, I tested the identity for x*f.

akc commented 8 years ago

I rewrote the definition of mask to propagate DZ (as well as Indent). E.g. now this works:

$ hops 'PSET({0,2,1,1/0}) .* {1,1,1,0}'
{"hops":"PSET({0,2,1,1/0}).*{1,1,1,0}","seq":[1,2,2]}

$ hops 'PSET({0,2,1}) .* {1,1,1,0}'
{"hops":"PSET({0,2,1}).*{1,1,1,0}","seq":[1,2,2,0]}

I've also added a few test based on the identities on p. 27 in Sedgewick & Flajolet.

akc / hops

Add I, IC, SEQ, PSET, MSET, CYC transforms #6