Open ChrisJefferson opened 8 years ago
If we use a divide-and-conquer technique, using the fact * is commutative, performance is much improved.
I think you mean associative here. As you use it later. The obvious examples of non-associativity and Lie algebra elements, and more generally elements of algebras given by structure constant tables, but there are plenty of other things in packages. The question I guess is whether people would use Product
for such objects expecting the result to be the left-normed product. A more difficult example is floats, which are non-associative if you care about rounding errors.
The documentation of Product
does not specify the order of operations, so one could argue that the divide-and-conquer implementation is just as correct as the current implementation. What do people feel about this? We could just clarify in the documentation that order of multiplication is not defined, and pehaps add LeftNormedProduct
for which it is defined.
The simple and logical thing to do is to install the divide-and-conquer method for IsAssociativeElementCollection
. Unfortunately this would run afoul of the special short-cut for plain lists which exists precisely to avoid having to check the type of every element of the list before starting. At the cost of making an ugly piece of code uglier, one could check for lists that know they are homogeneous in which case IsAssociativeElementCollection
is quick to check. Perhaps more appealing would be to just check for lists of internal cyclotomics (T_PLIST_CYC) which would capture the case above. For lists of permutations or matrices over finite fields, for instance, divide-and-conquer doesn't gain anything anyway.
On Mon, Mar 21, 2016 at 04:05:32AM -0700, Steve Linton wrote:
The documentation of
Product
does not specify the order of operations, so one could argue that the divide-and-conquer implementation is just as correct as the current implementation. What do people feel about this? We could just clarify in the documentation that order of multiplication is not defined, and pehaps addLeftNormedProduct
for which it is defined.
I don't think that someone would use Product
in a situation where the order of multiplication is important.
Therefore, I would just change the code (lib/coll.gi (from line 1252)) to the divide-and-conquer version and maybe make the documentation of Product
slightly more precise:
Change "returns the product of the elements of" to "returns the associative product of the elements of".
By the way, for another variant of the code see PageSource(Factorial);
which also got a significant speedup when I changed it to use divide-and-conquer.
For lists of permutations or matrices over finite fields, for instance, divide-and-conquer doesn't gain anything anyway.
What exactly does make the difference? Either way, there will be (n-1) multiplications for the n long list. However, with the way @ChrisJefferson proposes the depth would become logn, which could make a big difference for hpc-gap.
I have to admit, I do not entirely understand where the speed gain is from in the particular example. Nevertheless, I am all for implementing the new method. I probably would even go that far as to say that Product should be associative from now on, and maybe introduce a NonAssociativeProduct, or something similar?
@hungaborhorvath : The gain comes from the size of the intermediate objects.
Multiplying n elements of the free group together produces a string of length n, so it is a good idea to try to avoid producing those. Multiplying n elements over m points together produces another permutation on m points, so there would be no advantage (it would probably be slightly slower, from book-keeping, but mostly be the same speed I suspect. I should try it!)
@ChrisJefferson Ah, yes, with simple multiplication, the sum of the sizes is O(n^2), while with the other method only O(nlogn).
Anyway, I think that for parallel multiplication this should have significant speed boost for arbitrary associative products, not only for these special situations, so I am strongly for it.
At least for lists of integers (or rationals, or ...) the gain comes from more efficient single multiplications (try to always multiply numbers of roughly the same size). For the example above in a free group it is less obvious and must depend on the implementation of the arithmetic of associative words.
Of course, if all single multiplications have about the same cost, then the proposed change just introduces overhead. E.g.,
gap> l := [1..10000000]*Z(3);;
gap> Product(l);; time;
344
gap> prod(l);; time;
1556
gap> prd(l);; time;
400
where prod
is the function from the first post and prd
is the following function, copied from the code of Factorial
:
prd := function(l)
local pr;
pr := function(l, i, j)
local bound, len, res, l2, k;
bound := 30;
len := j+1-i;
if len < bound then
res := 1;
for k in [i..j] do
res := res*l[k];
od;
return res;
fi;
l2 := QuoInt(len,2);
return pr(l,i,i+l2)*pr(l,i+l2+1,j);
end;
return pr(l, 1, Length(l));
end;
I think this is a prototypical example for method selection. For some objects, for which multiplication `builds up'', a divide-and-conquer approach is faster. For others it just causes overhead. Thus the solution is to keep the generic method but have an extra have a method for
Productunder the filters (e.g.)
IsAssocWordCollection(free group elements) or
IsElementOfFpGroupCollection` (fp group elements).
The only problem with method selection is plain lists with mutable entries. These cannot remember their type, so the whole list has to be traversed every time you call a method on it. Product and a few other functions have a specific short-cut to make sure this doesn't happen, and this code would naturally live in the short-cut.
I think I might add this as a new method, and then users who want can use it.
At the moment, product is defined in the "obvious way", repeatedly multiplying along the list. This can be very slow in some cases, for example:
If we use a divide-and-conquer technique, using the fact * is commutative, performance is much improved. Here is a simple implementation.
My main thought is, is it reasonable, and possible, to add this behaviour to
Product
? I'm tempted to try adding it and see what breaks. Does anyone have any good examples of non-associative products?