sagemath / sage

Main repository of SageMath
https://www.sagemath.org
Other
1.39k stars 472 forks source link

Some optimizations for addition in combinatorial free modules and dict_* methods #20680

Closed tscrim closed 7 years ago

tscrim commented 8 years ago

We improve the speed of methods like dict_addition in order to improve the speed of addition in CombinatorialFreeModule.

CC: @sagetrac-sage-combinat @nthiery

Component: performance

Keywords: combinatorial free module, addition, days79

Author: Travis Scrimshaw, Nicolas M. Thiéry

Branch/Commit: 95aacfa

Reviewer: Nicolas M. Thiéry, Jeroen Demeyer

Issue created by migration from https://trac.sagemath.org/ticket/20680

jdemeyer commented 7 years ago
comment:40

Replying to @tscrim:

At least, I could not get __radd__ on a Cython class

http://cython.readthedocs.io/en/latest/src/userguide/special_methods.html#arithmetic-methods

tscrim commented 7 years ago
comment:41

Replying to @jdemeyer:

Replying to @tscrim:

Replying to @jdemeyer:

Regarding [comment:27], I think the phrase "values are zero after the addition has been performed" is still not clear. Do you mean values such that a * x is non-zero and y is non-zero but a * x + y is zero? If so, what is the rationale for keeping those zeros but not the cases where y is zero and a * x is zero?

I think you're misparsing something.

If I am misparsing something, it probably means that the documentation wasn't clear. My question remains: if remove_zeros=False, exactly which keys will appear with a zero value?

All of them. The point is that if it is true, then remove them.

Edit: That claim might be somewhat vague. If there is a key with a value of 0, then it stays.

tscrim commented 7 years ago
comment:42

Replying to @jdemeyer:

Replying to @tscrim:

At least, I could not get __radd__ on a Cython class

http://cython.readthedocs.io/en/latest/src/userguide/special_methods.html#arithmetic-methods

So I was trying to add an element and a list. I implemented an add that looked like this:

def __add__(self, other):
    if isinstance(other, list):
        return self + self.parent()(list[0])
    return super(self, Foo).__add__(other)

What I saw happen when it was list + Foo was __add__ was being called with self being list. If it got to __radd__, then it would have been after this call, but I was getting an error trying to make the super call that wasn't getting caught.

tscrim commented 7 years ago
comment:43

Replying to @jdemeyer:

Replying to @tscrim:

Actually, slightly radical proposal: How about removing this altogether as a separate spkg since it is independent of Sage?

Even if you do that, you could still develop it within Sage and then split it off.

Nicolas wasn't in favor of doing this as it is just one file.

jdemeyer commented 7 years ago
comment:44

Replying to @tscrim:

So I was trying to add an element and a list. I implemented an add that looked like this:

def __add__(self, other):
    if isinstance(other, list):
        return self + self.parent()(list[0])
    return super(self, Foo).__add__(other)

What I saw happen when it was list + Foo was __add__ was being called with self being list.

Right, that is how __add__ works in a cdef class. For this reason, I recommend against using the name self in this case. I prefer

def __add__(left, right):

which shows the more symmetric nature of the arguments.

If it got to __radd__

What do you mean? There is no __radd__ for a cdef class.

I was getting an error trying to make the super call that wasn't getting caught.

Logically, if self is not an instance of Foo, then super(self, Foo) would be an error.

jdemeyer commented 7 years ago
comment:45

Replying to @tscrim:

Nicolas wasn't in favor of doing this as it is just one file.

Makes sense.

tscrim commented 7 years ago
comment:46

Replying to @jdemeyer:

Replying to @tscrim:

So I was trying to add an element and a list. I implemented an add that looked like this:

def __add__(self, other):
    if isinstance(other, list):
        return self + self.parent()(list[0])
    return super(self, Foo).__add__(other)

What I saw happen when it was list + Foo was __add__ was being called with self being list.

Right, that is how __add__ works in a cdef class. For this reason, I recommend against using the name self in this case. I prefer ...

Ah, sorry, I didn't read the link you sent and misunderstood what you were trying to say. However, that is my point, we would have to check for this in __mul__, which takes a few extra cycles. If instead we use a function that has a fixed signature (or at least, semantically), then we don't loose those cycles.

Does comment:41 help with the remove_zeros?

tscrim commented 7 years ago

Changed keywords from combinatorial free module, addition to combinatorial free module, addition, days79

tscrim commented 7 years ago

Changed reviewer from Nicolas M. Thiéry to Nicolas M. Thiéry, Jeroen Demeyer

nthiery commented 7 years ago
comment:47

If remove_zeros=True (and the input have no zero values themselves), then the output is guaranteed to have no zero values. Otherwise the some zero value may be left in the output. Which ones is voluntarily left undefined.

The point is that clearing zero values has a cost and there are cases where the user want to postpone the clearing until the end of a long series of operations.

Jeroen: does the above clarify the remove_zeros option? If yes, I'll add it to the documentation.

nthiery commented 7 years ago
comment:48

Hi Jeroen,

Thanks for the feedback!

The current design (compared to a class with methods) has two advantages for such a low level use case:

(and you know that I am a fan of OO :-) )

nthiery commented 7 years ago
comment:49

Travis: for some reason, 9570c65ca4bb27fe8ee3dbee82b6ee42c4acb108 includes unrelated changes in finite_monoids; could you revert those?

jdemeyer commented 7 years ago
comment:50

Replying to @tscrim:

Ah, sorry, I didn't read the link you sent and misunderstood what you were trying to say. However, that is my point, we would have to check for this in __mul__, which takes a few extra cycles. If instead we use a function that has a fixed signature (or at least, semantically), then we don't loose those cycles.

This is an interesting (but wrong) argument. It shows that there are a lot of misconceptions about efficiency in !Python/Cython code.

What is correct:

(1) Arithmetic operators (like + implemented as __add__) are fast.

(2) isinstance() in Cython is fast.

(3) Python function calls are slow.

In turns out that (1) + (2) is actually faster than (3).

jdemeyer commented 7 years ago
comment:51

Replying to @nthiery:

If remove_zeros=True (and the input have no zero values themselves), then the output is guaranteed to have no zero values. Otherwise the some zero value may be left in the output. Which ones is voluntarily left undefined.

The point is that clearing zero values has a cost and there are cases where the user want to postpone the clearing until the end of a long series of operations.

Jeroen: does the above clarify the remove_zeros option? If yes, I'll add it to the documentation.

Absolutely (much better than Travis's explanation; sorry Travis).

jdemeyer commented 7 years ago
comment:52

Regarding efficiency, let met add this: the module blas_dict seems to be used only from Python (not Cython) code, so these efficiency considerations are probably irrelevant anyway.

tscrim commented 7 years ago
comment:53

Replying to @jdemeyer:

Regarding efficiency, let met add this: the module blas_dict seems to be used only from Python (not Cython) code, so these efficiency considerations are probably irrelevant anyway.

Nicolas and I are planning to move CombinatorialFreeModuleElement to a Cython class, so I would think it would be C function calls in that case?

jdemeyer commented 7 years ago
comment:54

First of all, I am not going to insist on implementing this as a subclass. You know better than me how this code is actually used.

Still, some comments:

Replying to @nthiery:

The current design (compared to a class with methods) has two advantages for such a low level use case:

  • following a widely used pattern for basic linear algebra operations (used even in OO languages)

I believe the standard implementation languages of BLAS are Fortran and C and those are not OO languages.

  • not requiring to coerce / cast inputs when they are provided as plain dictionaries;

Sure, this is indeed a potential disadvantage (but also the only one). Wouldn't it work to just keep the attributes __monomials or _monomial_coefficients an instance of that type?

the same holds for output

Not really for output since it would be a dict subclass. So anything expecting a dict would still work.

jdemeyer commented 7 years ago
comment:55

Replying to @tscrim:

Nicolas and I are planning to move CombinatorialFreeModuleElement to a Cython class, so I would think it would be C function calls in that case?

Yes, provided that you cimport (not import) the functions. This requires you to add a .pxd file. I guess you should do that on this ticket.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Changed commit from 9570c65 to 93c03b2

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

93c03b2Added pxd file and updating documentation.
tscrim commented 7 years ago
comment:57

I forgot to add the .pxd file; thanks for the reminder. I also added Nicolas' explanation of remove_zeros to the documentation.

jdemeyer commented 7 years ago
comment:58

If you care about efficiency, then iaxpy should have the prototype

cpdef int iaxpy(....) except -1

Returning an int is faster than returning the Python None.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

3fce47eChanging return type of iaxpy and some small doc tweaks.
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Changed commit from 93c03b2 to 3fce47e

tscrim commented 7 years ago
comment:60

Replying to @jdemeyer:

If you care about efficiency, then iaxpy should have the prototype

cpdef int iaxpy(....) except -1

Returning an int is faster than returning the Python None.

Done. This resulted in a 2 microsecond speedup for the n=1000 benchmark for me.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Changed commit from 3fce47e to be360a3

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

f0996ebMerge branch 'public/combinat/improve_dict_addition-20680' of trac.sagemath.org:sage into public/combinat/improve_dict_addition-20680
be360a3Fixing one of the doctest failures.
tscrim commented 7 years ago
comment:62

So there are two doctest failures:

The failure in categories/finite_dimensional_algebras_with_basis.py is reordering, and I've fixed it.

The one in misc/dev_tools.py is essentially trivial, but it requires some thought as to what the correct change should be. I'm thinking we should find something else that has a unique object name (perhaps 'QQ'?). However, I want your opinions.

Once that is fixed, perhaps we can get this into Sage?

nthiery commented 7 years ago
comment:63

Hi Travis,

Thanks for checking this out. +1 on the first change. For the second, I agree that it's good enough to replace sum by something else with a unique name. The only thing is that this test is about exercising import_statement for objects that are defined as aliases. So we need to find some object like this (haven't found one yet).

Then, yes, let's get this in Sage!

tscrim commented 7 years ago
comment:64

Replying to @nthiery:

For the second, I agree that it's good enough to replace sum by something else with a unique name. The only thing is that this test is about exercising import_statement for objects that are defined as aliases. So we need to find some object like this (haven't found one yet).

I haven't been able to find one either. We can do something slightly artificial like importing something in the doctest as an alias and checking against that. Should we go that route?

nthiery commented 7 years ago
comment:65

Or maybe just remove the test: the feature already has test which seem equivalent, and if we can't find another example, it's not a critical feature anyway.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

9df803cRemoving the other doctest failure.
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Changed commit from be360a3 to 9df803c

tscrim commented 7 years ago
comment:67

Okay, I've removed it. So if there are no other objections, then I believe we can set this to a positive review. Next up will be to cythonize CFMElement...

tscrim commented 7 years ago

Changed author from Travis Scrimshaw to Travis Scrimshaw, Nicolas M. Thiéry

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

95aacfa20680: utf-8 fix
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Changed commit from 9df803c to 95aacfa

nthiery commented 7 years ago
comment:69

Hi,

I checked the patchbot reports. I fixed the non-ascii issue. The startup-module report is normal. There are two python3 issue: one in a speed-critical section which we had discussed and decided to be a necessary evil. The other is a false positive (existing cmp in a line of code that was respaced).

Hence positive review!

jdemeyer commented 7 years ago
comment:71

Replying to @nthiery:

There are two python3 issue: one in a speed-critical section which we had discussed and decided to be a necessary evil.

That's a non-issue. Cython != Python and Cython does support .iteritems() for dicts.

vbraun commented 7 years ago

Changed branch from public/combinat/improve_dict_addition-20680 to 95aacfa