sagemath / sage

Main repository of SageMath
https://www.sagemath.org

Use Cython directive binding=True to get signatures for cython methods #26254

Open kwankyu opened 6 years ago

kwankyu commented 6 years ago

When help is requested for a cythonized built-in method, the signature of the method is not shown, unlike for normal Python methods. For example,

sage: a=17 
sage: a.quo_rem?

Docstring:     
   Returns the quotient and the remainder of self divided by other.
   Note that the remainder returned is always either zero or of the
   same sign as other.

   INPUT:

   * "other" - the divisor

   OUTPUT:

   * "q" - the quotient of self/other

   * "r" - the remainder of self/other

   EXAMPLES:

      sage: z = Integer(231)
      sage: z.quo_rem(2)
      (115, 1)
...

To fix this, we set the Cython directive binding=True. This gives signatures for Cython methods, at the cost of a slight performance degradation due to the increased overhead of calling them.

Related tickets: #19100, #20860, #18192

Depends on #32509 Depends on #33864

CC: @jdemeyer @tscrim @mkoeppe

Component: user interface

Author: Kwankyu Lee, Tobias Diez

Branch/Commit: public/26254 @ 326f19c

Reviewer: Tobias Diez, ...

Issue created by migration from https://trac.sagemath.org/ticket/26254

kwankyu commented 5 years ago
comment:1

It seems this file

https://github.com/ipython/ipython/blob/master/IPython/core/oinspect.py

is responsible for this issue. For me, it would take some time to scrutinize what this does though.

kwankyu commented 5 years ago
comment:2

A fix is to redefine IPython.core.oinspect.getargspec to use sage.misc.sageinspect.sage_getargspec

kwankyu commented 5 years ago
comment:3

Replying to @kwankyu:

A fix is to redefine IPython.core.oinspect.getargspec to use sage.misc.sageinspect.sage_getargspec

This is already done in sage.repl.ipython_extension.init_inspector. But apparently with no effect, strangely.

kwankyu commented 5 years ago
comment:4

It turns out that the problem is with IPython.core.oinspect.inspector._get_def, which calls Python's inspect.signature via IPython.utils.signatures module.

This problem has nothing to do with sage_getargspec.
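
A minimal sketch of the lookup described above, using only the standard library. The helper name signature_or_none and the pure-Python quo_rem stand-in are hypothetical; they only illustrate that a help display has nothing to show when inspect.signature cannot produce a signature.

```python
import inspect


def signature_or_none(obj):
    """Return str(inspect.signature(obj)), or None if no signature is available."""
    try:
        return str(inspect.signature(obj))
    except (TypeError, ValueError):
        # ValueError: no signature could be found for the callable.
        # TypeError: the object is not a supported callable.
        return None


def quo_rem(self, other):
    """Pure-Python stand-in for a method (hypothetical, not Sage's code)."""
    return divmod(self, other)


print(signature_or_none(quo_rem))
print(signature_or_none(42))
```

A front end built on such a lookup can only print a signature line when the helper returns a string.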

kwankyu commented 5 years ago
comment:5

We may just wait for a future Sage based on Python 3, with inspect.signature supporting Cython.

jhpalmieri commented 5 years ago
comment:7

I don't see the signature in a Python 3 build of Sage, either.

kwankyu commented 5 years ago
comment:8

Replying to @jhpalmieri:

I don't see the signature in a Python 3 build of Sage, either.

Right.

I don't remember exactly what I meant in my last comment. Perhaps I expected Cython to someday support the new signature module shipped with Python 3. Now I don't have any clear idea of what should be done on which side.

kwankyu commented 5 years ago

New commits:

0c78cdf Turn on cython directive binding

kwankyu commented 5 years ago

Commit: 0c78cdf

kwankyu commented 5 years ago

Branch: public/26254

kwankyu commented 5 years ago
comment:10

Based on this discussion:

https://stackoverflow.com/questions/46033277/how-to-introspect-a-function-defined-in-a-cython-c-extension-module

and consulting:

https://cython.readthedocs.io/en/latest/src/userguide/source_files_and_compilation.html#compiler-directives

I made the last commit. Please checkout and try.
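
For readers unfamiliar with the directive, here is a hedged sketch of how binding can be switched on for a whole build through cythonize's compiler_directives option. The function name build_extensions and the dictionary name are hypothetical, and this is not necessarily how the commit on this ticket does it.

```python
# Directives passed to every module in the build; only "binding" matters here.
COMPILER_DIRECTIVES = {"binding": True}


def build_extensions(sources):
    """Cythonize the given sources with binding enabled (hypothetical helper)."""
    # Imported lazily so that this sketch can be loaded without Cython installed.
    from Cython.Build import cythonize

    return cythonize(sources, compiler_directives=COMPILER_DIRECTIVES)
```

The same directive can also be set for a single file by putting the comment `# cython: binding=True` at the top of the .pyx source.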

kwankyu commented 5 years ago

Author: Kwankyu Lee

dimpase commented 5 years ago
comment:13

With your patch I get a bunch of doctest errors, of the kind

sage -t --warn-long 55.8 src/sage/graphs/strongly_regular_db.pyx
**********************************************************************
File "src/sage/graphs/strongly_regular_db.pyx", line 1156, in sage.graphs.strongly_regular_db.is_RSHCD
Failed example:
    t = is_RSHCD(64,27,10,12); t
Expected:
    [<built-in function SRG_from_RSHCD>, 64, 27, 10, 12]
Got:
    [<cyfunction SRG_from_RSHCD at 0x7f8616d09890>, 64, 27, 10, 12]
**********************************************************************
sage -t --warn-long 55.8 src/sage/misc/latex.py
**********************************************************************
File "src/sage/misc/latex.py", line 561, in sage.misc.latex.has_latex_attr
Failed example:
    T._latex_()
Expected:
    Traceback (most recent call last):
    ...
    TypeError: descriptor '_latex_' of 'sage.matrix.matrix0.Matrix' object needs an argument
Got:
    <BLANKLINE>
    Traceback (most recent call last):
      File "/home/scratch2/dimpase/sage/sage/local/lib/python2.7/site-packages/sage/doctest/forker.py", line 681, in _run
        self.compile_and_execute(example, compiler, test.globs)
      File "/home/scratch2/dimpase/sage/sage/local/lib/python2.7/site-packages/sage/doctest/forker.py", line 1105, in compile_and_execute
        exec(compiled, globs)
      File "<doctest sage.misc.latex.has_latex_attr[5]>", line 1, in <module>
        T._latex_()
    TypeError: unbound method cython_function_or_method object must be called with Matrix_integer_dense instance as first argument (got nothing instead)

--- this is of course not a surprise, but it needs to be fixed on this ticket.

Otherwise, I like it - e.g. notice how much more informative the error messages are.
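
One possible way to make such expected outputs robust against memory addresses is an ellipsis in the doctest. The sketch below uses the standard doctest module and a plain Python function as a stand-in; the names helper and EXAMPLE are hypothetical, and the actual doctest fixes on this ticket may look different.

```python
import doctest


def helper():
    """Stand-in for a function whose repr contains a memory address."""
    return None


# The "0x..." in the expected output matches any address under ELLIPSIS.
EXAMPLE = """
>>> [helper, 64, 27]
[<function helper at 0x...>, 64, 27]
"""

parser = doctest.DocTestParser()
test = parser.get_doctest(EXAMPLE, {"helper": helper}, "ellipsis_demo", None, 0)
runner = doctest.DocTestRunner(verbose=False, optionflags=doctest.ELLIPSIS)
results = runner.run(test)
print(results)
```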

nbruin commented 5 years ago
comment:14

This comes with the penalty of producing a wrapped object every time a method is accessed on a cython object. I suspect cythonized access avoids that, so it may be that in most scenarios this doesn't come with a performance penalty, but one should check carefully that sage doesn't rely on the situations where it does.

Also, there is a reason why cython.binding==False by default: that's the behaviour built-in methods exhibit: [].insert returns a built-in method insert of list object ... rather than a bound method, and cython by default does the same. If you want more informative tracebacks, wouldn't it be better to solve it in such a way that straight-up CPython (and its C extension classes; of which cython is a special case) also benefit?
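
The distinction can be seen in plain CPython. This sketch uses a hypothetical class Pure as a stand-in; it shows the two method types and that each attribute access on a Python-level method creates a fresh bound-method object. It does not measure Cython itself.

```python
import types


class Pure:
    """Hypothetical pure-Python class used only for comparison."""

    def method(self):
        return None


p = Pure()
builtin = [].insert   # method implemented in C
bound = p.method      # method implemented in Python

print(type(builtin).__name__)
print(type(bound).__name__)
print(isinstance(builtin, types.BuiltinMethodType))
print(isinstance(bound, types.MethodType))

# Two separate lookups give two distinct bound-method objects.
print(p.method is p.method)
```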

kwankyu commented 5 years ago
comment:15

Replying to @nbruin:

This comes with the penalty of producing a wrapped object every time a method is accessed on a cython object. I suspect cythonized access avoids that, so it may be that in most scenarios this doesn't come with a performance penalty, but one should check carefully that sage doesn't rely on the situations where it does.

Also, there is a reason why cython.binding==False by default: that's the behaviour built-in methods exhibit: [].insert returns a built-in method insert of list object ... rather than a bound method, and cython by default does the same.

[].insert? shows correct signature. So built-in methods can behave well with respect to introspection. Then why cythonized built-in methods do not? How can we make cythonized built-in methods behave well like standard built-in methods of python?

kwankyu commented 5 years ago
comment:16

[].insert? shows correct signature. So built-in methods can behave well with respect to introspection. Then why cythonized built-in methods do not? How can we make cythonized built-in methods behave well like standard built-in methods of python?

An answer can be found here:

https://stackoverflow.com/questions/1104823/python-c-extension-method-signatures-for-documentation/1104893

and

https://docs.python.org/3/howto/clinic.html

Now it seems to me that cython should do a better job in making cythonized built-ins more introspectable.
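
As a small illustration of the mechanism behind those links: on recent CPython versions, built-ins converted with Argument Clinic carry a __text_signature__ attribute that inspect.signature can parse. The exact strings printed depend on the Python version, so none are asserted here.

```python
import inspect

# Signature text generated by Argument Clinic for the C implementation.
text_sig = list.insert.__text_signature__
print(text_sig)

# inspect.signature parses that text for the bound built-in method.
sig = inspect.signature([].insert)
print(sig)
print(list(sig.parameters))
```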

kwankyu commented 5 years ago
comment:17

To summarize the current situation, there are two options:

Option 1: We accept the current patch, which turns on cython directive "binding=True" so that all cythonized methods become bound methods that already support the inspect.signature module well. If we take this path, then there is nothing for us to do except fixing a few doctests.

Option 2: We wait for upstream cython fixes that will make all cythonized built-in methods properly support the inspect.signature module. This is the path that standard built-in methods follow. We don't know when the upstream fix would be available.

Please give your preference and why.

dimpase commented 5 years ago
comment:19

To go with option 1, we need benchmarking results on whether it affects the performance a lot.

kwankyu commented 5 years ago
comment:20

Replying to @dimpase:

To go with option 1, we need benchmarking results on whether it affects the performance a lot.

If it affects the runtime performance at all in any aspect other than introspection, then option 1 should be discarded. I think this should be decided not by experiments but by analysis of how Python and Cython work.

dimpase commented 5 years ago
comment:21

We can configure this in build time, to begin with. It is helpful for debugging - I would not care about a 5% or 15% performance hit, if error messages made more sense.

simon-king-jena commented 5 years ago
comment:22

Replying to @dimpase:

We can configure this in build time, to begin with. It is helpful for debugging - I would not care about a 5% or 15% performance hit, if error messages made more sense.

I would.

embray commented 5 years ago
comment:23

This is what Jeroen has been working on for like, literally the last year, perhaps longer :)

Yes, the solution is to use binding=True to enable use of cyfunctions. However, using cyfunctions across the board can introduce a significant performance penalty in many cases, as the Python interpreter has some built-in optimizations for built-in functions that don't work for cyfunctions.

Jeroen has been fighting for a series of PEPs that would overhaul Python's function type hierarchy in such a way that the basic function type can be extended (e.g. as with Cython's cyfunction) while still keeping those optimizations working.

So while this seems like it should be an easy problem to solve, it's completely non-trivial.

Point being, let's not duplicate effort here.

kwankyu commented 5 years ago
comment:24

Replying to @embray:

Point being, let's not duplicate effort here.

Thanks for the expert advice.

embray commented 4 years ago
comment:25

Ticket retargeted after milestone closed

mkoeppe commented 4 years ago
comment:26

Moving tickets to milestone sage-9.2 based on a review of last modification date, branch status, and severity.

kwankyu commented 4 years ago
comment:27

Replying to @embray:

This is what Jeroen has been working on for like, literally the last year, perhaps longer :)

Jeroen has been fighting for a series of PEPs that would overhaul Python's function type hierarchy in such a way that the basic function type can be extended (e.g. as with Cython's cyfunction) while still keeping those optimizations working.

I searched for these PEPs, and reached to

I am curious if and how these PEPs would eventually solve the problem of this ticket. I only guess that after the PEPs make it into CPython, Cython is updated to use the new CPython features, and then the signature issue in Sage is automatically fixed. Am I right?

mkoeppe commented 4 years ago
comment:28

It would be interesting to know whether the upcoming Cython 3 (#29863) has improvements in this direction

embray commented 4 years ago
comment:29

Replying to @kwankyu:

Replying to @embray:

This is what Jeroen has been working on for like, literally the last year, perhaps longer :)

Jeroen has been fighting for a series of PEPs that would overhaul Python's function type hierarchy in such a way that the basic function type can be extended (e.g. as with Cython's cyfunction) while still keeping those optimizations working.

I searched for these PEPs, and reached to

I am curious if and how these PEPs would eventually solve the problem of this ticket. I only guess that after the PEPs make it into CPython, Cython is updated to use the new CPython features, and then the signature issue in Sage is automatically fixed. Am I right?

That's correct--this would allow us to use Cython's own function subclass, which includes support for better signature documentation among other things, without losing any performance.

mkoeppe commented 3 years ago
comment:31

Setting new milestone based on a cursory review of ticket status, priority, and last modification date.

mkoeppe commented 3 years ago
comment:32

Setting a new milestone for this ticket based on a cursory review.

mkoeppe commented 3 years ago
comment:34

binding=True will be the default in Cython 3 (https://cython.readthedocs.io/en/latest/src/userguide/source_files_and_compilation.html#compiler-directives), so eventually we will make this switch anyway; so why not now.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

f36275a Turn on cython directive binding

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Changed commit from 0c78cdf to f36275a

mkoeppe commented 3 years ago
comment:37

Rebased on 9.5.beta0

mkoeppe commented 3 years ago
comment:38

I have set it to "needs review" so that the patchbot runs on it.

mkoeppe commented 3 years ago
comment:39

The old ticket #22747 attempted to use binding as well

jhpalmieri commented 3 years ago
comment:40

Are there performance penalties for doing this, perhaps that Cython 3 is going to address?

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Changed commit from f36275a to 36ec493

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 3 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

36ec493 Reformat the comment

kwankyu commented 3 years ago
comment:42

Replying to @jhpalmieri:

Are there performance penalties for doing this...?

How can we see the performance penalty?

jhpalmieri commented 3 years ago
comment:43

Replying to @kwankyu:

Replying to @jhpalmieri:

Are there performance penalties for doing this...?

How can we see the performance penalty?

Try builds with and without and compare some timings?

kwankyu commented 3 years ago
comment:44

Replying to @jhpalmieri:

Replying to @kwankyu:

Replying to @jhpalmieri:

Are there performance penalties for doing this...?

How can we see the performance penalty?

Try builds with and without and compare some timings?

I tried a very simple script like: timeit('a=17;a.quo_rem(5); del a'), and found no difference.

I wonder what is a proper way to see the difference...
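
One hedged suggestion is to keep object creation out of the timed statement and to time the attribute lookup separately from the call. The sketch below uses the standard timeit module with a plain Python int as a stand-in; the helper name best_time_per_call is hypothetical. In a Sage session the setup would presumably be something like a = Integer(17) and the statement a.quo_rem(5).

```python
import timeit


def best_time_per_call(stmt, setup="pass", number=100_000, repeat=5):
    """Return the best observed time per execution of stmt, in seconds."""
    timings = timeit.repeat(stmt, setup=setup, number=number, repeat=repeat)
    # The minimum is the run least disturbed by other activity on the machine.
    return min(timings) / number


# Stand-in statements: method lookup alone, then lookup plus call.
lookup_only = best_time_per_call("a.bit_length", setup="a = 17")
lookup_and_call = best_time_per_call("a.bit_length()", setup="a = 17")

print(f"lookup only:     {lookup_only * 1e9:.1f} ns")
print(f"lookup and call: {lookup_and_call * 1e9:.1f} ns")
```

Running the same statements in a build with and a build without the directive would show whether the lookup, the call, or neither accounts for a difference.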

mkoeppe commented 3 years ago

Dependencies: #32509

jhpalmieri commented 3 years ago
comment:46

Replying to @kwankyu:

Replying to @jhpalmieri:

Replying to @kwankyu:

Replying to @jhpalmieri:

Are there performance penalties for doing this...?

How can we see the performance penalty?

Try builds with and without and compare some timings?

I tried a very simple script like: timeit('a=17;a.quo_rem(5); del a'), and find no difference.

I wonder what is a proper way to see the difference...

I ran ./sage -t --long src/sage/matrix/*.pyx a few times:

Develop: average time 83.3 seconds.

This ticket: average time 93.1 seconds.

I also tried ./sage -t --long src/sage/matrix/matrix_gfpn_dense.pyx a few times:

Develop: average time 28.4 seconds

This ticket: average time 37.0 seconds

jhpalmieri commented 3 years ago
comment:47

Some other files in matrix showed very little difference, so maybe the slowdown only occurs in certain types of operations.
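
For reference, the averages reported in comment:46 work out to a slowdown of roughly 12% for the whole matrix directory and roughly 30% for matrix_gfpn_dense.pyx. A short computation from those four numbers:

```python
# Average doctest times in seconds, (develop, this ticket), from comment:46.
timings = {
    "src/sage/matrix/*.pyx": (83.3, 93.1),
    "src/sage/matrix/matrix_gfpn_dense.pyx": (28.4, 37.0),
}


def slowdown_percent(develop, ticket):
    """Relative increase of the ticket time over the develop time, in percent."""
    return (ticket / develop - 1.0) * 100.0


for target, (develop, ticket) in timings.items():
    print(f"{target}: {slowdown_percent(develop, ticket):.1f}% slower")
```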

nbruin commented 2 years ago
comment:48

From:

https://cython.readthedocs.io/en/latest/src/userguide/source_files_and_compilation.html

... When enabled, functions will bind to an instance when looked up as a class attribute

I don't know what triggers the binding behaviour, but I imagine there may be a code path that runs into this and perhaps ends up not binding anyway (thus creating overhead) or ends up binding in a way that was previously done in a more efficient way (cached perhaps?)

The timings above show the impact can be quite significant: I think too high a penalty to incur in general. Note that the documentation also says:

Changed in version 3.0.0: Default changed from False to True

so figuring out what's causing the slowdown is a prereq to upgrading to 3.0.0 (once that finally is released). If there's a particular scenario where it's bad to have binding, we might just be able to turn it off in those cases.

tobiasdiez commented 2 years ago
comment:50

What's the status here? The performance issues don't seem to be too bad, especially since they apparently only affect certain functions/modules.

binding=True is also required for #30884, since the decorator library internally uses inspection on the decorated function. So if one wants to decorate Cython functions, they have to be "bound".
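
To illustrate why a decorator may need an introspectable function, here is a sketch using only the standard library, not the decorator library itself. The decorator name checked is hypothetical; it reads the signature at decoration time, which fails for callables that expose none.

```python
import functools
import inspect


def checked(func):
    """Decorator that validates calls against func's signature (hypothetical)."""
    # Raises ValueError at decoration time if func exposes no signature.
    sig = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # bind() raises TypeError if the arguments do not fit the signature.
        bound = sig.bind(*args, **kwargs)
        return func(*bound.args, **bound.kwargs)

    return wrapper


@checked
def quo_rem(a, b):
    """Pure-Python stand-in for a decorated function."""
    return divmod(a, b)


print(quo_rem(17, 5))
print(inspect.signature(quo_rem))
```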

kwankyu commented 2 years ago
comment:51

Replying to @tobiasdiez:

What's the status here? The performance issues don't seem to be too bad, especially since they apparently only affect certain functions/modules.

I agree with comment:48. The penalty seems significant to me. We need to know how much damage we would get and where, and to see if there is a way to reduce the damage.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 2 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

52b5089 Merge remote-tracking branch 'origin/develop' into public/26254

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 2 years ago

Changed commit from 36ec493 to 52b5089