Closed arihantparsoya closed 7 years ago
I have a couple of ideas. First, if you recreate the ManyToOneReplacer
every time you want to use it, it is not going to be faster than just using replace_all
The setup time for creating all the states is much higher than just matching without it. It only pays off if you match multiple times or if you serialize the matcher and restore it. Otherwise the construction cost dominates the overall time.
A second thing are your constraints. To get the most speed out of them, you would need to split them up and turn them into individual constraints, i.e. get rid of the And
inside. Then each constraint can be evaluated seperately. As an example, the constraint of the first pattern is
cons(And(FreeQ(List(a_, b_, c_, d_), x_), ZeroQ(Add(Mul(a_, d_), Mul(b_, c_))), PosQ(Mul(Pow(a_, Integer(-1)), b_, Pow(c_, Integer(-1)), d_))), (c, b, x, d, a))
Instead, you should try to keep them to as few variables as possible, so that they can be evaluated early:
FreeQ(a_, x_)
FreeQ(b_, x_)
FreeQ(c_, x_)
FreeQ(d_, x_)
ZeroQ(Add(Mul(a_, d_), Mul(b_, c_)))
PosQ(Mul(Pow(a_, Integer(-1)), b_, Pow(c_, Integer(-1)), d_)))
For FreeQ
I think you don't need sympy to evaluate it. You can basically just use something like this:
class FreeQ(Constraint):
def __init__(self, expr, var):
self.expr = expr if isinstance(expr, str) else expr.name
self.var = var if isinstance(var, str) else var.name
def __call__(self, substitution):
return substitution[self.var] not in substitution[self.expr]
@property
def variables(self):
return frozenset([self.expr, self.var])
def with_renamed_vars(self, renaming):
return FreeQ(
renaming.get(self.expr, self.expr),
renaming.get(self.var, self.var)
)
def __eq__(self, other):
return isinstance(other, FreeQ) and other.expr == self.expr and other.var == self.var
def __hash__(self):
return hash((self.expr, self.var))
I did not test this though. The first point is the most important one though. You only get a speedup, if you amortize the setup cost of the ManyToOneMatcher
.
I changed the code to stop recreating ManyToOneReplacer
multiple times and the speed is good now.
I knew that its better to have indivdual constraint rather than having a common cons()
for all constraints since it helps ManyToOneReplacer
to backtrack early. But I couldn't figure out how to implement logical statements such as And( , Or())
in MatchPy Constraints. I will definately try to seperate out FreeQ
the way you described.
Thanks !
Hi, while running the test suit of rubi(which consists of ~1700 integrands). The speed of matching decreases while going through the test suit. Sometimes, the test suit gets stuck for long time.
I have seperated FreeQ
as you suggested. If I solve the integrands(which take more time in the test suit) indivdually, they run fine.
I have not tried profiling. I will try now.
The code is in my rubi4 SymPy branch.
Test suit: https://github.com/parsoyaarihant/sympy/blob/rubi4/sympy/rubi/tests/test_rubi_test_suit.py all ReplacementRule: https://github.com/parsoyaarihant/sympy/blob/rubi4/sympy/rubi/patterns.py
The tests will start running if you run bin/test sympy/rubi/tests/test_rubi_test_suit.py
in the terminal from main sympy directory. For now I have made the tests such that it shows the intergrand which is being integrated(I have also added some documentation here).
PFA .prof
file for first 121 tests.
program.prof.zip
I think there must be problem in my own code. I will let you know as soon as I find it.
Apparently SumSimplerQ
is missing so I cannot run it locally:
File "sympy\rubi\tests\test_rubi_test_suit.py", line 2, in <module>
from sympy.rubi.rubi import rubi_integrate
File "sympy\rubi\rubi.py", line 1, in <module>
from .patterns import rubi_object
File "sympy\rubi\patterns.py", line 5, in <module>
from .constraint import cons, FreeQ
File sympy\rubi\constraint.py", line 4, in <module>
from .matchpy2sympy import matchpy2sympy
File "sympy\rubi\matchpy2sympy.py", line 4, in <module>
from .utility_function import (Int, ZeroQ, NonzeroQ, List, Log, RemoveContent,
ImportError: cannot import name 'SumSimplerQ'
There is a function ExpandIntegrand
which expands the the integrands such as (x + 1)**2
into x**2 + 2*x + 1
. Then each indivdual term is integrated again.
I created two graphs(One with expandintegrand and another without it). I found that ExpandIntegrand
is taking majority of time in matchpy2sympy
.
I have committed SumSimplerQ
just now. You can update your branch
By looking at the profiling you posted, I identified one bottleneck that I have hopefully fixed in the last commit. Would you be willing to try it out before I release a new version? You can check out MatchPy yourself and install it in development mode (via python setup.py develop
). The many-to-one matcher was doing some redundant work for a lot of matches which should only be done now once.
I tried installing MatchPy using Github now but it is giving the following error( I also tried installing it previously and I got the same error):
parsoyaarihant@MacBook-Pro-4 ~/Downloads/matchpy-master python setup.py install
zip_safe flag not set; analyzing archive contents...
Installed /Users/parsoyaarihant/Downloads/matchpy-master/.eggs/pytest_runner-2.11.1-py3.6.egg
Searching for setuptools_scm>=1.7.0
Reading https://pypi.python.org/simple/setuptools_scm/
Downloading https://pypi.python.org/packages/6c/8c/86bb5eb353184a8cfb6488d8f1d286e2028907c6d24fc9e604e6aa40ac8e/setuptools_scm-1.15.6-py3.6.egg#md5=342049ba90a3dde0d6372456fe92884a
Best match: setuptools-scm 1.15.6
Processing setuptools_scm-1.15.6-py3.6.egg
Moving setuptools_scm-1.15.6-py3.6.egg to /Users/parsoyaarihant/Downloads/matchpy-master/.eggs
Installed /Users/parsoyaarihant/Downloads/matchpy-master/.eggs/setuptools_scm-1.15.6-py3.6.egg
Traceback (most recent call last):
File "setup.py", line 45, in <module>
'graphs': ['graphviz>=0.5,<0.6'],
File "/Users/parsoyaarihant/anaconda/lib/python3.6/distutils/core.py", line 108, in setup
_setup_distribution = dist = klass(attrs)
File "/Users/parsoyaarihant/anaconda/lib/python3.6/site-packages/setuptools-27.2.0-py3.6.egg/setuptools/dist.py", line 318, in __init__
File "/Users/parsoyaarihant/anaconda/lib/python3.6/distutils/dist.py", line 281, in __init__
self.finalize_options()
File "/Users/parsoyaarihant/anaconda/lib/python3.6/site-packages/setuptools-27.2.0-py3.6.egg/setuptools/dist.py", line 375, in finalize_options
File "/Users/parsoyaarihant/Downloads/matchpy-master/.eggs/setuptools_scm-1.15.6-py3.6.egg/setuptools_scm/integration.py", line 22, in version_keyword
File "/Users/parsoyaarihant/Downloads/matchpy-master/.eggs/setuptools_scm-1.15.6-py3.6.egg/setuptools_scm/__init__.py", line 119, in get_version
File "/Users/parsoyaarihant/Downloads/matchpy-master/.eggs/setuptools_scm-1.15.6-py3.6.egg/setuptools_scm/__init__.py", line 97, in _do_parse
LookupError: setuptools-scm was unable to detect version for '/Users/parsoyaarihant/Downloads/matchpy-master'.
Make sure you're either building from a fully intact git repository or PyPI tarballs. Most other sources (such as GitHub's tarballs, a git checkout without the .git folder) don't contain the necessary metadata and will not work.
For example, if you're using pip, instead of https://github.com/user/proj/archive/master.zip use git+https://github.com/user/proj.git#egg=proj
Did you do a checkout or just download the source code? You need to do a checkout, otherwise it does not work, because the version is determined based on git tags.
Oh, by the way, I can recommend you to use parametrized tests. Then you have a seperate test case for every example and pytest will track for you whiche ones pass/fail.
Edit: I am not sure whether that will work without pytest, because sympy seems to use its own runner.
I did checkout now and installed it. However, I am unable to import ManyToOneReplacer
.
Python 3.6.0 |Anaconda custom (x86_64)| (default, Dec 23 2016, 13:19:00)
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from matchpy import ManyToOneReplacer
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: cannot import name 'ManyToOneReplacer'
Update: I will fix this issue myself in some time.
SymPy doesn't allow to import external python libraries. That is why it doesn't use py.test. Sympy has similar tests to pytest but I am not using them currently.
ManyToOneReplacer
issue if fixed. I will create a .prof
file and compare it with previous test.
There is still a bottleneck in the ManyToOneMatcher with the bipartite graphs. Changing this is a bit more complicated but should also increase the speed.
Okay, the test suit is not running at good speed(probably due to ExpandIntegrand
) but its not getting stuck like it use to earlier. I will try to optimise ExpandIntegrand
.
I changed the generation of the bipartite graphs so that they should not take as long anymore. This might speed up your test suit a bit more. Now ExpandIntegrand
is indeed the biggest bottleneck when I run it.
Thanks !
Hi, I implemented ~80 rules using
ManyToOneReplacer
for Rubi integration and MatchPy was able to match the subjects very quickly. However, when I added ~550 rules, there was significant decrease in speed while matching even smaller subjects.It would be great if you could advice us on how to solve this issue.
Here are the files for our project:
Patterns: https://github.com/parsoyaarihant/sympy/blob/rubi4/sympy/rubi/patterns.py Constraint: https://github.com/parsoyaarihant/sympy/blob/rubi4/sympy/rubi/constraint.py All Rubi Files: https://github.com/parsoyaarihant/sympy/tree/rubi4/sympy/rubi