This very simple implementation can achieve quite a good hit rate:
```
0.0
0.0
0.0
0.0
0.2
0.16666666666666666
0.2857142857142857
0.375
0.4444444444444444
0.5
0.5454545454545454
0.5833333333333334
0.6153846153846154
0.6428571428571429
0.6666666666666666
0.6875
0.7058823529411765
0.7222222222222222
0.7368421052631579
0.75
0.7619047619047619
0.7727272727272727
0.782608695652174
0.7916666666666666
0.76
0.7307692307692307
0.7037037037037037
0.7142857142857143
0.6896551724137931
0.7
0.6774193548387096
0.6875
0.696969696969697
0.7058823529411765
0.7142857142857143
0.7222222222222222
0.7027027027027027
0.7105263157894737
0.717948717948718
0.725
0.7317073170731707
0.7380952380952381
0.7441860465116279
```
This is shown by running `watch -n 1 cat /tmp/cozy_cache*` at the same time.
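For context, here is a minimal sketch of the kind of per-instance cache with hit-rate reporting described above. The class and attribute names, and the exact `/tmp/cozy_cache_*` file naming, are assumptions rather than the actual PoC code:

```python
import os

class CachedCompare(object):
    """Sketch only: memoize a comparison callable and report the hit rate."""

    def __init__(self, compare_fn):
        self.compare_fn = compare_fn
        self.cache = {}
        self.hits = 0
        self.misses = 0
        # One stats file per cache instance, so `cat /tmp/cozy_cache*` shows all of them.
        self.stats_file = "/tmp/cozy_cache_{}".format(id(self))

    def __call__(self, e1, e2):
        key = (repr(e1), repr(e2))
        if key in self.cache:
            self.hits += 1
        else:
            self.cache[key] = self.compare_fn(e1, e2)
            self.misses += 1
        self._report()
        return self.cache[key]

    def _report(self):
        total = self.hits + self.misses
        with open(self.stats_file, "w") as f:
            f.write(str(self.hits / total if total else 0.0))
```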
But the total counts per query are actually low, which I guess is caused by insufficient sharing between cost_model instances. I am not sure it is safe to share, but sharing should be a simple hack.
Also, I felt like Cozy becomes faster with this. However, there is no strict validation yet.
`lru_cache` from `functools` also works.
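As a rough sketch of the `functools.lru_cache` variant (assuming the arguments passed to `_compare` are hashable; Cozy's real method has more parameters than shown here):

```python
import functools

class CostModel(object):
    """Sketch only; not Cozy's actual cost model."""

    @functools.lru_cache(maxsize=None)
    def _compare(self, e1, e2):
        # Placeholder for the real solver-backed comparison.
        return e1 <= e2

cm = CostModel()
cm._compare(1, 2)
cm._compare(1, 2)
print(CostModel._compare.cache_info())  # CacheInfo(hits=1, misses=1, ...)
```

Because `self` is part of the cache key, entries from different cost model instances stay separate automatically, at the price of the cache holding references to every instance.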
I added a memcached version -- yeah I am kinda crazy
After a few warmups for memcached, the cache hit rates for all cost_model instances are mostly 80% and up.
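A hedged sketch of what a memcached-backed version could look like, using the `pymemcache` client (the actual PoC may use a different client library or key scheme). Keys are hashed because memcached restricts key length and characters:

```python
import hashlib
from pymemcache.client.base import Client

client = Client(("localhost", 11211))

def cached_compare(compare_fn, e1, e2, extra_state=()):
    # extra_state is a placeholder for any per-instance data that should be
    # part of the key when the cache is shared across processes.
    key = hashlib.sha1(repr((e1, e2, extra_state)).encode("utf-8")).hexdigest()
    hit = client.get(key)
    if hit is not None:
        return int(hit)  # comparison result assumed to be encodable as a small integer
    result = compare_fn(e1, e2)
    client.set(key, str(result).encode("utf-8"))
    return result
```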
Sharing a cache between cost model instances is dangerous since there are member variables on each cost model that affect its behavior. Where you inserted the cache, I think that includes only `self.assumptions` and `self.funcs`. If you want a global cache you will need to include that information in the cache key as well.
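To illustrate the point about the cache key, a shared (global) cache would need to fold the per-instance state into the key, roughly like this. The attribute names `assumptions` and `funcs` come from the comment above; the `repr`-based serialization and the compare callable are placeholders, not Cozy's actual API:

```python
GLOBAL_COMPARE_CACHE = {}

def global_cache_key(cost_model, e1, e2):
    return (
        repr(e1),
        repr(e2),
        repr(cost_model.assumptions),                    # assumed attribute access
        tuple(sorted(repr(f) for f in cost_model.funcs)),
    )

def compare_with_global_cache(cost_model, compare, e1, e2):
    key = global_cache_key(cost_model, e1, e2)
    if key not in GLOBAL_COMPARE_CACHE:
        GLOBAL_COMPARE_CACHE[key] = compare(e1, e2)
    return GLOBAL_COMPARE_CACHE[key]
```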
Good point, we need to figure out what to share and what not to share later. If the acceleration is significant, it might be worth restructuring Cozy's code to make the correctness property obvious.
`CostModel._compare` is a pure function of four inputs: just think of the `self` argument as a real input of equal importance to the others.
The cost model could be a plain old function of two expressions and many other inputs, but it helps efficiency a lot to do some up-front setup using the inputs that don't change very much. In particular, sending assumptions to the solver once and reusing the same solver over and over provides a massive speedup.
This design is a tradeoff in code clarity: the cost model code itself is more difficult to follow, but client code is easier to read since all the extra parameters that the cost model needs have been abstracted away.
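As an illustration of the "pay for setup once, reuse the solver per query" pattern, here is a sketch using the raw `z3` API. Cozy has its own solver wrapper, so this is not its actual code:

```python
import z3

class ComparisonOracle(object):
    """Illustration only: expensive setup once, cheap queries afterwards."""

    def __init__(self, assumptions):
        # The assumptions are sent to the solver a single time; the same
        # solver instance is then reused for every query.
        self.solver = z3.Solver()
        for a in assumptions:
            self.solver.add(a)

    def always_holds(self, prop):
        # Each query only pushes/pops the new property instead of rebuilding
        # the whole solver state from scratch.
        self.solver.push()
        self.solver.add(z3.Not(prop))
        result = self.solver.check() == z3.unsat
        self.solver.pop()
        return result

# Usage sketch:
x = z3.Int("x")
oracle = ComparisonOracle([x > 0])
print(oracle.always_holds(x >= 1))   # True: x > 0 implies x >= 1 over the integers
```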
PoC