add redis-based global cache prototype

CozySynthesizer / cozy

The collection synthesizer

https://cozy.uwplse.org

Apache License 2.0

209 stars 18 forks source link

add redis-based global cache prototype #98

Closed izgzhen closed 5 years ago

izgzhen commented 5 years ago

Off by default, use --global_cache parameter to turn on.

anhnamtran commented 5 years ago

Does this somehow address the issue brought up in #81 with regards to sharing a cache between cost models? I believe the one I have here also takes in account for the assumptions and funcs. Although, I'm unsure whether the approach I took is very elegant (or even correct)

izgzhen commented 5 years ago

Does this somehow address the issue brought up in #81 with regards to sharing a cache between cost models?

It doesn't. Do we know what funcs and assumptions are for most of our cases, will they change a lot?

izgzhen commented 5 years ago

it looks like most of the cases the funcs are empty. assumptions varies. path_conditions varies too.

Calvin-L commented 5 years ago

Just some quick intel:

funcs holds type information about the "extern" functions declared in the specification. In practice it will be the same across all CostModel objects ever constructed since there is only one specification per Cozy invocation. However, I recommend programming defensively: there is no reason for it to always be constant in the future. In particular, I think the tests instantiate many different CostModels with different funcs.

assumptions holds the assumptions and invariants declared in the specification. This will vary a lot because each synthesis thread handles a different method---and each method may have its own set of assumptions.

path_conditions is more subtle. The cost model needs to be able to estimate costs in any Context---and each context comes with different path conditions. In practice Cozy makes one CostModel object per method, and that CostModel can compare costs of expressions in any Context.

Calvin-L commented 5 years ago

I am very interested in the hit rate of the cache you are proposing here.

It might be very low: Cozy aggressively deduplicates expressions, so each expression the cost model sees will be unique. Or, it might be very high: Cozy discards all information when synthesis restarts with a new example.

anhnamtran commented 5 years ago

@Calvin-L using the cache here where it also uses assumptions and funcs as keys, I ran Cozy on maxbag.ds for 60s on a flushed cache 6 times (run 1 empty cache, run 2 has keys from run 1, etc.):

Run 1: 47.98% hit rate, 54 improvements
Run 2: 48.24% hit rate, 54 improvements
Run 3: 48.48% hit rate, 55 improvements
Run 4: 48.51% hit rate, 55 improvements
Run 5: 48.46% hit rate, 54 improvements
Run 6: 48.62% hit rate, 57 improvements

I'm not sure why there is a dip in run number 5. This was run on CSE Miles (one of the GPUs) and hit rates were calculated with hit / (hit + misses)

anhnamtran commented 5 years ago

I made a more thorough run of Cozy on several of the examples. The results can be found here. The README shows the conditions the rates were produced under and some comments made by Zhen.