Allow more flexibility in cost of functions

egraphs-good / egglog

egraphs + datalog!

https://egraphs-good.github.io/egglog/

MIT License

459 stars 54 forks source link

Allow more flexibility in cost of functions #294

Open ricardoV94 opened 1 year ago

ricardoV94 commented 1 year ago

AFAICT costs must be constant, which limits the ability to reason about optimization.

One thing that would be interesting would be to parametrize cost based on the function inputs. For example the cost of an array elementwise addition operation could depend on the shapes of the inputs (with a default if these are not known). This could allow preferring programs were reductions are done before elementwise operations.

Interval reasoning on costs would also be nice. Even if we don't know the shape of an input, it should always be smaller after a reduction than before, and hence the cost of operating elemwise on it smaller afterwards (ignoring detail like memory layout)

ricardoV94 commented 1 year ago

Possibly related to #256

saulshanabrook commented 4 weeks ago

There is an in progress PR which attempts to address this https://github.com/egraphs-good/egglog/pull/355 Any feedback on it would be appreciated (if it would help your use case or not).

saulshanabrook commented 2 weeks ago

I just closed https://github.com/egraphs-good/egglog/pull/355, since we are not implementing this feature in core right now. details:

If you have particular use cases that would be solved by this PR or more generally would require some form of custom costs, place post a comment on the underlying issue https://github.com/egraphs-good/egglog/issues/294 with details of your use case. That is helpful to guide the discussion of if and how we want to support this type of feature.

ricardoV94 commented 2 weeks ago

One thing I realized is that eager local-based extraction may not be what I need. For instance, if I have a diagram graph I may want a locally more costly variant in both branches if a partial computation can be shared across both.

This would require coordinating extraction across the two separate branches.

saulshanabrook commented 2 weeks ago

I think one thing we could try to separate is the cost vs the extractor... There are DAG extractors which will account for reused expressions. The approach in this PR could work with that if you used a different extractor.

Like with this PR you could use a DAG extractors from extraction-gym

But yeah you could also do costs depending on which sub child you extract, which is a more general approach but harder to define declaratively.