egraphs-good / extraction-gym

benchmarking e-graph extraction
MIT License
27 stars 16 forks source link

Add good_lp based (with highs solver) extractor #24

Open averyanalex opened 9 months ago

averyanalex commented 9 months ago

This is a port of cbc extractor to good_lp rust library with highs ilp solver. It is faster and requires no runtime deps (highs links statically). Also good_lp based extractor is solver-agnostic and ilp solver can be easily changed.

Bench results:

cumulative time for ilp-cbc: 219968ms
cumulative time for ilp-highs: 145051ms
cumulative tree cost for ilp-cbc: 35474620331874
cumulative tree cost for ilp-highs: 35474620326221
cumulative dag cost for ilp-cbc: 118031
cumulative dag cost for ilp-highs: 118031
TrevorHansen commented 9 months ago

Nice work Alexander!

I don't know if now is the time to consider it, but I've made better cycle blocking function for the COIN-based ILP extractor: better cycle blocking

For a similar run time it'll yield a dag cost of around 77k rather than the current 118k - much closer to optimal (it does need a timeout though because some extractions take >10 hours).

@mwillsey do you want to review this as is, or do you want to see the better cycle blocking for the highs ilp solver at the same time?

averyanalex commented 9 months ago

Also I think there is no reason to leave an implementation that uses cbc directly, and if you want to keep cbc, then use good_lp for this. Then the implementation will be the same for cbc and highs.

So, maybe "merge" implementations, and cbc extractor will use good_lp?

TrevorHansen commented 9 months ago

Yes. Sounds good to me, too.

The only caveat is that for a separate high-performance ILP extractor we'll need stuff like the ability to read the last feasible solution when it times-out, which means that we might need to use the solver directly.

mwillsey commented 8 months ago

I think I had similar concerns last time I used good_lp (for the original ILP extractor in egg way back when). I couldn't figure out how to get timeouts and partial solutions to work. Does it have support for those things now?

averyanalex commented 8 months ago

I think I had similar concerns last time I used good_lp (for the original ILP extractor in egg way back when). I couldn't figure out how to get timeouts and partial solutions to work. Does it have support for those things now?

good_lp + highs supports timeouts, but I haven't figured out yet if it's possible to get the latest solution.

mwillsey commented 8 months ago

Okay, I think this PR will be gated on figuring that out then. A timeout isn't super useful in this scenario unless you can get a "best-so-far" solution out.

mwillsey commented 8 months ago

I'm converting this to a draft until that's figured out.

TrevorHansen commented 8 months ago

Okay, I think this PR will be gated on figuring that out then. A timeout isn't super useful in this scenario unless you can get a "best-so-far" solution out.

I was expecting the best-so-far to be super helpful and have implemented it in one of the ILP extractors. Surprisingly, I couldn't get it to help much. I've come around now, and think that it's possible that having the solver work 1/3 faster (which HiGHs seem to do) is going to exceed the advantage of being able to read the best-so-far solution. I'll get some data.

averyanalex commented 2 months ago

Okay, I think this PR will be gated on figuring that out then. A timeout isn't super useful in this scenario unless you can get a "best-so-far" solution out.

I was expecting the best-so-far to be super helpful and have implemented it in one of the ILP extractors. Surprisingly, I couldn't get it to help much. I've come around now, and think that it's possible that having the solver work 1/3 faster (which HiGHs seem to do) is going to exceed the advantage of being able to read the best-so-far solution. I'll get some data.

Any updates on this? Are we really need best-so-far solutions?