egraphs-good / extraction-gym

benchmarking e-graph extraction
MIT License
24 stars 15 forks source link

ILP cost larger than greedy cost #12

Closed yaohuicai closed 9 months ago

yaohuicai commented 9 months ago

Thank you for making this repo open-source. It is a really neat and helpful tool for egraph extraction research.

While executing the code, I've noticed an inconsistency in the results produced by different extraction methods. The cost outcomes derived from the ilp-cbc method consistently surpass those from both the greedy-dag and faster-greedy-dag methods. Theoretically, given that ILP aims for the optimal cost, it should always yield a result less than or equal to the greedy methods.

Specifically, here are some examples:

tensat/nasrnn_acyclic.json

{
    "name": "data/tensat/nasrnn_acyclic.json",
    "extractor": "greedy-dag",
    "tree": 1021.1320661730861,
    "dag": 1.0898100023314328,
    "micros": 3009220
}
{
    "name": "data/tensat/nasrnn_acyclic.json",
    "extractor": "ilp-cbc",
    "tree": 1697.1535464829212,
    "dag": 1.5102849980721658,
    "micros": 6817584
}

data/babble/list_list_hard_test_ellisk_2019-02-15T11.26.41--bench010_it15.json

{
    "name": "data/babble/list_list_hard_test_ellisk_2019-02-15T11.26.41--bench010_it15.json",
    "extractor": "greedy-dag",
    "tree": 915,
    "dag": 425,
    "micros": 13953
}
{
    "name": "data/babble/list_list_hard_test_ellisk_2019-02-15T11.26.41--bench010_it15.json",
    "extractor": "ilp-cbc",
    "tree": 1862,
    "dag": 813,
    "micros": 235790
}

Any insights or clarifications you can offer would be greatly appreciated.

mwillsey commented 9 months ago

The ILP solution included in this repo is not optimal because of cycle breaking, which removes (perhaps too many) e-nodes from the e-graph.

We should probably include an implementation of the optimal algorithm, although it will be considerably slower. The optimal algorithm forces the ILP solver to topologically sort the nodes to be extracted, which requires more variables and constraints.