Closed Eh2406 closed 6 months ago
This is quite the list of changes. I've made a first read pass. Most of things seem ok, but I was very skeptic about what I read here 2ee7c0857. Implications (=>) is not equivalence (<=>). I'll have to re-read all that at a more reasonable hour.
I made such a long list of changes so that it would be easy to call out exactly where I had made an incorrect leap of logic. That does seems suspicious. I am comforted by the assertion passing, but that doesn't mean I can prove it's valid. ... Speaking of the assertion, I used cfg(test)
not cfg(debug_assertions)
, let's see if it still works with that fixed. Yes it did. Sorry for the force push.
I'm going to set the prop tests to keep running overnight, and see if they found anything by the morning.
So here's the way I'm thinking about it at the moment, there are four kinds of versions. A version can either be contained in accumulated_intersection.intersection(start_term)
(for space called accumulated
) or not, simultaneously the version can either be contained in incompat_term
or not.
term |
type 1 | type 2 | type 3 | type 4 | |
---|---|---|---|---|---|
accumulated |
True | True | False | False | |
incompat_term |
True | False | True | False | |
accumulated.intersection(incompat_term) |
True | False | False | False | |
accumulated.intersection(incompat_term) == accumulated (aka old ) |
True | False | True | True | True |
accumulated.intersection(incompat_term.negate()) |
False | True | False | False | |
accumulated.intersection(incompat_term.negate()) == empty (aka new ) |
True | False | True | True |
(Edit: Now that I have fixed my mistakes) It looks to me like it is correct.
Having run the prop tests overnight, they still haven't found anything.
Having run the prop tests overnight, they still haven't found anything
That's great!
I'm taking some family time for the end of year so probably won't have a more in-depth review of this until the new year. I'd like to review this more in depth, but don't want to be a big blocker if you have a lot more planned during this period.
Please enjoy your holidays with no stress from this project. I suspect you will not be the only one with limited time to look at this. (I'm very excited to see how much difference this makes on the non-synthetic cases, but I must be patient.)
I discussed this with a mathematician friend tonight. I believe the transformation is correct, and is easier to understand starting a little further back. (Utilizing the shorthand accumulated
for accumulated_intersection.intersection(start_term)
) The original formulation was accumulated.subset_of(incompat_term)
. This asks if there is an element of accumulated
that is not in incompat_term
. accumulated.intersection(incompat_term.negate())
directly computes all the versions in accumulated
that are not in incompat_term
.
Great work!
main: 1fd7ec388a230936fb7c63e68120065eb8474c76 branch: 201d0b7f63d1b74bb267f3e0a40ff86b30be8f41
tf-models-nightly main
real 0m2,533s user 0m2,226s sys 0m0,303s
tf-models-nightly branch
real 0m1,235s user 0m0,949s sys 0m0,292s
bio_embeddings main
real 1m42,222s user 1m40,462s sys 0m1,651s
bio_embeddings branch
real 0m32,804s user 0m30,528s sys 0m1,740s
I discussed this with a mathematician friend tonight. I believe the transformation is correct, and is easier to understand starting a little further back. (Utilizing the shorthand
accumulated
foraccumulated_intersection.intersection(start_term)
) The original formulation wasaccumulated.subset_of(incompat_term)
. This asks if there is an element ofaccumulated
that is not inincompat_term
.accumulated.intersection(incompat_term.negate())
directly computes all the versions inaccumulated
that are not inincompat_term
.
Thanks for that!
This is some "straightforward" re-factoring to the satisfier code to take advantage of #171. This is not my first time following this line of inquiry. In previous attempts I have only saved the final solution, and then not been confident that the final solution had the same semantics as the original. So this time I kept a separate commit for each transformation. I also kept the original code around with debug asserts to make sure they had the same behavior. Each commit passes all tests. I would strongly recommend reviewing this one commit at a time, and looking at the commit message for why I thought it was correct.
Some highlights: 1299c67 goes from 2 intersections to 1 for each decision searched in the history. 5127437 goes from a linear to a logarithmic scan. 78f8fb1 and the commits that follow are just things I noticed while staring at this code and can be left for follow-up PRs if their controversial.
As for performance the normal benchmarks think this is lost in the noise. Because they don't do much backtracking and intersection is fairly cheap. The new synthetic benchmark
slow_135_0_u16_NumberVersion
improves its runtime by 82%!