Open twizmwazin opened 1 month ago
It's definitely a hold-over from Z3, but I would advise against changing Extract, as Z3 is the "industry standard" and that conceptual compatibility is important. Instead, what if we create a synonym to Extract, and have it make more sense?
On Fri, Sep 20, 2024 at 3:11 PM Kevin Phoenix @.***> wrote:
Description
The args to extract aren't very intuitive: (high (inclusive), low (inclusive), base). This seems like we copied it straight from z3, which did the same thing: https://z3prover.github.io/api/html/namespacez3py.html#a40e9429ef16134a6d9914ecdc2182e8c https://urldefense.com/v3/__https://z3prover.github.io/api/html/namespacez3py.html*a40e9429ef16134a6d9914ecdc2182e8c__;Iw!!IKRxdwAv5BmarQ!cXWDPouh_eUC_cgobFrABAUrCFHBr-DbLh7vf1D4Q7OxF1DSX6vbwlMHOIZpnq4NlZC4HFJssLTCV44kHKpUSHL4eLGY$
What's even better though is that z3 later just overloaded this function for Strings and Sequences, but when they did that they must have been in a clearer state of mind, as for that variant the args are completely different: (base, index, length). I think we should copy those args for BVs too.
I guess this is more intuitive if you are reading the little-endian form of the BV, ex Extract(7, 0, BVV(0xabcd. 16)) == BVV(0xcd, 8). Though even in this case it doesn't feel very consistent, and the logic in get_bytes reflects this. Steps to reproduce the bug
No response Environment
No response Additional context
No response
— Reply to this email directly, view it on GitHub https://urldefense.com/v3/__https://github.com/angr/claripy/issues/502__;!!IKRxdwAv5BmarQ!cXWDPouh_eUC_cgobFrABAUrCFHBr-DbLh7vf1D4Q7OxF1DSX6vbwlMHOIZpnq4NlZC4HFJssLTCV44kHKpUSCu-Eu6w$, or unsubscribe https://urldefense.com/v3/__https://github.com/notifications/unsubscribe-auth/AA2LHF27TPUXMQI7O4HQ5ATZXSMPBAVCNFSM6AAAAABOS5R5DCVHI2DSMVQWIX3LMV43ASLTON2WKOZSGUZTSNZVG4YTIMQ__;!!IKRxdwAv5BmarQ!cXWDPouh_eUC_cgobFrABAUrCFHBr-DbLh7vf1D4Q7OxF1DSX6vbwlMHOIZpnq4NlZC4HFJssLTCV44kHKpUSB_zk1XQ$ . You are receiving this because you are subscribed to this thread.Message ID: @.***>
Let's not change it as it's going to impact too much existing code, much of which is not written by us. The ship has sailed at this point.
I disagree with the general idea that it is too late to change something, holding on to tech debt because it is old leads to bad software that is hard to learn, hard to maintain, makes usage generally unpleasant and eventually kills a project when something better comes along.
But I think this is easy enough to type check at runtime that we could maintain compatibility for a while and give people advanced notice. I could change Extract to either support (bv, offset, size) or the current args, issuing a deprecation warning for the current args. And then after a year or when the rust implementation is ready, we drop the compatibility then, ideally marked with a major version release and a migration guide.
FYI, the definition of Extract arguments comes from SMT-LIB2: https://stp.readthedocs.io/en/latest/smt-input-language.html
It would be nice to keep compatibility with the “industry standard.” Supporting two combinations of arguments is a good way forward; deprecating the current one is wrong IMO. Especially since it’s not “counter-intuitive” to me after working with z3 and other solvers for so many years.
Alright yeah if Z3 is also consistent with SMT-LIB2 I think I'm in agreement with you. I'll see about adding a second set of args and maintaining both.
Description
The args to extract aren't very intuitive: (high (inclusive), low (inclusive), base). This seems like we copied it straight from z3, which did the same thing: https://z3prover.github.io/api/html/namespacez3py.html#a40e9429ef16134a6d9914ecdc2182e8c
What's even better though is that z3 later just overloaded this function for Strings and Sequences, but when they did that they must have been in a clearer state of mind, as for that variant the args are completely different: (base, index, length). I think we should copy those args for BVs too.
I guess this is more intuitive if you are reading the little-endian form of the BV, ex
Extract(7, 0, BVV(0xabcd. 16)) == BVV(0xcd, 8)
. Though even in this case it doesn't feel very consistent, and the logic in get_bytes reflects this.Steps to reproduce the bug
No response
Environment
No response
Additional context
No response