Closed zwimer closed 1 year ago
Actually, this looks like an interaction with `strings.py`'s `__init__`.

Note that `strings` edits `kwargs` here:
https://github.com/angr/claripy/blob/696f1c08c2996c5350867685462d1245070a996d/claripy/ast/strings.py#L31

`Bits`' `self.length = kwargs['length']` would be a no-op if `kwargs` were not edited between `Bits` `__init__` and `Base` `__new__`.
Suggested solution:
- Remove `kwargs['length'] *= 8` from `String`'s `__init__` (really this is part of 1)
- Remove `self.length = kwargs['length']` from `Bits` `__init__`. If this is done without doing 1 & 2 then string support will break.

This can be triggered from within `Base`'s `__new__` function: the concrete backend constructs an object, it is returned from `__new__`, and it is then edited after `__init__` runs.
Note: fixing it might require refactoring `Base`'s `__new__` method as well. Specifically, `Base` can call the concrete backend internally to construct an object, but `Backend.call` only takes in `op` and `args`. For some ops, `length` is not stored in either. For example, a `StringV` might have args `('12', 2)` but length `80000`, such as the result of a simplified `IntToStr` operation (you can find this in `test_backend_smt.py`).
Since `Backend.call` cannot accept args beyond `op` and `args`, the `StringV` to be constructed will have the size of `.args[1] * 8` (the bit-length of the python `str` held inside the `StringV`) rather than the `StringV` length of `.length`.
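To make the mismatch concrete, here is a minimal sketch in plain Python. It does not use claripy; `ToyStringV` and `toy_backend_call` are hypothetical stand-ins that only mimic the shape of the problem (a rebuild step that sees `op` and `args` but not `length`).

```python
class ToyStringV:
    """Hypothetical stand-in for a concrete string AST node (not claripy's class)."""

    def __init__(self, value, byte_len, length_bits):
        self.op = "StringV"
        self.args = (value, byte_len)  # what a backend call gets to see
        self.length = length_bits      # not recoverable from op/args alone


def toy_backend_call(op, args):
    """Rebuild an object from op and args only, as described for Backend.call."""
    if op != "StringV":
        raise NotImplementedError(op)
    value, byte_len = args
    # The only length available here is derived from args[1].
    return ToyStringV(value, byte_len, length_bits=byte_len * 8)


original = ToyStringV("12", 2, length_bits=80000)
rebuilt = toy_backend_call(original.op, original.args)
print(original.length, rebuilt.length)
```

In this toy setup the rebuilt object reports 16 bits while the original reports 80000, which is the discrepancy described above.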
Suggested fix: Make `StringV` and any other ops that fall into this category hold both lengths or only the `.length` length. Alternatively `call` could take in `kwargs`, but that feels like it is inviting abuse.
Proposed fixes:
- In `def StringV` in `ast/strings.py`, change the following line from/to: https://github.com/angr/claripy/blob/acc6e380f22fdd74af6c4839cbf8fa738af2f0fc/claripy/ast/strings.py#L154
  `result = String("StringV", (value, length), length=8*length, **kwargs)`
- Prevent `String.__init__` from changing `kwargs['length']`.
- Prevent `Bits.__init__` from changing `self.length`.

How this works:
- The `String` constructor now takes in the byte length as `arg[1]`, which means things like `Backend.call` can use the proper length rather than the length of the python string with implicit trailing zeros ignored. Additionally, the `kwargs['length']` passed to `Base.__new__` will be in bits from the start.
- `Base.__new__` and hashed

What this will change:
- `my_string.length` will be in bits; for bytes, use `my_string.str_length` or `my_string.args[1]` (both should be byte length).

Additionally, `StringS` must also pass `8*length` as its length.
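A rough sketch of the proposed convention, again in plain Python rather than claripy: `ToyString` and `toy_string_v` are hypothetical names, and the `str_length` property here is only an assumption about how the byte length could be exposed. The point is that the bit length is computed once by the caller and never rewritten in `__init__`.

```python
class ToyString:
    """Hypothetical node: `length` arrives in bits and is never changed afterwards."""

    def __init__(self, op, args, length):
        self.op = op
        self.args = args
        self.length = length  # bits, set exactly once

    @property
    def str_length(self):
        # Byte length, kept as the second argument under the proposal.
        return self.args[1]


def toy_string_v(value, length):
    """Mirror of the proposed StringV line: byte length in args, bit length in kwargs."""
    return ToyString("StringV", (value, length), length=8 * length)


s = toy_string_v("12", 40)
print(s.length, s.str_length, s.args[1])
```

With this layout a backend that only sees `op` and `args` can still recover the bit length as `8 * args[1]`.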
When creating a `Bits` AST object, the `__init__` function sets `self.length`; in some cases this does not change anything, but in some cases the newly set length is different from the old length.

This occurs after `Base`'s `__new__` runs, and thus it might change the internals of a base after `Base` calculates the hash of the object to store it inside the lookup cache.

Note that `.length` is a `HASHCONS` variable; it affects the hash.
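The hazard can be reproduced in a few lines of plain Python. This is only an illustrative sketch under simplified assumptions: `ToyBase`, `ToyBits`, and `ToyString` are hypothetical classes, and the cache is a plain dict keyed by the hashed fields, which is not how claripy's real cache is implemented.

```python
class ToyBase:
    """Hypothetical hash-consed base: __new__ computes the key and caches the object."""

    _cache = {}

    def __new__(cls, op, args, length):
        key = (cls.__name__, op, args, length)  # length is part of the key
        cached = ToyBase._cache.get(key)
        if cached is not None:
            return cached
        self = super().__new__(cls)
        self.op, self.args, self.length = op, args, length
        self._key = key
        ToyBase._cache[key] = self
        return self


class ToyBits(ToyBase):
    def __init__(self, op, args, length):
        # Runs after __new__; harmless only if `length` is unchanged.
        self.length = length


class ToyString(ToyBits):
    def __init__(self, op, args, length):
        length *= 8  # the value is edited between __new__ and ToyBits.__init__
        super().__init__(op, args, length)


s = ToyString("StringV", ("12", 2), length=2)
current_key = (type(s).__name__, s.op, s.args, s.length)
print(s._key[-1], s.length, current_key in ToyBase._cache)
```

Here the object is cached under a key containing length 2, but after `__init__` it reports length 16, so a key rebuilt from its current fields no longer matches its cache entry.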