Open hajduakos opened 4 years ago
Gazer converts integer type (short, long, signed, unsigned, etc.) variables in the given C code to int in the generated CFA. Theta 1.7 handles these as large int literals, so the new functionality seems to work without any internal change in Gazer.
There is a nuance after 1<<63ull: it does not fit the range of int64_t (as it is used in IntLiteralExpr). So the only problem is constant unsigned long long values larger than that can fit in a signed long long.
I'd reopen this as a bug or create a new issue in Gazer, whichever is more appropriate :)
Reopening seems cleaner to me in this case.
Thanks @radl97 for pointing this out. What could you imagine as a solution? Replacing Gazer's internal representation with some bigint (e.g. from boost)? Furthermore, is this also an issue for the BMC backend?
LLVM has some solution, I think. For example APInteger maybe, which stores bitwidth with the value.
Note: I would skip Boost, as it is only used on the outer layers, mostly for filesystem handling.
Without giving much thought, I think BMC is affected when using Integer representations (instead of the default(?) bitvectors).
Disclaimer: Both my claims against BMC and Theta modules are untested, I've only seen how IntLitExpr stores integers, which led me to think the problem exists. Further investigation may show otherwise.
I gladly take up the quest, if you'd like :) (both verifying and solving the issue)
@radl97 if you feel like doing it, please go ahead. You can create a new branch and when you're done submit a PR. It would also be nice to come up with a simple test that fails currently, but will be fixed by the patch.
@radl97 Indeed, we currently store large int literals in int64_t
. I think we can handle this either with llvm::APSInt
(like APInt
, but also stores sign information) or boost::multiprecision
. Both solutions are fine by me.
Note that you probably will have to update the expression hashing mechanism for int literals in GazerContextImpl.h
, otherwise you might get a good number of undebuggable template errors :) The default implementation for primitive literals uses llvm::hash_value
, probably we will need a specialization for llvm::hash_value
and/or expr_hasher
(if deciding equilitiy is not trivial).
Previously Theta could only parse 32-bit integer literals, but with 1.7 (see #29) it should not be a limitation. We should check if Gazer has to be modified internally to be able to produce larger literals.