You're right that SBV simply uses an Integer to represent both unbounded integers and bit-vectors, regardless of their size or signedness. The CVal type is where you want to look. You can add a new constructor there, i.e.:
data CVal = CAlgReal  !AlgReal             -- ^ Algebraic real
          | CInteger  !Integer             -- ^ Bit-vector/unbounded integer
          | CFloat    !Float               -- ^ Float
          | CDouble   !Double              -- ^ Double
          | CChar     !Char                -- ^ Character
          | CString   !String              -- ^ String
          | CList     ![CVal]              -- ^ List
          | CSet      !(RCSet CVal)        -- ^ Set. Can be regular or complemented.
          | CUserSort !(Maybe Int, String) -- ^ Value of an uninterpreted/user kind. The Maybe Int shows index position for enumerations
          | CTuple    ![CVal]              -- ^ Tuple
          | CMaybe    !(Maybe CVal)        -- ^ Maybe
          | CEither   !(Either CVal CVal)  -- ^ Disjoint union
would become:
data CVal = CAlgReal  !AlgReal             -- ^ Algebraic real
          | CWord     !Word                -- ^ Bit-vector
          | CInteger  !Integer             -- ^ Unbounded integer
          | CFloat    !Float               -- ^ Float
          | CDouble   !Double              -- ^ Double
          | CChar     !Char                -- ^ Character
          | CString   !String              -- ^ String
          | CList     ![CVal]              -- ^ List
          | CSet      !(RCSet CVal)        -- ^ Set. Can be regular or complemented.
          | CUserSort !(Maybe Int, String) -- ^ Value of an uninterpreted/user kind. The Maybe Int shows index position for enumerations
          | CTuple    ![CVal]              -- ^ Tuple
          | CMaybe    !(Maybe CVal)        -- ^ Maybe
          | CEither   !(Either CVal CVal)  -- ^ Disjoint union
Assuming this Word type you're talking about can handle arbitrary-size words somehow, this wouldn't really be too hard to implement, except you'd need to touch a lot of code. I'd say give it a shot. My expectation is that Integer is already quite "optimized" and it would be hard to beat its performance. So, profile as you go to see if it'll actually save you anything.
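For illustration only, here's one shape such a constructor's payload could take if you went the fixed-width route. The CWord name and the use of data-dword are assumptions made up for this sketch, not SBV's design:

-- Hypothetical sketch: a concrete word carrying its own fixed-width
-- representation, using data-dword for the larger sizes. Not SBV code.
import Data.Word       (Word8, Word16, Word32, Word64)
import Data.DoubleWord (Word128, Word256)

data CWord = CW8   !Word8
           | CW16  !Word16
           | CW32  !Word32
           | CW64  !Word64
           | CW128 !Word128
           | CW256 !Word256
  deriving (Eq, Show)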
I wonder if adding the following line to normCV would do anything:
normCV :: CV -> CV
normCV c@(CV (KBounded signed sz) (CInteger v)) = c { cvVal = CInteger norm }
  where norm | sz == 0   = 0
             | signed    = let rg = 2 ^ (sz - 1)
                           in case divMod v rg of
                                (a, b) | even a -> b
                                (_, b)          -> b - rg
             | v < 2^sz  = v                    -- New line to avoid modulus
             | True      = v `mod` (2 ^ sz)
normCV c@(CV KBool (CInteger v)) = c { cvVal = CInteger (v .&. 1) }
normCV c = c
It would also be good to cache 2^sz there; I'd expect that would run quickly on any modern architecture, but you never know.
I'd start by playing with normCV to see if you can make it go faster. (Caching 2^sz and avoiding mod seem to be the obvious things to do.) This might be a cheaper change and help out with performance.
If that doesn't get you anywhere, then look into adding a new CVal constructor. That's how I'd approach this.
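To make the caching idea concrete, here's a rough sketch of how the bounded case could precompute the power once and mask instead of taking a modulus. This is not SBV's actual code; normBounded is a name made up for the illustration:

import Data.Bits (shiftL, (.&.))

-- Illustrative only: normalize v into a bit-vector of width sz, computing
-- the power of two once and using a mask instead of `mod`.
normBounded :: Bool -> Int -> Integer -> Integer
normBounded signed sz v
  | sz == 0   = 0
  | signed    = if m >= half then m - pow else m   -- fold the top bit back into the sign
  | otherwise = m
  where pow  = (1 :: Integer) `shiftL` sz
        half = 1 `shiftL` (sz - 1)
        mask = pow - 1
        m    = v .&. mask                          -- low sz bits, same as v `mod` pow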
@MrChico Here's another thing to try. In general:
x `mod` (2 ^ sz)
is equivalent to:
x .&. (((1 :: Integer) `shiftL` sz) - 1)
at least when x is non-negative. (The negative case might work too, actually; I need to do some thinking.) So, I'm wondering if replacing the modulus in the last case with the above expression might recover some of the performance for you.
Can you give it a try and see what you find out?
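If it helps, here's a minimal standalone check of that equivalence, assuming QuickCheck is available; maskNorm and the property name are made up for this sketch. It also exercises negative inputs:

import Data.Bits       (shiftL, (.&.))
import Test.QuickCheck (quickCheck)

-- Mask-based normalization, the candidate replacement for `mod` above.
maskNorm :: Integer -> Int -> Integer
maskNorm x sz = x .&. (((1 :: Integer) `shiftL` sz) - 1)

-- Does it agree with `mod` for arbitrary x (negatives included) at small widths?
prop_maskMatchesMod :: Integer -> Bool
prop_maskMatchesMod x = and [ maskNorm x sz == x `mod` (2 ^ sz) | sz <- [0 .. 256] ]

main :: IO ()
main = quickCheck prop_maskMatchesMod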
@doyougnu Would it be possible for you to benchmark the above change and see if it gains us anything?
Another cheap thing to do: add an INLINE pragma to normCV. We should do that regardless; I think it'd help in all cases at the expense of a minor(?) increase in code size.
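For reference, the pragma just sits next to the definition; here's a minimal stand-in illustration (norm is a made-up function, since normCV itself lives inside SBV):

-- Stand-in example of INLINE placement; in SBV the pragma would accompany normCV.
{-# INLINE norm #-}
norm :: Int -> Integer -> Integer
norm sz v = v `mod` (2 ^ sz)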
I have experimented with your suggestions @LeventErkok, and here are some things I found:
When adding the case | v < 2^sz = v to normCV, I got a failure:
Exception: Numeric.showIntAtBase: applied to negative number -1
when running some tests. This doesn't seem to have been caught by the sbv test suite, but I did not investigate further what goes wrong here.
Unfortunately, I did not see any performance increase on the measurements I'm using when inlining normCV. This was when benchmarking my program which relies on sbv.
Which benchmarks can I use to test the performance of sbv directly?
Ah, quite right. That line by itself wouldn't work. The correct one would be:
| v >= 0 && v < 2^sz = v
But then, I'm not sure you're really saving anything by that; it probably won't have much impact.
We should really do the x .&. (((1 :: Integer) `shiftL` sz) - 1) trick, but I need to check if that works when x is negative. Do you know if it does?
I'm surprised the INLINE directive didn't do anything. Are you compiling with at least -O2? That might be crucial.
For benchmarking, I'll defer to @doyougnu as he's the expert on all matters regarding benchmarking SBV.
@MrChico I just pushed https://github.com/LeventErkok/sbv/commit/dadc7195b34c7e4a65be1c32a05b883857277ced, that I hope should make things a little faster. (At least not slow them down.)
Try compiling with -O2 to make sure the inlining actually does kick in. Do you see any improvements?
@LeventErkok you are right, it turns out I was not compiling with -O2. Doing so does seem to yield a big difference!
Here's a summary of six runs of a concrete calculation with a lot of Word256 arithmetic with 3c71a10b:
records:6
sum:36.51
avg:6.085
std:0.0965660395790871
med:6.095
max:6.22
min:5.92
and here is another six runs, using dadc7195 instead:
records:6
sum:34.23
avg:5.705
std:0.104363148029616
med:5.74
max:5.79
min:5.48
As far as low-hanging fruit goes, I'd say that's a win!
> We should really do the x .&. (((1 :: Integer) `shiftL` sz) - 1) trick, but I need to check if that works when x is negative. Do you know if it does?
I believe that does work. We are dealing with integers represented by two's complement here, right? Then the masking should preserve the lower bits, and the complement will be translated to the new base.
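A quick concrete check of that reasoning (illustrative, width 8): both expressions give 251 for -5, since (.&.) on Integer behaves as if the operands were in two's complement of infinite width:

import Data.Bits (shiftL, (.&.))

main :: IO ()
main = do
  let v  = -5 :: Integer
      sz = 8  :: Int
  print (v `mod` (2 ^ sz))                          -- 251
  print (v .&. (((1 :: Integer) `shiftL` sz) - 1))  -- 251 as well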
Cool. 5% is nice for a one-line change. But further inspection and benchmarking/profiling can perhaps identify further low-hanging fruit. Please do report what you find out!
@MrChico you can use the SBVBenchmark cabal target to benchmark sbv directly. I have some description of the intended workflow in SBVBenchSuite/SBVBenchmark.hs, but essentially you would run cabal v2-run SBVBenchmark or cabal bench to generate a set of benchmarks, which will be put in SBVBenchSuite/BenchResults. Then you would make your changes and run another benchmark to generate another file. I've been using the bench-show package to compare the benchmarks, but this package expects the benchmarks to be in the same file, so you'll have to append the first file with the runs from the second file. Then you can call cabal v2-repl SBVBenchmark and run SBVBenchSuite/Utils/SBVBenchFramework.compareBenchMarksCLI <your-file-here> to compare the benchmarks.
FYI I just pushed some changes to the benchmark suite to improve the user interface. In sbv, you should now be able to do the following:
- cabal v2-run SBVBenchmark: this will run the benchsuite and generate a file in SBVBenchSuite/BenchResults
- cabal v2-run SBVBenchmark again: now you have two files
- cabal v2-repl SBVBenchmark, then at the prompt: compareBenchmarks old-file new-file
For example:
*Utils.SBVBenchFramework> compareBenchmarks "SBVBenchSuite/BenchResults/2020-07-27-17:04:04.261168259-UTC.csv" "SBVBenchSuite/BenchResults/2020-07-27-17:23:53.048061816-UTC.csv"
(Mean)
Benchmark default(0)(μs) default(1) - default(0)(%)
---------------------------------------------------------- -------------- --------------------------
Puzzles//NQueens.NQueens 5 57403.71 -59.83
Miscellaneous//SubsetEquality.Transitivity 32854.40 -57.71
Puzzles//NQueens.NQueens 1 10608.36 -52.75
Puzzles//NQueens.NQueens 3 13544.35 -44.09
Miscellaneous//SubsetEquality 27863.06 -27.70
Miscellaneous//Tuple 176469.91 -24.81
Puzzles//NQueens.NQueens 2 9359.86 -24.80
Uninterpreted//test 28321.36 -23.95
Puzzles//NQueens.NQueens 4 20939.26 -21.48
Miscellaneous//DeMorgans.Cap 25356.26 -17.93
Uninterpreted//SArray 8457.12 -16.99
Optimizations/VM.allocate 22352.01 -15.79
Puzzles//LadyAndTigers 27418.50 -15.06
Miscellaneous//InclusionIsPO 23241.23 -13.55
Puzzles//Puzzles.SendMoreMoney 48744.73 -12.45
...
I wonder if https://github.com/clash-lang/clash-compiler/pull/1132 can provide some ideas?
@gergoerdi Definitely worth giving a try. I suspect clash doesn't need unbounded integers, so they can get away with naturals. Unfortunately, SBV does need unbounded signed integers, so we'd have to separate the two cases. This is more work, but might pay off for programs that make extensive use of bit-vector arithmetic. PRs and experiments are most welcome!
I'm closing this issue, as it is really hard to imagine what can be done without an explicit performance test case. Using different concrete types is always a fine idea. But fundamentally SBV is all about symbolic computation, and it has never been intended to run things "quickly" when everything is concrete.
I'm closing this ticket as there's no real actionable item for the time being. Feel free to reopen if there's a convincing use case that can benefit from a rewrite in a significant way.
Performance seems to be a recent theme in these issues, so it looks like I'll stay on track.
I did a quick profiling of my Haskell application, which now integrates sbv to implement symbolic execution, and found that the function normCV seems to be a significant cost centre when executing on concrete inputs.
Looking at the sbv code a bit, it seems that this might be due to the fact that bit-vectors are modeled using Integer, where operations need to be refitted into the proper size, instead of utilizing native Word operations, such as those provided by Data.Word or Data.DoubleWord (https://hackage.haskell.org/package/data-dword). My library deals with quite large words, and it seems performance is suffering a bit due to the constant Integer normalization.
Of course, I can quite easily solve this for my specific use case by introducing a new type CSWord n = Symbolic (SWord n) | Concrete (Word n) and treating the concrete and symbolic cases differently, but this is something I kind of expected sbv to already take care of, so perhaps it would be of use to others as well to deal with this here?
I'm wondering if it might be a good idea to add Word (parametric in size?) as another possible value of CVal, to increase performance when dealing with concrete words? I can possibly help with implementation if I can get some pointers on how one would go about this.
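For what it's worth, here is a minimal sketch of that wrapper idea at a single width, assuming data-dword's Word256 and SBV's SWord 256; CSWord256, addCS, and toS are names invented for this illustration:

{-# LANGUAGE DataKinds #-}
import Data.SBV        (SWord)
import Data.DoubleWord (Word256)

-- Keep concrete values in a native representation and only fall back to the
-- symbolic one when needed.
data CSWord256 = Concrete !Word256
               | Symbolic !(SWord 256)

-- Addition that stays native when both operands are concrete.
addCS :: CSWord256 -> CSWord256 -> CSWord256
addCS (Concrete a) (Concrete b) = Concrete (a + b)
addCS a            b            = Symbolic (toS a + toS b)
  where toS (Concrete x) = fromIntegral x
        toS (Symbolic s) = s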