haskell / vector

An efficient implementation of Int-indexed arrays (both mutable and immutable), with a powerful loop optimisation framework .
Other
362 stars 140 forks source link

unboxed Vector Bool seems to be 8 times slower than unboxed Array Int Bool #495

Open GeorgeCo opened 1 week ago

GeorgeCo commented 1 week ago

unboxed Vector Bool seems to be 8 times slower than unboxed Array Int Bool. As far as I can tell it is due to the performance difference in accessing array/vector elements. The following is with ghc 9.8.2 on an Intel Mac but I get about the same results on an M2 MacBook with ghc 9.10.1

diff prob171a.hs prob171v.hs
1d0
< 
8c7
< import qualified Data.Array.Unboxed as AU
--
> import qualified Data.Vector.Unboxed as VU
11a11
> 
13,14c13,14
< psqs :: Int -> AU.Array Int Bool
< psqs n = AU.listArray (0,n) (map (psq . ssd) [0..n])
--
> psqs :: Int -> VU.Vector Bool
> psqs n = VU.fromList (map (psq . ssd) [0..n])
29a30,33
> 
> 
>              
> 
36c40
<       | bv AU.! ssd i = f (i + 1) (res + i)
--
>       | bv VU.! ssd i = f (i + 1) (res + i)
gcolpitts@Georges-Mini haskell % ghc -O prob171a.hs
ghc -O prob171a.hs
Loaded package environment from /Users/gcolpitts/.ghc/x86_64-darwin-9.8.2/environments/default
[1 of 2] Compiling Main             ( prob171a.hs, prob171a.o ) [Source file changed]
[2 of 2] Linking prob171a [Objects changed]
ld: warning: ignoring duplicate libraries: '-lm'
ld: warning: search path '/opt/local/lib/' not found
gcolpitts@Georges-Mini haskell % ghc -O prob171v.hs
ghc -O prob171v.hs
Loaded package environment from /Users/gcolpitts/.ghc/x86_64-darwin-9.8.2/environments/default
[1 of 2] Compiling Main             ( prob171v.hs, prob171v.o ) [Source file changed]
[2 of 2] Linking prob171v [Objects changed]
ld: warning: ignoring duplicate libraries: '-lm'
ld: warning: search path '/opt/local/lib/' not found
gcolpitts@Georges-Mini haskell % ./prob171a +RTS -s
./prob171a +RTS -s
367437474103403
         103,112 bytes allocated in the heap
           3,272 bytes copied during GC
          44,328 bytes maximum residency (1 sample(s))
          25,304 bytes maximum slop
               6 MiB total memory in use (0 MiB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0         0 colls,     0 par    0.000s   0.000s     0.0000s    0.0000s
  Gen  1         1 colls,     0 par    0.000s   0.000s     0.0002s    0.0002s

  INIT    time    0.002s  (  0.002s elapsed)
  MUT     time    8.442s  (  8.444s elapsed)
  GC      time    0.000s  (  0.000s elapsed)
  EXIT    time    0.000s  (  0.010s elapsed)
  Total   time    8.444s  (  8.457s elapsed)

  %GC     time       0.0%  (0.0% elapsed)

  Alloc rate    12,214 bytes per MUT second

  Productivity 100.0% of total user, 99.9% of total elapsed

gcolpitts@Georges-Mini haskell % ./prob171v +RTS -s
./prob171v +RTS -s
367437474103403
          97,888 bytes allocated in the heap
           3,272 bytes copied during GC
          44,328 bytes maximum residency (1 sample(s))
          25,304 bytes maximum slop
               6 MiB total memory in use (0 MiB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0         0 colls,     0 par    0.000s   0.000s     0.0000s    0.0000s
  Gen  1         1 colls,     0 par    0.000s   0.000s     0.0002s    0.0002s

  INIT    time    0.003s  (  0.003s elapsed)
  MUT     time   61.850s  ( 62.264s elapsed)
  GC      time    0.000s  (  0.000s elapsed)
  EXIT    time    0.000s  (  0.012s elapsed)
  Total   time   61.854s  ( 62.280s elapsed)

  %GC     time       0.0%  (0.0% elapsed)

  Alloc rate    1,582 bytes per MUT second

  Productivity 100.0% of total user, 100.0% of total elapsed

gcolpitts@Georges-Mini haskell % 

prob171v.hs.txt prob171a.hs.txt

konsumlamm commented 1 week ago

The reason is that bv is inlined in the vector version, causing a new vector to be allocated each time bv is used. This can be fixed by adding a {-# NOINLINE bv #-} annotation.

Since you're using Vector Bool as bit vectors, take a look at https://hackage.haskell.org/package/bitvec. It provides both faster and more memory efficient bit vectors.

GeorgeCo commented 1 week ago

Thanks for the quick and informative response! It certainly explains the performance difference but I'm surprised that it doesn't show up in bytes allocated. In any case I think we're agreed that this is a bug. IMHO and my less than expert knowledge it seems like a serious one. Thanks again!

GeorgeCo commented 1 week ago

I verified that, as you said , {-# NOINLINE bv #-} is a workaround for the problem

konsumlamm commented 6 days ago

I don't think this is a bug in vector. It could be a GHC bug though.

GeorgeCo commented 6 days ago

Thanks! Any ideas why the bug only happens with Vector and not with Array?