Closed gksato closed 3 months ago
Thanks!
I'll look into this tomorrow but at a glance nextPermutationByLt
miss INLINE
pragma which means it won't be specialized. Is this intentional?
I just aligned it with the original implementation; I assumed it was intentional. Is it better to add it before your review?
I just aligned it with the original implementation
I see. I missed that somehow.
Is it better to add it before your review?
No need. I'll benchmark it and going to tweak code a bit anyway
Yes. This is an optimization. I used very simple benchmark
permute :: Int -> Int -> IO (Vector Int)
permute sz n_it = do
vec <- MV.generate sz id
replicateM_ n_it $ MV.nextPermutation vec
V.unsafeFreeze vec
With vector size 20 and n_it
varying in range 1000-10000. See Shimuuar/vector branch bench-next-permutation
for very quick and dirty code.
I've run five benchmarks:
[00.Base]
Baseline from current master[01.strict]
Added strictness. First change from #497 [02.stToPrim]
Added stToPrim[03.INLINE]
Added INLINE pragma[04.gksato]
Version from this PRSo for unboxed vector of ints stToPrim
gives ~4× runtime speedup and your optimization another 3×. Of course inlined and specialized version beats them all.
Do you plan to include functions proposed in #499 in this PR?
Do you plan to include functions proposed in https://github.com/haskell/vector/issues/499 in this PR?
I was not sure if it was appropriate to put optimization and API addition in one PR, but I'm happy to do it if you propose so! Should CHANGELOG
be added now?
Since I forgot to tweak the comments in Data.Vector.*.Mutable
in 4ac750f, I'm going to force-push an updated history.
I added INLINE
to the internal routine nextPermutationByLt
, because cabal v2-bench
with a line of bench "permute" $ whnfIO 20 useSize
proved it 10x faster with INLINE
(I haven't tweaked vector-bench-papi
, because I'm on MacOS).
Oh, and I added:
Data.Vector.*.Mutable.nextPermutationBy :: Constraint => (e -> e -> Ordering) -> v (PrimState m) a -> m Bool
Data.Vector.*.Mutable.prevPermutation :: (Constraint, Ord e) => v (PrimState m) a -> m Bool
Data.Vector.*.Mutable.prevPermutationBy :: Constraint => (e -> e -> Ordering) -> v (PrimState m) a -> m Bool
The TODOs we have now are... tests, benchmarks, and changelog? Are they all?
I've written some tests on a separate branch (gksato:optimize-nextperm-tidyup), since I don't know if it is wanted. I'd fast-forward the branch (gksato:optimize-nextperm) of this pull request to include that commit whenever any of you request it!
I finally found time to review PR. Please add tests and this PR is good to go. It would be nice to add changelog entry as well and whether you want to add benchmarks or not is up to you.
@Shimuuar OK, I'll do that soon. I've been writing some benchmark, so I might add it if implementation doesn't take much time. For the purpose of the changelog and @since
pragma, what should be the version? I assume 0.13.2.0
, but I'd like to be sure that it'll not be 0.13.1.1
?
Yes 0.13.2.0
. Adding new definitions to module requires at least minor bump according to PVP. Patch releases are for bugfixes and documentation
Added tests, a changelog entry, @since
annotations, and benchmarks. Force-pushed some commit message rewording because @since
without quotes refers to someone's name on a GitHub!
Note that vector-bench-papi
hasn't been updated, because they say it doesn't go nicely on MacOS (I haven't tried it myself). However, I assume that applying the same diff as vector/benchmarks/Main.hs
to vector-bench-papi/benchmarks/Main.hs
should work.
That's quite formidable battery of tests all right! It's more thorough than rest of benchmark suite :)
I'm unhappy with inlining of worker function. It's too large. But we're running into GHC's limitations since we don't way to ask GHC to generate specializations for it and its callers automatically.
Excellent pull request. Thank you!
Note that vector-bench-papi hasn't been updated
Yes ..-papi
package should modified same way. And that's the point when you start to want to use backpack in order to be able to swap bencmarking engine. I implemented poor man's version in math-functions. Sadly backpack never caught on
P.S. @since
. That poor guy. He must hate haskell.
I appreciate your work on review and merge, thank you!
Oh, yes. I haven't noticed this would of course be a nice use case for backpack. I imagined what if we try to comply DRY without backpack... urgh, it's a mess!
Sadly backpack never caught on
I truly agree with you.
I've noticed that the new version is on its way, which I'm glad to see. I appreciate your work. However, I would like to poke you about a small, minute detail:
Note that vector-bench-papi hasn't been updated
Yes
..-papi
package should modified same way.
Can we leave it as it is? I was just too lazy to build a Docker container to run vector-bench-papi
on my MacOS and I didn't modify it in this PR.
Thanks for reminding me. I totally forgot. I'll fix it after making release. It's fine since vector-bench-papi
is not published on hackage anyway
I see, I leave it to you! Thanks for your good maintaining work!
This implements some optimization of
nextPermutation
fromData.Vector.Generic.Mutable
, and supercedes #497. The main content of this re-implementation is the following two points:stToPrim
. This allows the compiler to optimize the code better.Also, this adds some clarification on doc comments;
nextPermutation
does not update the source vector when the given vector is the last permutation.Update
This also adds the following API: