coercion of the user-supplied functions for speed?

imz commented 9 years ago

Would it make sense to use coercion of the user-supplied transforming functions in the implementation of ala for speed?

One thing worries me: I had the impression that newtype-generics works for both newtypes and similar data types such as data F a = F a, whereas coercion is applicable only to real newtypes. Can the generics (or some other) machinery distinguish between them?
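To make the concern concrete, here is a minimal sketch using only base. The names Wrapped and Boxed are made up for illustration and do not come from the package. It shows that coerce is accepted for a real newtype, while the look-alike data type needs an explicit pattern match.

```haskell
import Data.Coerce (coerce)

-- A real newtype: same runtime representation as the wrapped value.
newtype Wrapped a = Wrapped a

-- A similar-looking data type: it adds a box, so it is not Coercible.
data Boxed a = Boxed a

-- Accepted: GHC can solve Coercible [Wrapped Int] [Int].
unwrapAll :: [Wrapped Int] -> [Int]
unwrapAll = coerce

-- Writing `unboxAll = coerce` here would be rejected by the type
-- checker, so the data type has to be unwrapped element by element.
unboxAll :: [Boxed Int] -> [Int]
unboxAll = map (\(Boxed x) -> x)
```

In GHCi, unwrapAll [Wrapped 1, Wrapped 2] and unboxAll [Boxed 1, Boxed 2] both give [1,2]; only the first avoids traversing the list.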

imz commented 9 years ago

Or perhaps coercions are simply an alternative to this package. They are similarly constrained by a class, though not Newtype but one for coercible data (Coercible)...

jcristovao commented 9 years ago

Yes, they probably are... To be honest, I don't have much time right now to devote to investigating this, but one possible option would be to have something that, depending on the GHC version, would use either this generics-based solution or coerce (which appeared in GHC 7.8, I believe).
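A rough sketch of what such version-dependent code might look like, assuming the CPP extension and the standard __GLASGOW_HASKELL__ macro. The function sumInts is a hypothetical stand-in, not something the package provides, and it uses Sum from base rather than the generic machinery.

```haskell
{-# LANGUAGE CPP #-}
import Data.Monoid (Sum(..))
#if __GLASGOW_HASKELL__ >= 708
import Data.Coerce (coerce)
#endif

-- Hypothetical helper: sums a list through the Sum monoid.
sumInts :: [Int] -> Int
#if __GLASGOW_HASKELL__ >= 708
-- GHC 7.8 and later: reinterpret mconcat at the unwrapped types.
sumInts = coerce (mconcat :: [Sum Int] -> Sum Int)
#else
-- Older compilers: wrap and unwrap explicitly.
sumInts = getSum . mconcat . map Sum
#endif
```

Both branches compute the same result; only the way the newtype is introduced and eliminated differs.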

One thing worries me: (...) Can the generics (or some other) machinery distinguish between them?

Yes, you are right! And to be honest, I don't know the answer to that question; I think it may not be possible.

Feel free to submit a PR noting this limitation of the package in the documentation - or a fix, if you find one.

Thanks
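On the question of telling the two apart: as far as I know, the Datatype class in GHC.Generics has an isNewtype method, which may be enough for a check at the value level. The sketch below assumes DeriveGeneric; Wrapped and Boxed are illustrative names only, and whether this could be turned into a compile-time restriction inside the package is not something the sketch shows.

```haskell
{-# LANGUAGE DeriveGeneric #-}
import GHC.Generics (Generic, from, isNewtype)

-- A real newtype and a look-alike data type, both with derived Generic.
newtype Wrapped a = Wrapped a deriving Generic
data Boxed a = Boxed a deriving Generic

-- isNewtype inspects the datatype metadata at the top of the generic
-- representation, so any value of the type will do as an argument.
wrappedIsNewtype :: Bool
wrappedIsNewtype = isNewtype (from (Wrapped ()))

boxedIsNewtype :: Bool
boxedIsNewtype = isNewtype (from (Boxed ()))
```

Here wrappedIsNewtype evaluates to True and boxedIsNewtype to False.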

sjakobi commented 7 years ago

I have created a small benchmark that compares several variations of summing a list, including via ala and coercions (results below). It turns out that with GHC-8.2.1 they all perform very similarly. With GHC-8.0, however, ala has a huge overhead: roughly 145 to 163 ns, versus about 44 ns for manual wrapping and unwrapping.

Given that the situation seems to be basically fixed by the GHC developers, I probably won't try using coercions. If anyone's interested in improving the situation with GHC-8.0, however, I'll gladly accept PRs or other contributions.
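For readers without the benchmark source at hand, the base-only variants named in the results could look roughly like the sketch below. This is a guess at their shape, not the actual benchmark code: it uses Sum from Data.Monoid in place of the MySumDerive and MySumManual types, the function names are made up, and the ala variants are left out because they need the package itself.

```haskell
import Data.Coerce (coerce)
import Data.Monoid (Sum(..))

-- "foldMap/manual wrap & unwrap": constructor in, field accessor out.
sumManual :: [Int] -> Int
sumManual = getSum . foldMap Sum

-- "foldMap/coerce": the same fold, with coerce replacing both the
-- constructor and the accessor.
sumFoldMapCoerce :: [Int] -> Int
sumFoldMapCoerce xs = coerce (foldMap (coerce :: Int -> Sum Int) xs)

-- "coerce . mconcat . coerce": coerce the whole list at once, then
-- coerce the combined result back.
sumMconcatCoerce :: [Int] -> Int
sumMconcatCoerce =
  (coerce :: Sum Int -> Int) . mconcat . (coerce :: [Int] -> [Sum Int])

-- "Prelude.sum": the baseline.
sumPrelude :: [Int] -> Int
sumPrelude = sum
```

All four agree on the result for the benchmark input [1..5], namely 15; the timings below only measure how well GHC optimises each formulation.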

GHC-8.2.1

benchmarking [1..5 :: Int]/foldMap/ala MySumDerive
time                 39.56 ns   (39.51 ns .. 39.64 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 39.42 ns   (39.37 ns .. 39.51 ns)
std dev              214.5 ps   (131.4 ps .. 367.7 ps)

benchmarking [1..5 :: Int]/foldMap/ala MySumManual
time                 39.58 ns   (39.53 ns .. 39.63 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 39.39 ns   (39.36 ns .. 39.43 ns)
std dev              122.1 ps   (94.35 ps .. 161.9 ps)

benchmarking [1..5 :: Int]/foldMap/manual wrap & unwrap
time                 39.98 ns   (39.87 ns .. 40.09 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 39.78 ns   (39.73 ns .. 39.86 ns)
std dev              215.2 ps   (161.1 ps .. 279.1 ps)

benchmarking [1..5 :: Int]/foldMap/coerce
time                 45.85 ns   (45.81 ns .. 45.88 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 45.70 ns   (45.66 ns .. 45.74 ns)
std dev              149.3 ps   (123.5 ps .. 187.2 ps)

benchmarking [1..5 :: Int]/coerce . mconcat . coerce
time                 44.38 ns   (44.36 ns .. 44.40 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 44.21 ns   (44.17 ns .. 44.24 ns)
std dev              111.8 ps   (94.13 ps .. 139.0 ps)

benchmarking [1..5 :: Int]/Prelude.sum
time                 35.85 ns   (35.65 ns .. 36.33 ns)
                     0.999 R²   (0.998 R² .. 1.000 R²)
mean                 35.63 ns   (35.52 ns .. 36.01 ns)
variance introduced by outliers: 23% (moderately inflated)

GHC-8.0

benchmarking [1..5 :: Int]/foldMap/ala MySumDerive
time                 163.0 ns   (162.2 ns .. 164.1 ns)
                     0.999 R²   (0.999 R² .. 1.000 R²)
mean                 164.0 ns   (163.0 ns .. 167.3 ns)
std dev              5.884 ns   (2.608 ns .. 11.12 ns)
variance introduced by outliers: 54% (severely inflated)

benchmarking [1..5 :: Int]/foldMap/ala MySumManual
time                 144.9 ns   (144.8 ns .. 145.0 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 145.1 ns   (144.9 ns .. 145.7 ns)
std dev              977.3 ps   (144.2 ps .. 2.052 ns)

benchmarking [1..5 :: Int]/foldMap/manual wrap & unwrap
time                 43.57 ns   (43.55 ns .. 43.60 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 43.58 ns   (43.56 ns .. 43.62 ns)
std dev              89.46 ps   (44.23 ps .. 136.6 ps)

benchmarking [1..5 :: Int]/foldMap/coerce
time                 43.63 ns   (43.60 ns .. 43.66 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 43.62 ns   (43.60 ns .. 43.65 ns)
std dev              84.61 ps   (59.96 ps .. 123.8 ps)

benchmarking [1..5 :: Int]/coerce . mconcat . coerce
time                 45.69 ns   (45.69 ns .. 45.70 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 45.70 ns   (45.69 ns .. 45.70 ns)
std dev              23.77 ps   (17.36 ps .. 32.68 ps)

benchmarking [1..5 :: Int]/Prelude.sum
time                 36.95 ns   (36.93 ns .. 36.97 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 36.95 ns   (36.93 ns .. 36.97 ns)
std dev              61.16 ps   (39.69 ps .. 99.54 ps)

jcristovao / newtype-generics

coercion of the user-supplied functions for speed? #3
