Closed rogeriochaves closed 6 years ago
Hey @rogeriochaves
This looks pretty good, but I am bench marking it as being ~x8 slower: https://ellie-app.com/kqXGQNgMBa1/1
Im not sure if its worth the trade off, but also, to be honest I am not sure how to weigh the performance hit with the stack safety.
Here's an alternative, based on @rogeriochaves' implementation, that seems to outperform the current List.Extra
implementation and is stack-safe.
https://ellie-app.com/68qqkSvkta1/3
The benchmarks test lists of size 10, 100, 1000 and 2000 where each sublist has between 0 and 200 integers. Flipping the comments on the two mains will run a very basic fuzz test.
At @eeue56's suggestion; an attempt to explain why there are 3 functions involved, how they make this fast and how they make this stack-safe.
From a birds eye perspective; the implementation works somewhat like this:
By using an accumulator rather than building up the result directly, as was done in the original code, we can put the recursive call in the tail position; meaning that the final expression to be evaluated is a call to the same function. When the Elm compiler sees a case where recursion happens in the tail call, it can compile that into a while
loop. The resulting code doesn't actually recurse, which means the call-stack depth doesn't increase and the implementation is stack-safe.
reverse
?Appending to the end of a List
has 2 drawbacks:
By consing our accumulated results to the front of the accumulator, we only need 1 traversal: reversing the result at the end. Every iteration of the accumulator is a structurally shared list made up of a new head and the existing accumulator as the tail.
This saves both time (less traversals) and memory (structural sharing); which in turn saves more time (less garbage collection interference).
headsTails
function?The original code used List.filterMap head list
and List.filterMap tail list
. This is perfectly fine and perhaps even preferred from a code-reuse perspective, it also means doing 2 traversals where one would suffice. Furthermore, both head
and tail
return Nothing
under the same conditions - we can use a single pattern x::xs
to know it has both a head and a tail. In other words, this saves traversals (one rather than two), pattern-matching everything twice, and filtering the result of that call.
Has there been discussion about the proper way to handle non-rectalinear matrices? I dislike the solution of collapsing missing values so that the new list position doesn't reflect the position of the list an element came from.
[ [ 1, 2, 3 ]
, [4, 5]
, [6, 7, 8]
]
-- transposes to
[ [ 1, 4, 6 ]
, [ 2, 5, 7 ]
, [ 3, 8 ]
]
-- transposes back to
[ [ 1, 2, 3 ]
, [ 4, 5, 8 ] -- the 8 has moved up a row
, [ 6, 7 ]
]
We could wrap the whole thing in a Maybe
which fails on different-length lists.
We could wrap individual elements in Maybe
, which would be a bit awkward to deal with.
We could truncate the matrix to the shortest list size, which seems a bit surprising, but less surprising than the current functionality and it gives us the nicer type that wouldn't need to change.
I believe I have a tail recursive solution that has between 3x (for 10 elements) to 8x (for all other sizes) better performance than @zwilias. In regards to @gilbertkennen's comment, my implementation goes with the third approach, truncating to the shortest row length, but could be easily adapted to wrap the entire result in a Maybe
as per approach 1.
Here's the code:
transposePalmer : List (List a) -> List (List a)
transposePalmer listOfLists =
List.foldr (List.map2 (::)) (List.repeat (rowsLength listOfLists) []) listOfLists
rowsLength : List (List a) -> Int
rowsLength listOfLists = case listOfLists of
[] ->
0
x :: _ ->
List.length x
and the benchmark: https://ellie-app.com/pw7k2sR8ma1/1. Whipped this up rather quickly, so let me know if I'm missing something.
EDIT
I didn't really look at the benchmarking tests before. @gilbertkennen is correct that when comparing the performance only on rectilinear lists, the performance is about 1.5x better.
Also, I said above that the solution is tail-recursive. That's not the proper terminology though because List.foldr
and List.map2
are not even implemented recursively in Elm (they're both Native/Kernel functions). What I really meant is that the implementation is stack safe.
I thought about the possible methods of handling jagged lists and I think that wrapping the entire result in a Maybe
makes the most sense.
While this is an improvement, the test is a bit unfair comparing truncating vs. not when using very uneven row lengths. Benchmarks on equal-length rows gives us ~1.5x which is still very good.
Concerning List.foldr
being a Native/Kernel function in Elm: That is the case for the currently released version, but will not be much longer (assuming Elm 0.19 is released). See https://github.com/elm-lang/core/blob/master/src/List.elm.
So that part of the benchmarking here is really against a moving target. Who knows whether with List.foldr
as implemented in the next release of the core
library, the 1.5x improvement will stand, or whether for the specific use of List.foldr
above, the non-native implementation makes things worse.
For what it's worth; the implementation currently in the dev
tree of elm-lang/core
is pretty fast, with a bit of a drop after 2k elements (due to falling back to foldl
+ reverse
). Once there is an 0.19 alpha, though, I'm hopeful that we can start checking how the codegen and resulting size turns out for the foldr
and map
implementations here: https://github.com/zwilias/elm-faster-map
Your point still stands, of course, but figured I'd add to it.
Even with a naive implementation of foldr, it's still ~1,4x faster than Ilias's and ~4x faster than list-extra (both using core code for any folds they use).
Why "Even"? I assume that naive implementation is not stack safe? A stack safe version might be slower. So it's not like a non-naive version is definitely faster than the naive version, so that if "even" the naive version of foldr makes another function fast we could simply assume that any non-naive version of foldr would make the other function still faster.
(I don't remember what is the case for the foldr expected to be shipped with Elm 0.19: whether it is "just" stack safe, but possibly slower than a naive, non stack safe version, or is both stack safe and faster than the naive implementation.)
This naive foldr should be stack safe. It reverses the list and runs a TCO foldl.
Ah, that's not what I would call "naive". :smile:
I have been informed that I clinked to the wrong code previously that didn't include the foldr implementation.
https://ellie-app.com/bQBtPcMvwa1/0
Perhaps that will make things clearer.
@pzp1997 your method fails on this test:
test "short rows are skipped" <|
\() ->
Expect.equal
(transposePalmer [ [ 10, 11 ], [ 20 ], [], [ 30, 31, 32 ] ])
[ [ 10, 20, 30 ], [ 11, 31 ], [ 32 ] ]
Instead it returns [], in other words, truncating by the smaller one (see https://ellie-app.com/ntkpKfbFCa1/0)
This would be @gilbertkennen 's third suggestion right, and I'm fine with that (personally, I never needed to transpose an uneven matrix)
Is this the behavior we want from now on?
Last I recall, there was no consensus on how the function should behave on non-rectangular inputs. I opened #86 so that we can continue discussing that matter.
With 10000 elements the transpose function start giving
Maximum call stack size exceeded
https://ellie-app.com/8qhGRQbVQa1/0