elm-community / list-extra

Convenience functions for working with List.
http://package.elm-lang.org/packages/elm-community/list-extra/latest
MIT License
135 stars 58 forks source link

Proposal for `windows` function(s) #139

Open HarrisonMc555 opened 4 years ago

HarrisonMc555 commented 4 years ago

A task I occasionally reach for is the ability to iterate through a list in a sliding window to get successive pairs of neighboring elements.

Would we consider adding a windows function?

windows : List a -> List ( a, a )
windows l =
    case l of
        x :: y :: rest ->
            ( x, y ) :: windows2 (y :: rest)

        _ ->
            []

Since it may be desirable to have windows of larger sizes, we could name this function windows2 and add functions windows3, windows4, etc.

This is similar to Ruby's each_slice function or Rust's windows, but returning tuples instead of lists.

P.S. Feel free to include feedback on code style, I'm still fairly new to Elm.

prikhi commented 4 years ago

Tuples are limited to 2 or 3 elements so at most we'd have windows2 and windows3. Here's another implementation but I haven't evaluated either from a performance perspective:

$ elm repl
> import List.Extra as List
> windows l = List.tail l |> Maybe.map (List.zip l) |> Maybe.withDefault []
<function> : List a -> List ( a, a )
> windows (List.range 1 10)
[(1,2),(2,3),(3,4),(4,5),(5,6),(6,7),(7,8),(8,9),(9,10)]
> windows (List.range 1 11)
[(1,2),(2,3),(3,4),(4,5),(5,6),(6,7),(7,8),(8,9),(9,10),(10,11)]
> windows [1]
[]

Of course, I will defer to @Chadtech if it's worth including in this repo.

pzp1997 commented 4 years ago

I was also playing around with alternate implementations a few days ago but forgot to comment about it. I haven't done any benchmarking but my intuition is that this implementation will be pretty fast (and is stack safe).

windows2 : List a -> List ( a, a )
windows2 list =
    case list of
        _ :: xs ->
            List.map2 Tuple.pair list xs

        _ ->
            []

Another implementation I liked, mainly for its sleekness, is

windows2 : List a -> List ( a, a )
windows2 list =
    List.map2 Tuple.pair
        list
        (List.drop 1 list)

Both of these implementations can be extended to implement windows3.

windows3 : List a -> List ( a, a, a )
windows3 list =
    case list of
        _ :: ((_ :: ys) as xs) ->
            List.map3 (\x y z -> ( x, y, z )) list xs ys

        _ ->
            []
windows3 : List a -> List ( a, a, a )
windows3 list =
    List.map3 (\x y z -> ( x, y, z ))
        list
        (List.drop 1 list)
        (List.drop 2 list)

As @prikhi mentioned, we cannot go beyond that since tuples can only contain 2 or 3 elements. There is a trick we could use to push past that limitation though. We can generalize Tuple.pair and (\x y z -> ( x, y, z )) to arbitrary functions of type a -> ... -> a -> b (where the actual number of a's corresponds to the N in windowsN) and make the return type a List b. For instance, windows4 might look like

windows4 : (a -> a -> a -> a -> b) -> List a -> List b
windows4 f list =
    case list of
        _ :: ((_ :: ((_ :: zs) as ys)) as xs) ->
            List.map4 f list xs ys zs

        _ ->
            []

or

windows4 : (a -> a -> a -> a -> b) -> List a -> List b
windows4 f list =
    List.map4 f
        list
        (List.drop 1 list)
        (List.drop 2 list)
        (List.drop 3 list)

Should we introduce windows4 and beyond, it would probably make sense to generalize windows2 and windows3 as well for consistency. That being said, I'm wondering if there's a common use case that would necessitate anything more than windows3.

Chadtech commented 4 years ago

Seems like a good function. I know I have used this function before. I think it would be a good idea to include it.

HarrisonMc555 commented 4 years ago

Thanks for the discussion, everyone!

Replies to Questions

To answer @Chadtech's questions:

What uses cases do you all have for this function? ...

My use case was for interpolating between adjacent elements. This way, I was turning a list of n elements into a list of n + (n-1)*m elements, where m was the number of elements between adjacent pairs. I was interpolating between the adjacent elements in the original list to come up with values for the new list.

To be more specific, I wanted to create a discrete color gradient with tiles of colors. I had specific colors in a grid but wanted to add several rows and columns between the original colors. I did some simple math to interpolate a gradient between the colors in the original list to achieve a "blend" from one color to the next. In order to do that, I needed a windows function to get adjacent pairs of colors.

I am guessing that windows of larger sizes are less common and their usefulness is less plausible...

I would agree. I think the most common use case is adjacent elements.

Is the name "window" like an analogy of a window being a small section of a bigger view...?

My understanding was that it was a "sliding window" across the original list. So yes, it was a different "view" into the original list. The term "sliding window algorithm" is an identical concept, although apparently generic over the size of the window. The term "sliding window" in the "sliding window protcol" is a somewhat similar use of the word "window", although admittedly not exactly identical.

Additional Discussions

One thing I'm considering more after coming back to this issue is perhaps to have two different functions: a pairs and a windows. The pairs function would return tuples with adjacent elements, while windows would return a List (List a) that would be arbitrary-length "window" views into the original list. Here are possible implementations.

pairs : List a -> List ( a, a )
pairs list =
    case list of
        _ :: xs ->
            List.map2 Tuple.pair list xs

        _ ->
            []

windows : Int -> Lis a -> List (List a)
windows length list
    if List .length list < length
    then
        []

    else
        List.take length list :: Maybe.withDefault [] (Maybe.map (windows length) (List.tail list))

My style isn't very good for this version of the windows function but I hope that illustrates what function would do.

Any thoughts on having both? I think that the pairs function would be more common, and I think that the windows function would cover most cases for a windows3, windows4, etc.

gampleman commented 2 years ago

Another great use case are rolling averages.

Although I think a more flexible signature would be:

windows : { leading : Int, trailing : Int } -> (List a -> b -> b) -> b -> List a -> b

Since a lot of the cases include potentially heavy data lifting tasks, I think not materialising the list and using it as a fold makes a lot of sense. For instance, if want a 7 day rolling average on an hourly dataset of 1000 points, the intermediate list would have 168000 elements in it.

Having two ints (leading and trailing) is useful, since what you want to window around depends on the use case, and sometimes you want to center on the datapoint, other times you want to only use past data.