rsquaredacademy / rfm

Tools for Customer Segmentation using RFM Analysis
https://rfm.rsquaredacademy.com/

Add date_most_recent to rfm_table_customer_*() output to allow its use in rfm_segment() #53

Closed leungi closed 5 years ago

leungi commented 5 years ago

As per subject, reprex below.

Thanks for the great work! I'll hack around your functions to achieve this for now :)

I believe adding date_most_recent to the internal result variable of rfm_table_customer_*() should work.

library(rfm)
#> Warning: package 'rfm' was built under R version 3.5.3

analysis_date <- lubridate::as_date('2008-01-01', tz = 'UTC')

# fail
rfm_result <- rfm_table_customer_2(rfm_data_customer, customer_id, number_of_orders,
                                   most_recent_visit, revenue, analysis_date)
rfm_segment(rfm_result)
#> Error in .f(.x[[i]], ...): object 'date_most_recent' not found

# works
rfm_result <- rfm_table_order(rfm_data_orders, customer_id, order_date,
                              revenue, analysis_date)
rfm_segment(rfm_result)
#> # A tibble: 995 x 10
#>    customer_id segment rfm_score transaction_cou~ recency_days amount
#>    <chr>       <chr>       <dbl>            <dbl>        <dbl>  <dbl>
#>  1 Abbey O'Re~ Others        343                6          571    472
#>  2 Add Senger  Others        412                3          506    340
#>  3 Aden Lesch~ Others        323                4          560    405
#>  4 Admiral Se~ Others        433                5          498    448
#>  5 Agness O'K~ Others        555                9          456    843
#>  6 Aileen Bar~ Others        555                9          450    763
#>  7 Ailene Her~ Others        355                8          647    699
#>  8 Aiyanna Br~ Others        321                4          612    157
#>  9 Ala Schmid~ Others        212                3          715    363
#> 10 Alannah Bo~ Others        121                4          985    196
#> # ... with 985 more rows, and 4 more variables: date_most_recent <date>,
#> #   recency_score <int>, frequency_score <int>, monetary_score <int>

Created on 2019-04-11 by the reprex package (v0.2.1)
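
For anyone hitting the same error, a small guard can make the missing column explicit before rfm_segment() is called. This is only a sketch: has_recent_date() is a hypothetical helper, not part of rfm, and it assumes the table object is a list whose rfm element holds the scored data. The two mock objects below stand in for real results and are not actual package output.

```r
# Hypothetical helper (not part of rfm): TRUE when the scored table
# exposes the date_most_recent column named in the error above.
has_recent_date <- function(rfm_obj) {
  "date_most_recent" %in% names(rfm_obj$rfm)
}

# Mock objects standing in for rfm_table_*() results (assumed shape).
customer_like <- list(rfm = data.frame(customer_id = 1, recency_days = 31))
order_like <- list(rfm = data.frame(
  customer_id = 1,
  recency_days = 31,
  date_most_recent = as.Date("2007-12-01")
))

has_recent_date(customer_like)
has_recent_date(order_like)
```

With a real result, the same check could decide whether to call rfm_segment() directly or to patch the table first.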

leungi commented 5 years ago

To be exact, in the rfm-table-customer-2.R script, lines 59-60:

  result <-
    data %>%
    dplyr::mutate(
      recency_days = (analysis_date - !! recent_visit) / lubridate::ddays()
    ) %>%
    dplyr::select(!! cust_id, recency_days, !! order_count, !! revenues, !! recent_visit) %>%
    magrittr::set_names(c("customer_id", "recency_days", "transaction_count", "amount", "date_most_recent"))
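
The same idea can be tried outside the package on a made-up data frame. The sketch below is not the package's code: it swaps the tidy-eval arguments for plain column names, uses base difftime() instead of lubridate, and the toy values are invented for illustration.

```r
library(dplyr)

# Made-up rows standing in for rfm_data_customer (values are illustrative).
toy <- data.frame(
  customer_id = c(1, 2),
  number_of_orders = c(3, 5),
  most_recent_visit = as.Date(c("2007-12-01", "2007-12-22")),
  revenue = c(100, 250)
)
analysis_date <- as.Date("2008-01-01")

result <- toy %>%
  mutate(
    # Whole days between the analysis date and the last visit.
    recency_days = as.numeric(
      difftime(analysis_date, most_recent_visit, units = "days")
    )
  ) %>%
  # Keep the visit date and expose it under the name rfm_segment() looked for.
  select(
    customer_id,
    recency_days,
    transaction_count = number_of_orders,
    amount = revenue,
    date_most_recent = most_recent_visit
  )

names(result)
```

Renaming inside select() plays the role of magrittr::set_names() here, so the column order and names match the five names in the snippet above.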

aravindhebbali commented 5 years ago

Hi @leungi, I am unable to reproduce the error using the examples shown in the documentation here and here. Not sure if I am missing something.

library(rfm)

analysis_date <- lubridate::as_date('2007-01-01', tz = 'UTC')

# access rfm table
result <-
  rfm_table_customer_2(
    rfm_data_customer,
    customer_id,
    number_of_orders,
    most_recent_visit,
    revenue,
    analysis_date
  )

result$rfm
#> # A tibble: 39,999 x 8
#>    customer_id recency_days transaction_cou~ amount recency_score
#>          <dbl>        <dbl>            <dbl>  <dbl>         <int>
#>  1       22086          232                9    777             2
#>  2        2290          115               16   1555             4
#>  3       26377           43                5    336             5
#>  4       24650           64               12   1189             5
#>  5       12883           23               12   1229             5
#>  6        2119           72               11    929             5
#>  7       31283          112               17   1569             4
#>  8       33815          142               11    778             3
#>  9       15972           43                9    641             5
#> 10       27650          131               10    970             3
#> # ... with 39,989 more rows, and 3 more variables: frequency_score <int>,
#> #   monetary_score <int>, rfm_score <dbl>

segment_names <-
  c(
    "Champions",
    "Loyal Customers",
    "Potential Loyalist",
    "New Customers",
    "Promising",
    "Need Attention",
    "About To Sleep",
    "At Risk",
    "Can't Lose Them",
    "Lost"
  )

recency_lower <- c(4, 2, 3, 4, 3, 2, 2, 1, 1, 1)
recency_upper <- c(5, 5, 5, 5, 4, 3, 3, 2, 1, 2)
frequency_lower <- c(4, 3, 1, 1, 1, 2, 1, 2, 4, 1)
frequency_upper <- c(5, 5, 3, 1, 1, 3, 2, 5, 5, 2)
monetary_lower <- c(4, 3, 1, 1, 1, 2, 1, 2, 4, 1)
monetary_upper <- c(5, 5, 3, 1, 1, 3, 2, 5, 5, 2)

rfm_segment(
  result,
  segment_names,
  recency_lower,
  recency_upper,
  frequency_lower,
  frequency_upper,
  monetary_lower,
  monetary_upper
)
#> # A tibble: 39,999 x 9
#>    customer_id segment rfm_score transaction_cou~ recency_days amount
#>          <dbl> <chr>       <dbl>            <dbl>        <dbl>  <dbl>
#>  1       22086 Lost          222                9          232    777
#>  2        2290 Loyal ~       455               16          115   1555
#>  3       26377 New Cu~       511                5           43    336
#>  4       24650 Loyal ~       544               12           64   1189
#>  5       12883 Loyal ~       545               12           23   1229
#>  6        2119 Loyal ~       543               11           72    929
#>  7       31283 Loyal ~       455               17          112   1569
#>  8       33815 Others        342               11          142    778
#>  9       15972 Potent~       522                9           43    641
#> 10       27650 Need A~       333               10          131    970
#> # ... with 39,989 more rows, and 3 more variables: recency_score <int>,
#> #   frequency_score <int>, monetary_score <int>

Created on 2019-05-03 by the reprex package (v0.2.1)

leungi commented 5 years ago

Thanks for the prompt response @aravindhebbali; apologies for my delay.

I pulled the dev version of rfm and it worked as you had demonstrated. I was previously using version 0.2.0 from CRAN, and 0.2.0.9000 did the trick.

Appreciate your work! Look forward to more functionalities 👍
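
In case it helps others, here is a rough way to check which side of that version boundary an installation is on. It is a sketch only: has_fix() is a hypothetical helper, the threshold is simply the version that worked for me, and the install line is left commented out because it needs the remotes package and network access.

```r
# Version that worked in this thread (taken from the comment above).
fixed_in <- package_version("0.2.0.9000")

# Hypothetical helper: TRUE when a version string is at least fixed_in.
has_fix <- function(installed) {
  package_version(installed) >= fixed_in
}

has_fix("0.2.0")
has_fix("0.2.0.9000")

# Check the locally installed copy, if there is one.
if (requireNamespace("rfm", quietly = TRUE)) {
  if (!has_fix(as.character(utils::packageVersion("rfm")))) {
    message("Installed rfm predates 0.2.0.9000; consider the dev version.")
    # remotes::install_github("rsquaredacademy/rfm")
  }
}
```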