bachmannpatrick / CLVTools

R-Package for estimating CLV
54 stars 14 forks source link

Walk creation looses AuxTrans #134

Closed pschil closed 8 months ago

pschil commented 4 years ago

If there are covariates only for the calibration period, AuxTrans are lost for customers who have an ordinary transaction after the last covariate date. Subsequently, the dyncov LL cannot be fit. The error seems to originate from foverlaps()/the interval join that apparently ignores any 2nd transaction after the end of the covariate. Not clear yet whether ordinary transactions (=walks) are lost as well.

Reprex:

clv.short <- clvdata(apparelTrans[Date <= "2005-12-31"], date.format = "ymd", time.unit = "w", estimation.split = NULL)
clv.short <- SetDynamicCovariates(clv.short, data.cov.life = apparelDynCov[Cov.Date <= "2005-12-31"], data.cov.trans = apparelDynCov[Cov.Date <= "2005-12-31"], names.cov.life = c("Marketing", "Gender", "Channel"), names.cov.trans = c("Marketing", "Gender", "Channel"), name.date = "Cov.Date")
l.walks <- CLVTools:::pnbd_dyncov_makewalks(clv.data = clv.short)
l.walks[[1]][[1]][AuxTrans == T, uniqueN(Id)] # is not 250

setdiff(clv.short@data.transactions[, unique(Id)], l.walks[[1]][[1]][AuxTrans == T, Id])
oblander commented 3 years ago

Source of the problem in line 44 in pnbd_dyncov_createwalks:

cov.dt[is.na(Next.Cov.Date.Minus.Eps), Next.Cov.Date.Minus.Eps := This.Cov.Start.Date]

The last period of covariates has its interval end set to the start time so transactions happening later than the start date are lost in the foverlap call. Simple fix is to set the interval end to be 1 period later (minus 1 second) consistent with all other periods of covariates, as follows:

cov.dt[is.na(Next.Cov.Date.Minus.Eps), Next.Cov.Date.Minus.Eps := This.Cov.Start.Date - 1L +
           switch(clv.time@name.time.unit, Hours = hours(1), Days = days(1), Weeks = weeks(1), Years = years(1))]

(sorry to submit as a comment rather than pull request, I'm unfamiliar with GitHub)

pschil commented 8 months ago

Thanks again for reporting!

This has been addressed in version 0.10 when we refactored the dyncov walk creation & implemented the LL in Rcpp. (a test with the above reprex was added)