statistikat / VIM

Visualization and Imputation of Missing Values
http://statistikat.github.io/VIM/

donorcond parameter in hotdeck #56

Closed merangelik closed 3 years ago

merangelik commented 3 years ago

https://github.com/statistikat/VIM/blob/1c3023bf7094197378ecd1416ba97120ee9e3c8d/R/hotdeck.R#L72

Is it possible that the parameter donorcond is not actually implemented for hotdeck()? I can't find it in the code and it doesn't seem to be working.

GregorDeCillia commented 3 years ago

It seems support for donorcond was accidentally dropped in https://github.com/statistikat/VIM/commit/8cd71db30765f23966b2d8f1e8b7af389389e176.

GregorDeCillia commented 3 years ago

I am currently working on this but get an "out of range error" because of line 211 https://github.com/statistikat/VIM/blob/1c3023bf7094197378ecd1416ba97120ee9e3c8d/R/hotdeck.R#L208-L217

impPart denotes the index of missing value(s) and the hotdeck method looks for suitable replacements recursively by increasing add at every iteration. The issue occurs if add >= impPart and impPart + add > nrow(xx).

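library(VIM)         # provides the sleep data set
library(data.table)  # the with = FALSE indexing below is data.table syntax
data(sleep, package = "VIM")

impPart <- 6     # index of the row with the missing value
add <- 7         # current search offset
TFindex <- TRUE
impDon <- impPart
# assumption: xx is a data.table, as the with = FALSE call below implies
xx <- as.data.table(sleep[c(56, 59, 23, 48, 2, 36, 58, 52, 62, 31, 5, 1), ])
v <- "Span"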

(impDon[TFindex] <- impPart[TFindex] - add)
#> -1
(impDon[TFindex][impDon[TFindex] < 1] <- impPart[TFindex][impDon[TFindex] < 1] + add)
#> 13
(impDon2 <- impDon[TFindex])
#> 13
data.frame(xx[impDon2, v, with = FALSE])[, 1]
#> NULL

@alexkowa do you have any idea what might be wrong here? My suggestion would be a "rotational" logic. Something like this

if (impPart - add <= 0)
   impDon <- impPart - add + nrow(xx)

alexkowa commented 3 years ago

Yes, rotating through the data set absolutely makes sense. The error probably appears when a missing value is close to the end of the data set (or of a group).

GregorDeCillia commented 3 years ago

Thanks! In my case, the problem occurred since the only applicable donor had an index greater than impPart and without a proper rotation, this donor was never considered.
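
For readers following along, here is a minimal sketch of what such a rotation could look like, using modular arithmetic so that a candidate donor index always lands in 1..nrow(xx). The helper name rotate_index and the variables imp_part, n and donors are hypothetical and chosen for illustration only; this is not the code that ended up in hotdeck().

```r
# Hypothetical helper (not part of VIM): wrap any integer index into 1..n.
rotate_index <- function(idx, n) {
  ((idx - 1L) %% n) + 1L
}

imp_part <- 6L   # row holding the missing value, as in the example above
n <- 12L         # number of rows, i.e. nrow(xx) in the example above

# Candidate donor rows for offsets add = 1, ..., n - 1, wrapping around
# the start of the data set instead of running out of range.
donors <- rotate_index(imp_part - seq_len(n - 1L), n)
donors
```

With this scheme, add = 7 maps to row 11 (the same value as impPart - add + nrow(xx) in the suggestion above), and every row other than impPart is visited exactly once, so a donor located after the missing value is no longer skipped.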