codedthinking / Kezdi.jl

Julia package for data manipulation and analysis
https://codedthinking.github.io/Kezdi.jl/

bug: handle missing values in regression formula #129

Open korenmiklos opened 4 days ago

gergelyattilakiss commented 2 days ago

Started working on it. I want to patch regress through the same function as summarize. Noticed we did not use Kezdi.regress previously.

korenmiklos commented 2 days ago

Great. Pls create a branch for it and push frequently. The trick with regress is that we cannot drop missing values variable by variable. Get all the variables out of the formula, and if any of them is !isvalue in a given row, drop that entire row. So:

DataFrame(x1 = [1, 2, missing, 4], x2 = [5, missing, 7, 8]) should become DataFrame(x1 = [1, 4], x2 = [5, 8]) and not DataFrame(x1 = [1, 2, 4], x2 = [5, 7, 8])

(Actually here is your first test case.)
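A minimal sketch of that row-wise rule, for illustration only. It assumes DataFrames.jl is available, uses `completecases` (which checks `missing`) as a stand-in for the `isvalue` test, and takes the variable names as an explicit list rather than extracting them from the formula. The helper name `drop_incomplete_rows` is hypothetical and not part of Kezdi.jl.

```julia
using DataFrames

# Hypothetical helper (not part of Kezdi.jl): keep only the rows in which
# every variable in `vars` is non-missing.
function drop_incomplete_rows(df::AbstractDataFrame, vars::Vector{Symbol})
    # `completecases` returns one Bool per row: true when none of `vars` is missing
    keep = completecases(df, vars)
    return df[keep, :]
end

df = DataFrame(x1 = [1, 2, missing, 4], x2 = [5, missing, 7, 8])

# Row-wise (listwise) deletion: rows 2 and 3 each have a missing value, so both go
listwise = drop_incomplete_rows(df, [:x1, :x2])

# For contrast: dropping variable by variable misaligns the columns
x1_only = collect(skipmissing(df.x1))
x2_only = collect(skipmissing(df.x2))
```

Here `listwise` keeps rows 1 and 4 only, while `x1_only` and `x2_only` each keep three values that no longer come from the same rows, which is the outcome to avoid. The real fix would still need to pull the variable list out of the regression formula.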