JeffreyRacine / R-Package-np

R package np (Nonparametric Kernel Smoothing Methods for Mixed Data Types)
https://socialsciences.mcmaster.ca/people/racinej

Error running npcdens() with bwtype="generalized_nn" #17

Closed: mjuraska closed this issue 6 years ago

mjuraska commented 6 years ago

Hello Jeff, I have an object named fbw of class conbandwidth produced by npcdensbw() with bwtype="generalized_nn". Here's the content of fbw:

Conditional density data (160 observations, 2 variable(s))
(1 dependent variable(s), and 1 explanatory variable(s))

                         S
Dep. Var. Bandwidth(s): 29
                        Sb
Exp. Var. Bandwidth(s): 13

Bandwidth Selection Method: Maximum Likelihood Cross-Validation
Formula: S ~ Sb
Bandwidth Type: Generalized Nearest Neighbour
Objective Function Value: 46.55313 (achieved on multistart 2)

Continuous Kernel Type: Second-Order Epanechnikov 
No. Continuous Dependent Vars.: 1 
No. Continuous Explanatory Vars.: 1

I'm getting the following error when running npcdens(fbw):

> npcdens(fbw)
Error in npcdens.conbandwidth(txdat = txdat, tydat = tydat, bws = bws,  : 

** Error: invalid bandwidth.

Would you please clarify if the bandwidth type generalized_nn is supported by npcdens() and if so, what the source of the error might be? Thank you very much!

JeffreyRacine commented 6 years ago

Greetings,

Using the most recent version, the following runs for me...

library(np)
data(cps71)
attach(cps71)
## Direct
foo <- npcdens(logwage~age,bwtype="generalized_nn")
## Two-step
bw <- npcdensbw(logwage~age,bwtype="generalized_nn")
foo <- npcdens(bw)

Perhaps email your data and code?

mjuraska commented 6 years ago

Thanks, your example runs for me, too. Here's my code that yields the error:

> library(np)
> load("dat.RData")
> npcdens(S ~ Sb, data=dat, bwtype="generalized_nn")

Error in npcdens.conbandwidth(txdat = txdat, tydat = tydat, bws = bws,  : 

** Error: invalid bandwidth.

The data frame dat is in the attached dat.zip. Oddly enough, when I rerun this code multiple times, on occasion it doesn't fail and instead prints out this summary:

Conditional Density Data: 160 training points, in 2 variable(s)
(1 dependent variable(s), and 1 explanatory variable(s))

                        S
Dep. Var. Bandwidth(s): 1
                        Sb
Exp. Var. Bandwidth(s):  1

Bandwidth Type: Generalized Nearest Neighbour
Log Likelihood: -230.1141

Continuous Kernel Type: Second-Order Gaussian
No. Continuous Explanatory Vars.: 1
No. Continuous Dependent Vars.: 1

dat.zip

JeffreyRacine commented 6 years ago

Greetings,

It's not crashing for me, but I see the problem: you are treating Sb and S as continuous, and they are not. You want to estimate a probability function, not a density function...

> unique(dat$S)
[1] 4 5 3 6 7 1 2
> unique(dat$Sb)
[1] 2 3 0 5 4 7 6 1
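
The `unique()` check above generalizes to a quick screen for numeric columns that take only a few distinct values. The following is a minimal base-R sketch and not part of np: the helper name `looks_discrete`, the cutoff of 10 distinct values, and the simulated stand-in for `dat` are all assumptions made for illustration.

```r
# Hypothetical helper (not part of np): flag numeric columns that take only a
# handful of distinct values and so are probably better treated as factors.
looks_discrete <- function(x, max_levels = 10) {
  is.numeric(x) && length(unique(x[!is.na(x)])) <= max_levels
}

# Simulated stand-in for the attached data (the real dat.RData is not used here).
set.seed(1)
sim <- data.frame(
  S  = sample(1:7, 160, replace = TRUE),  # integer scores stored as numeric
  Sb = sample(0:7, 160, replace = TRUE),  # integer scores stored as numeric
  z  = rnorm(160)                         # a genuinely continuous column
)

# One logical flag per column: TRUE means "few distinct numeric values".
flags <- vapply(sim, looks_discrete, logical(1))
print(flags)
```

In this simulated example S and Sb are flagged and z is not. Columns flagged this way are candidates for `factor()` or `ordered()` before bandwidth selection; the cutoff is a judgment call, not a rule.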

You need to treat them as factors (ordered factors probably make sense)... try

model <- npcdens(ordered(S) ~ ordered(Sb), data=dat)

to force it to use a discrete support kernel...

summary(model)

plot(model)
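
The same fix can also be written in the two-step form used earlier in the thread, converting the columns once and then passing the bandwidth object to `npcdens()`. This is a sketch under stated assumptions: the data are simulated (the real `dat` is not reproduced here), the name `sim` is made up, and the np calls are limited to ones that already appear in this thread.

```r
# Simulated stand-in for dat: two integer-valued scores with discrete support.
set.seed(42)
Sb <- sample(0:7, 160, replace = TRUE)
S  <- pmin(pmax(Sb + sample(-1:1, 160, replace = TRUE), 1), 7)

# Convert once, up front, so every later call sees ordered factors.
sim <- data.frame(S = ordered(S), Sb = ordered(Sb))
stopifnot(is.ordered(sim$S), is.ordered(sim$Sb))

# The estimation step needs the np package; skip it if np is not installed.
if (requireNamespace("np", quietly = TRUE)) {
  library(np)
  bw    <- npcdensbw(S ~ Sb, data = sim)  # step 1: bandwidth selection
  model <- npcdens(bw)                    # step 2: conditional estimate
  print(summary(model))
}
```

Storing the ordered factors in the data frame, rather than wrapping the variables inside the formula, keeps the variable names in the summary as S and Sb and avoids repeating the conversion in each call.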

mjuraska commented 6 years ago

Thank you so much! I appreciate your time and input.