JeffreyRacine / R-Package-np

R package np (Nonparametric Kernel Smoothing Methods for Mixed Data Types)
https://socialsciences.mcmaster.ca/people/racinej
47 stars 18 forks source link

Bvcov in npindex.sibandwidth #44

Open kaisun84 opened 1 year ago

kaisun84 commented 1 year ago

I am recently studying the single-index semiparametric smooth-varying coefficient model, and so looked into the "np.singleindex.R" source code (from: https://cran.r-project.org/src/contrib/np_0.60-17.tar.gz ) in the "np" package in order to get some inspiration of programming the (heteroskedasticity-robust) standard errors of the parameters of the linear single index in the smooth coefficients of the above model. However, this triggered the following questions about calculating the variance-covariance matrix of \beta, i.e., “Bvcov”, in the “npindex.sibandwidth” function.

If you could find in the source code file (np.singleindex.R),

Line 1: dg.db.xmex <- index.tgrad[,1]*xmex

where "index.tgrad[,1]" is a n times 1 column vector, and "xmex" is a k times n matrix with k > 1, i.e., when is.vector(xmex) returns FALSE. k is the number of parameters whose standard errors are to be estimated, and n is sample size.

It appears that Line 1 is not the same as

Line 2: dg.db.xmex <- t(index.tgrad[,1]*t(xmex))

Note the quantity "dg.db.xmex" generated by these two lines has the same dimension, but different elements. R is able to compute Line 1 (without an error message or interruption) because (kn)/n = k, i.e., longer object length is exactly a multiple of shorter object length. However, Line 1 seems problematic because R performs element-by-element multiplication ("*") of a vector and a matrix by column (vertically) of the matrix with two or more rows, rather than by row (horizontally) of the matrix --- e.g., (1:3)*rbind(1:3,4:6), or equivalently, rbind(1:3,4:6)\(1:3), would return rbind(c(1,6,6),c(8,5,18)), instead of the desired rbind(c(1,4,9),c(4,10,18)). Since the point is to multiply each column of (X-E(X|X\beta)), which is the transpose of "xmex", by the gradient vector "index.tgrad[,1]", and so one should use Line 2, i.e., transpose "xmex" so that it becomes n times k first, before multiplying it by the n times 1 gradient vector, to get the variance-covariance matrix as on page 94 in Ichimura (1993)?

Similarly, in the same source code file again

Line 3: Sigma <- (uhat*dg.db.xmex)%%t(uhat\dg.db.xmex)

where “uhat” is n times 1 column vector, and “dg.db.xmex” (like “xmex”) is a k times n matrix. If k = 1, then that's fine. But for k > 1, shouldn't we transpose “dg.db.xmex” first so it becomes n times k before using the element-wise multiplication as in, for example,

Line 4: Sigma <- (t(uhat*t(dg.db.xmex))) %% t(t(uhat\t(dg.db.xmex)))

which is equivalent to

Line 5: Sigma <- (t(uhat*t(dg.db.xmex))) %% (uhat\t(dg.db.xmex))

Or do I miss something? Thanks for your time.

PS: In contrast, scroll down to the following line in the same file

dg.db <- txdat[,-1,drop=FALSE]*index.tgrad[,1]

where it is essentially an n times k matrix element-wise-multiply an n times 1 vector, which is okay.