saketkc closed this issue 9 years ago
Hi.
Not a bug… check the bandwidth (undersmoothed) and see the FAQ… thanks!
— Jeff
On Oct 25, 2015, at 1:01 AM, Saket Choudhary notifications@github.com wrote:
It seems that Epanechnikov kernel with least squares cross-validation has a bug. See: http://stats.stackexchange.com/questions/176906/np-package-kernel-density-estimation-with-epanechnikov-kernel (GitHub issue: https://github.com/JeffreyRacine/R-Package-np/issues/9)
Thanks. Just for future reference this is stated in FAQ 2.31 at https://cran.r-project.org/web/packages/np/vignettes/np_faq.pdf
Yes, but you might also mention the question itself, as FAQ numbers can change. Thanks!
Professor J. S. Racine
Department of Economics, McMaster University
1280 Main St. W., Hamilton, Ontario, Canada. L8S 4M4
Phone: (905) 525 9140 x 23825
e-mail: racinej@mcmaster.ca
URL: www.economics.mcmaster.ca/racine
`The generation of random numbers is too important to be left to chance'
On Oct 25, 2015, at 08:44, Saket Choudhary notifications@github.com wrote:
Thanks. Just for future reference this is stated in FAQ 2.31 at https://cran.r-project.org/web/packages/np/vignettes/np_faq.pdf
I use plot() (npplot()) to plot, say, a density, and the resulting plot looks like an inverted density rather than a density.
This can occur when the data-driven bandwidth is dramatically undersmoothed. Data-driven (i.e., automatic) bandwidth selection procedures are not guaranteed to produce good results in every case, for example in the presence of outliers or when continuous data have been rounded or discretized. By default, npplot() takes the two extremes of the data (the minimum and maximum, which are actual data points), creates an equally spaced grid of evaluation points between them (which in general are not actual data points), and computes the density at these points. Since the bandwidth is extremely small, the density estimate at the interior evaluation points is correctly zero, while the estimates at the sample realizations (in this case only two, the min and max) are non-zero. Hence we get two peaks at the edges of the plot and a flat bowl equal to zero everywhere else. This can also happen when your data is heavily discretized and you treat it as continuous. In such cases, treating the data as ordered may result in more sensible estimates.
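The effect described in the FAQ answer can be reproduced without np. What follows is a minimal sketch in plain Python, not the np implementation and not a call to npplot(): the estimator is a hand-written Epanechnikov kernel density estimate, and the function names, the eight data values, the 11-point grid, and the two bandwidths are all made up for illustration. The grid runs from the sample minimum to the sample maximum, so only its two end points coincide with actual observations.

```python
def epanechnikov(u):
    """Epanechnikov kernel: 0.75 * (1 - u^2) on [-1, 1], zero elsewhere."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def epanechnikov_kde(x, sample, h):
    """Kernel density estimate at x with bandwidth h."""
    return sum(epanechnikov((x - xi) / h) for xi in sample) / (len(sample) * h)


def evaluation_grid(sample, num_points):
    """Equally spaced grid from the sample minimum to the sample maximum."""
    lo, hi = min(sample), max(sample)
    return [lo + (hi - lo) * i / (num_points - 1) for i in range(num_points)]


# Illustrative data (an assumption, not taken from the issue).
data = [0.0, 0.137, 0.285, 0.468, 0.55, 0.713, 0.86, 1.0]
grid = evaluation_grid(data, 11)

# A bandwidth far smaller than the spacing between observations
# (undersmoothed) versus a much larger one, for comparison.
undersmoothed = [epanechnikov_kde(x, data, 0.01) for x in grid]
smoother = [epanechnikov_kde(x, data, 0.3) for x in grid]

for x, tiny, wide in zip(grid, undersmoothed, smoother):
    print(f"x={x:.1f}  h=0.01: {tiny:8.4f}  h=0.30: {wide:8.4f}")
```

With the tiny bandwidth, no interior grid point lies within one bandwidth of any observation, so the estimate there is exactly zero, while the two end points sit on observations and get a positive value. Plotted, that is the "inverted" shape from the question. With the larger bandwidth every grid point has observations in range and the estimate is positive throughout.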
It seems that Epanechnikov kernel with least squares cross-validation has a bug. See: http://stats.stackexchange.com/questions/176906/np-package-kernel-density-estimation-with-epanechnikov-kernel