Closed by chengsoonong 7 years ago
The longer-term goal of this issue is to see whether test set examples located in low-density regions are also predicted with high uncertainty by the Gaussian process regressor.
If this is the case, then we could potentially use the KDE in conjunction with the SGDRegressor to simulate a predictor with uncertainty.
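A minimal sketch of how that comparison could be run, assuming scikit-learn's `KernelDensity` and `GaussianProcessRegressor`. The 1D toy data, the fixed kernel, and the bandwidth are illustrative assumptions, not the project's data or settings. It checks whether the GP's predictive standard deviation goes up where the KDE log-density goes down:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel
from sklearn.neighbors import KernelDensity

rng = np.random.default_rng(0)

# Toy data (assumption): training points cluster around 0,
# test points cover a wider range that includes sparse regions.
X_train = rng.normal(loc=0.0, scale=1.0, size=(80, 1))
y_train = np.sin(X_train[:, 0]) + 0.1 * rng.normal(size=80)
X_test = np.linspace(-6.0, 6.0, 50).reshape(-1, 1)

# GP with a fixed kernel (optimizer=None keeps the hyperparameters as given).
kernel = RBF(length_scale=1.0) + WhiteKernel(noise_level=0.1)
gp = GaussianProcessRegressor(kernel=kernel, optimizer=None).fit(X_train, y_train)
_, gp_std = gp.predict(X_test, return_std=True)

# KDE on the training inputs; score_samples returns the log-density.
kde = KernelDensity(kernel="gaussian", bandwidth=0.5).fit(X_train)
log_density = kde.score_samples(X_test)

# A negative correlation means low-density test points get high GP uncertainty.
corr = np.corrcoef(log_density, gp_std)[0, 1]
print(f"correlation(log-density, GP std) = {corr:.3f}")
```

If the correlation is clearly negative on the real features, the KDE density would be a plausible stand-in for uncertainty alongside a point predictor.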
See issue #77
To sanity check your KDE, plot the density as a function of RA and DEC, i.e. as a 2-dimensional surface. Because you have so many points, you will need to use a 2D histogram. Compare the results to Alasdair's (which uses a hex_map) in: https://github.com/chengsoonong/mclass-sky/blob/master/projects/alasdair/notebooks/02_exploratory_analysis.ipynb
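One way the 2D histogram could be built is sketched below with `numpy.histogram2d`. The uniformly random `ra` and `dec` arrays are placeholders (an assumption) for the catalogue columns, and the bin counts are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder coordinates in degrees (assumption); replace with the real RA/DEC columns.
ra = rng.uniform(0.0, 360.0, size=100_000)
dec = rng.uniform(-90.0, 90.0, size=100_000)

# Count points in a regular RA x DEC grid (5-degree bins here).
counts, ra_edges, dec_edges = np.histogram2d(
    ra, dec, bins=[72, 36], range=[[0.0, 360.0], [-90.0, 90.0]]
)

# Convert counts to an empirical density that integrates to 1 over the grid.
bin_area = np.diff(ra_edges)[:, None] * np.diff(dec_edges)[None, :]
density = counts / (counts.sum() * bin_area)

# `density` has shape (72, 36), with RA along axis 0 and DEC along axis 1.
# For plotting, e.g. matplotlib's pcolormesh(ra_edges, dec_edges, density.T) could be used.
```

The same grid can then be filled with the KDE's predicted density at the bin centres, so the two surfaces can be compared bin by bin.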
Plot each pair of bands against the predicted density, and compare against the actual density.
Resolved in eb2537f
Use Kernel Density Estimation (KDE) on the training set to estimate the location of the data in feature space. Predict the value of the density for each example in the test set.
http://scikit-learn.org/stable/modules/density.html
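A minimal sketch of this step using scikit-learn's `KernelDensity`. The synthetic two-feature data, the Gaussian kernel, and the bandwidth of 0.5 are assumptions for illustration; in practice the bandwidth would need tuning (for example by cross-validation):

```python
import numpy as np
from sklearn.neighbors import KernelDensity

rng = np.random.default_rng(0)

# Synthetic stand-in for the training features (assumption): 500 points in 2D.
X_train = rng.normal(size=(500, 2))

# Two test points: one in the bulk of the training data, one far away from it.
X_test = np.array([[0.0, 0.0], [6.0, 6.0]])

# Fit the KDE on the training set only.
kde = KernelDensity(kernel="gaussian", bandwidth=0.5).fit(X_train)

# score_samples returns the log-density; exponentiate to get the density itself.
log_density = kde.score_samples(X_test)
density = np.exp(log_density)

print(density)  # the first value should be much larger than the second
```

Working with `log_density` directly is usually safer than `density`, since densities in sparse regions can underflow to zero.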