aimalz / qp

Quantile Parametrization for probability distribution functions module
MIT License

Background text on baseline LSST DM plan #52

drphilmarshall closed 6 years ago

drphilmarshall commented 7 years ago

The LSST DM "Data Products Definition" document, LSE-163, has the following to say about photo-z storage:

Colors of the object in “standard seeing” (for example, the third quartile expected survey seeing in the i band, ∼ 0.9”) will be measured. These colors are guaranteed to be seeing-insensitive, suitable for estimation of photometric redshifts. (Footnote: The problem of optimal determination of photometric redshift is the subject of intense research. The approach we’re taking here is conservative, following contemporary practices. As new insights develop, we will revisit the issue.)

We currently plan to provide [full photo-z posterior PDF] information ... by providing parametric estimates of the likelihood function. As will be shown in Table 4, the current allocation is ... ~100 parameters for describing the photo-Z likelihood distributions, per object. The methods of storing likelihood functions (or samples thereof) will continue to be developed and optimized throughout Construction and Commissioning. The key limitation, on the amount of data needed to be stored, can be overcome by compression techniques. For example, simply noticing that not more than ∼ 0.5% accuracy is needed for sample values allows one to increase the number of samples by a factor of 4. ... Advanced techniques, such as PCA analysis of the likelihoods across the entire catalog, may allow us to store even more, providing a better estimate of the shape of the likelihood function. In that sense, what is presented in Table 4 should be thought of as a conservative estimate, which we plan to improve upon as development continues in Construction.

So, a good baseline assumption is that we have 100 parameters per object to play with. Using fewer parameters would reduce the storage costs somewhat, and presumably speed up the computations too (although that would need investigating). This background should appear in the introduction of our Note.
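To make the compression remark in the quoted passage concrete, here is a rough sketch of one way the "factor of 4" could arise. It assumes (this is my reading, not something LSE-163 states in the excerpt) that samples would otherwise be stored as 32-bit floats and could instead be stored as 8-bit integers, whose step size of 1/255 of the peak value keeps the rounding error under 0.5%. The Gaussian PDF, the redshift grid, and the helper names `quantize` and `dequantize` are all hypothetical, chosen only for illustration.

```python
import math


def quantize(values, n_bits=8):
    """Map values in [0, 1] to integer codes in [0, 2**n_bits - 1]."""
    levels = 2 ** n_bits - 1
    return [round(v * levels) for v in values]


def dequantize(codes, n_bits=8):
    """Map integer codes back to approximate values in [0, 1]."""
    levels = 2 ** n_bits - 1
    return [c / levels for c in codes]


# Hypothetical photo-z PDF: a Gaussian sampled at 100 redshifts,
# scaled so that its peak value is at most 1.
n_samples = 100
z_grid = [3.0 * i / (n_samples - 1) for i in range(n_samples)]
pdf = [math.exp(-0.5 * ((z - 1.0) / 0.2) ** 2) for z in z_grid]

codes = quantize(pdf)
restored = dequantize(codes)

# Worst-case rounding error, as a fraction of the peak value.
max_abs_error = max(abs(a - b) for a, b in zip(pdf, restored))

# Assumed storage cost: 4 bytes per float32 sample vs 1 byte per 8-bit code.
bytes_float32 = 4 * n_samples
bytes_uint8 = 1 * n_samples
compression_factor = bytes_float32 // bytes_uint8

print(f"max error (fraction of peak): {max_abs_error:.5f}")
print(f"samples per byte budget increase: x{compression_factor}")
```

Under these assumptions, the same byte budget that holds 100 float32 numbers would hold 400 8-bit samples. Whether that is the scheme DM has in mind would need checking against Table 4 of LSE-163.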