LSSTDESC / qp

Quantile Parametrization for probability distribution functions module
MIT License

Comprehensive normalization utilities #147

Open aimalz opened 1 year ago

aimalz commented 1 year ago

normalize_interp1d only covers the overall integral. We should also have a generic normalization function that handles negative values (two possible ways?). This should be accompanied by a quant_gen-specific normalization method that checks that the integral between quantile locs satisfies the provided CDF values. Additionally, when converting to a parameterization (e.g. interp or hist) that has a restricted range compared to the original, we need to renormalize to ensure the modified PDFs still normalize to 1.

eacharles commented 1 year ago

I'm not understanding what you want.

aimalz commented 1 year ago

Sorry, that came out a bit scattered. Let me try that again:

Parameterizations that define a generic 1D function (e.g. spline, interp1d, etc.), rather than a function that inherently adheres to the definition of a PDF, need an automatic check for, and correction of, violations of nonnegativity in addition to integrability to unity. I labeled this issue as a bug because negative values, or failure to integrate to unity over the provided range, mean that the ensembles are not actually filled with PDFs, with downstream effects including but not limited to nonsensical values for the metrics. This check should be performed both when instantiating Ensembles under those parameterizations from user-provided values and when converting Ensembles to those parameterizations (from any starting parameterization, including the affected ones under different metadata values). I mentioned the limited existing functionality for integrability as a potential starting point in case it's deemed more advantageous to generalize that than to build something new.
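
To make the request above concrete, here is a rough sketch of such a check-and-rectify step for values on a grid. It is not qp's API: the names `trapezoid_integral` and `normalize_gridded` and the `negatives` option are hypothetical, and the two ways of handling negative values shown here (clipping to zero, or shifting each row up by its most negative value) are only assumptions about what "two possible ways" might mean.

```python
import numpy as np


def trapezoid_integral(x, y):
    """Trapezoid-rule integral of y over x along the last axis."""
    return np.sum(0.5 * (y[..., 1:] + y[..., :-1]) * np.diff(x), axis=-1)


def normalize_gridded(x, y, negatives="clip"):
    """Hypothetical helper, not part of qp.

    Enforces nonnegativity, then rescales each row to integrate to 1
    over the provided grid.

    x: 1D grid, shape (n,); y: values on that grid, shape (..., n).
    negatives: "clip" sets negative values to zero;
               "shift" subtracts a row's minimum when that minimum is negative.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if negatives == "clip":
        y = np.clip(y, 0.0, None)
    elif negatives == "shift":
        y = y - np.minimum(y.min(axis=-1, keepdims=True), 0.0)
    else:
        raise ValueError(f"unknown negatives option: {negatives!r}")
    norm = trapezoid_integral(x, y)
    if np.any(norm <= 0):
        raise ValueError("cannot normalize: integral is not positive")
    return y / np.expand_dims(norm, -1)
```

Because the integral is taken over whatever grid is passed in, the same call would also renormalize values that were evaluated on a range narrower than the original support.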

The quantile-specific note addresses a special case of this check that concerns the options for how to evaluate the .pdf() method. Those options may violate the definition of a PDF by either criterion above. They may also violate the definition of the quantile parameterization instance in question, because the integral between adjacent data values does not necessarily match the corresponding metadata values. However, it would be more appropriate for this to be a sub-item on #148 rather than lumped into this issue.
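
For illustration only, the quantile-specific consistency check could look something like the sketch below. It assumes the data values are quantile locations and the metadata values are the CDF values at those locations, so that the integral of the PDF between adjacent locations should equal the difference of the adjacent CDF values. The function name `check_quantile_consistency` and its parameters are hypothetical, not part of qp.

```python
import numpy as np


def check_quantile_consistency(pdf, quants, locs, n_sub=201, atol=1e-3):
    """Hypothetical check, not part of qp.

    pdf: callable returning PDF values at an array of points.
    quants: increasing CDF values, shape (n,).
    locs: locations at which the CDF should equal quants, shape (n,).
    Returns (ok, integrals), where integrals[i] is the trapezoid-rule
    integral of pdf over [locs[i], locs[i + 1]].
    """
    quants = np.asarray(quants, dtype=float)
    locs = np.asarray(locs, dtype=float)
    integrals = np.empty(len(locs) - 1)
    for i in range(len(locs) - 1):
        grid = np.linspace(locs[i], locs[i + 1], n_sub)
        vals = pdf(grid)
        integrals[i] = np.sum(0.5 * (vals[1:] + vals[:-1]) * np.diff(grid))
    # Each integral should match the probability mass between adjacent quantiles.
    ok = np.allclose(integrals, np.diff(quants), atol=atol)
    return ok, integrals
```

As a usage example, the PDF f(x) = 2x on [0, 1] has CDF x^2, so locations `np.sqrt(quants)` pass the check with `pdf=lambda x: 2 * x`, while a flat PDF evaluated at the same locations fails it.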