[bump minor] P1d covariance fix

igmhub / picca

set of tools for continuum fitting, correlation function calculation, cosmological fits...

GNU General Public License v3.0

30 stars 22 forks source link

[bump minor] P1d covariance fix #1034

Closed corentinravoux closed 11 months ago

corentinravoux commented 11 months ago

Branch to fix the P1D computation. Two important fix;

The storing of k arrays for the covariance
The computation of errors for the P1D

This PR is also here to have various discussions about the covariance matrix calculation.

corentinravoux commented 11 months ago

Pytest is failing at the delta extraction step, I do not think it comes from this PR

iprafols commented 11 months ago

Yes, we know, tests have become unstable again, probably due to some updates (see issue #1032 ) @Waelthus is preparing a PR to fix this

Waelthus commented 11 months ago

The pytest is working again, is this all that you wanna do here? Or is this in progress for later review @corentinravoux?

corentinravoux commented 11 months ago

Thanks @Waelthus, For now the two fixes are really important.

corentinravoux commented 11 months ago

If we do not need further discussions on covariance, we can merge.

Waelthus commented 11 months ago

Regarding further discussion of how to compute covariances, maybe open an issue

corentinravoux commented 11 months ago

Thanks for your review, The point of the second fix is indeed to have different weight for the error and the average. With the fix Eric as done, only the P1D are used to do the weighting, and it gives the same error bar for all the columns. I fixed that by separating the weights with P1D (to compute the average) and the weights with col (to compute the error). I hope this answer most of you interogations, I will try to implement the changes you mentioned

Waelthus commented 11 months ago

But then I'm not sure if that's a useful thing to do. The part of the other columns that makes it into the final measurements is after all weighted in the same way as the P1d measurements. So the weights should be the same, but the absolute errors should not be. They should be based on the column at hand using the P1d weights...

corentinravoux commented 11 months ago

What do you mean is that we should compute the error with a weighted standard deviation instead of how it is actually done ? I.e. with error = np.sqrt(1.0 / np.sum(weights_col)) ? I do not think the results will be drastically modified, but in principle yes that could work. And it will simplify the calculation of the covariance. What I suggest is that we keep it that way for a tagged EDR+M2 version, and then we move this discussion to an issue with @armengau. Is it ok for you ?

Waelthus commented 11 months ago

What I mean is that we do not necessarily want to compute the mean and (co-)variance of a column based on it's own inverse variance weights, but based on the inverse variance weights of the P1d column and the means, variances of the column of interest, I think it should be something like:

Var(col)=sum(W_i**2 Var_i(col))/(sum(W_i))**2 (ignoring the covariance between modes) where W_i are the weights based on P1d, and Var_i are the individual variances inside that column. If the column is P1d, W_i=Var_i**-1 and then Var=sum(W_i)/(sum(W_i))**2=1/sum(W_i), i.e. for that column both approaches are the same as expected. Including covariances Var_i -> Cov_ij, W_i**2 -> W_i W_j and the sums are over both i and j...

If one did this, comparisons between different columns will be simplified, as the impact of each spectrum is the same regardless of column.

I'd be happy to merge the current version for now and discuss (co-)variance estimation in an issue with both of you.

Waelthus commented 11 months ago

and yes, please create a tag for the version you're using (if there is none already) and potentially also a release similar to this one to make clear that the tag was used for P1d in EDR: https://github.com/igmhub/picca/releases/tag/v7.0.3

corentinravoux commented 11 months ago

Yes, I think your definition is more straightforward. We should create an issue to implement that. I also think that we should do a note detailling all those statistical specificity. It might be necessary for Y1 and Y3

corentinravoux commented 11 months ago

I do not know how to create the tag and if I am allowed ? Let's call it 8.0.2 ?

Waelthus commented 11 months ago

so you want to use the version from after this merge as a tag?

corentinravoux commented 11 months ago

Yes if that is ok for you, including covariance calculation and all those fixes