Closed jsdillon closed 9 months ago
I've verified that this is correct with a numerical experiment. I made Nsamples flat in time and frequency for both pols, but I gave one pol 10x the samples of the other and then generated pure thermal noise.
Here's what the current Nsamples for pseudo-I gives:
And here's what happens if you do `4 * (uvd1.nsample_array**-1 + uvd2.nsample_array**-1)**-1`:
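For anyone who wants to reproduce the idea without the real data, here is a minimal sketch of this kind of experiment. It is not the script used for the plots above: it uses plain NumPy instead of `UVData` objects, and `sigma`, `ndraws`, and `thermal_noise` are hypothetical names introduced for illustration. It assumes the noise variance of each pol scales as `sigma**2 / nsamples`, and it compares the effective nsamples measured from the pseudo-I noise against the harmonic-style formula and against a plain sum (shown only for comparison).

```python
import numpy as np

rng = np.random.default_rng(0)

sigma = 1.0          # assumed single-sample noise level (hypothetical)
n1, n2 = 10.0, 1.0   # one pol has 10x the samples of the other
ndraws = 200_000     # number of independent noise realizations


def thermal_noise(nsamples, size):
    """Complex Gaussian noise with total variance sigma**2 / nsamples."""
    scale = sigma / np.sqrt(2.0 * nsamples)  # per real/imag component
    return rng.normal(scale=scale, size=size) + 1j * rng.normal(scale=scale, size=size)


v1 = thermal_noise(n1, ndraws)
v2 = thermal_noise(n2, ndraws)

# Same combination as uvdS.data_array = uvd1.data_array / 2 + uvd2.data_array / 2
pseudo_i = v1 / 2 + v2 / 2

# Effective nsamples implied by the measured variance of pseudo-I
measured_var = np.mean(np.abs(pseudo_i) ** 2)
nsamples_measured = sigma**2 / measured_var

nsamples_harmonic = 4.0 / (1.0 / n1 + 1.0 / n2)  # proposed formula, 40/11
nsamples_sum = n1 + n2                           # plain sum, for comparison

print(f"measured: {nsamples_measured:.3f}")
print(f"harmonic: {nsamples_harmonic:.3f}")
print(f"sum:      {nsamples_sum:.3f}")
```

Under these assumptions the measured value should land close to the harmonic-style number and well below the plain sum, since the low-nsamples pol dominates the variance.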
https://github.com/HERA-Team/hera_pspec/blob/fc06cf7d1b8ba370e4194b02eb49a1b08b757152/hera_pspec/pstokes.py#L148
I'll repeat what I said in Slack:
In the case where ee- and nn-polarized visibilities have different nsamples (because of different antennas being flagged on different days, for example), it’s not clear to me that this is the proper nsamples for computing the thermal noise. Consider a case where `uvd1.nsample_array` is 10 and `uvd2.nsample_array` is 1. The variance will be dominated by `uvd2.nsample_array`. So if we want nsamples to properly reflect the variance, then I think we want to do something like `uvdS.nsample_array = 4 * (uvd1.nsample_array**-1 + uvd2.nsample_array**-1)**-1`, where the 4 accounts for the fact that `uvdS.data_array = uvd1.data_array / 2 + uvd2.data_array / 2`. In other words, if the ee nsamples and the nn nsamples are equal, then the answer is just the sum of the nsamples. But if one of the two is zero, then the resulting nsamples should also be 0, reflecting the fact that pseudo-I and pseudo-Q have infinite variance.
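To make the edge cases concrete, here is a rough sketch of the formula as a standalone function. `combined_nsamples` is a hypothetical helper name, not something in `hera_pspec`, and it works on plain NumPy arrays rather than `UVData` objects. It uses the algebraically equivalent form `4 * n1 * n2 / (n1 + n2)` and returns 0 wherever either input is 0, so no division by zero occurs.

```python
import numpy as np


def combined_nsamples(n1, n2):
    """Effective nsamples for data formed as v1 / 2 + v2 / 2.

    Hypothetical helper: equivalent to 4 * (n1**-1 + n2**-1)**-1 where
    both inputs are positive, and 0 where either input is 0.
    """
    n1, n2 = np.broadcast_arrays(np.asarray(n1, dtype=float),
                                 np.asarray(n2, dtype=float))
    good = (n1 > 0) & (n2 > 0)
    # Use a dummy denominator where the result is forced to 0 anyway
    total = np.where(good, n1 + n2, 1.0)
    return np.where(good, 4.0 * n1 * n2 / total, 0.0)


# Equal nsamples give the sum; a zero in either pol gives zero;
# 10 and 1 give 40/11 (about 3.64) rather than 11.
ee = np.array([2.0, 0.0, 10.0])
nn = np.array([2.0, 7.0, 1.0])
print(combined_nsamples(ee, nn))
```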