GalSim-developers / GalSim

The modular galaxy image simulation toolkit. Documentation:
http://galsim-developers.github.io/GalSim/

Further development / testing of lensing engine #248

Closed rmandelb closed 11 years ago

rmandelb commented 12 years ago

As noted in the pull request for the initial version of the "lensing engine" for generating non-constant shear fields, some additional improvements and tests are needed in the next few months. Below is a list of those collected from the pull request, with the items near the top of the list being more important than those at the bottom.

Please let me know if I've missed something or if you disagree about the relative importance of these.

rmandelb commented 11 years ago

Since there have been some twists and turns in this issue in the past few weeks, here's a revamped "to do" list compared to the one I posted when I started to think about this:

barnabytprowe commented 11 years ago

We would need them to be wrong in a way that doesn't drastically change the results for P(k) in section 5… is that possible? I have a hard time seeing how that would work considering that neither N nor 2pi is near 1.

That's true, but I think I remember seeing a suspicious division/multiplication by N rather than N^2 in the guts of the code. Maybe we don't need the 2pi.

Um... I don't understand what you mean. The P(k)=P0 case only works in the current version, with grid spacing normalization. In the old version it didn't work. See section 4.2 for details. Maybe I didn't explain this clearly enough?

No, then I misunderstood! OK, scratch what I said in that case, although I need to think about the meaning of it...

rmandelb commented 11 years ago

We would need them to be wrong in a way that doesn't drastically change the results for P(k) in section 5… is that possible? I have a hard time seeing how that would work considering that neither N nor 2pi is near 1.

That's true, but I think I remember seeing a suspicious division/multiplication by N rather than N^2 in the guts of the code. Maybe we don't need the 2pi.

Is it possible that you're referring to

    g1=g1_k.shape[0]*np.fft.irfft2(g1_k, s=(self.nx,self.ny))
    g2=g2_k.shape[0]*np.fft.irfft2(g2_k, s=(self.nx,self.ny))

?

That had originally worried me and I'm not sure what convinced me it was okay. Maybe it's not. But if that should be N^2 then our results will, again, be off by orders of magnitude, so there must be something else wrong.

barnabytprowe commented 11 years ago

That's exactly the line! I think this looks like it might then work out OK if it should be N^2 (which I think it should), since Delta k = 2 pi / L = 2pi / (N * Delta x) and the working seems to suggest we should be multiplying by Delta k. Will have to come back to this soon, unfortunately today is manic but I think we're making progress!
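As a standalone aside on numpy's conventions (an editorial addition, not part of the original exchange): `np.fft.ifft2`/`np.fft.irfft2` include a 1/(nx*ny) factor, so fully undoing that normalization takes N**2 for a square grid, not the single factor of `shape[0]` in the line quoted above. Which factor is physically correct for the shear field is a separate question, worked out later in this thread.

```python
import numpy as np

# Standalone check of numpy's inverse-FFT normalization (not GalSim code).
N = 4
rng = np.random.default_rng(0)
a = rng.standard_normal((N, N))
A = np.fft.fft2(a)

# Unnormalized inverse DFT, written out by hand (no 1/N^2 factor):
n = np.arange(N)
W = np.exp(2j * np.pi * np.outer(n, n) / N)   # inverse-DFT twiddle matrix
raw = W @ A @ W.T                             # sum_kl A[k,l] e^{+2*pi*i(mk+nl)/N}

# np.fft.ifft2 includes the 1/N^2, so undoing it needs N**2, not N:
assert np.allclose(raw, N**2 * np.fft.ifft2(A))
assert np.allclose(raw.real, N**2 * a)
```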

rmandelb commented 11 years ago

Ah, so maybe that line should have N^2, and

self.amplitude_E = np.sqrt(self._generate_power_array(p_E))/pixel_size

should be

dk = 2.*np.pi/(nx*pixel_size)
self.amplitude_E = np.sqrt(self._generate_power_array(p_E))*dk

and in the end, the difference from what we have now is that the amplitudes go up by a factor of 2pi and the PS go up by (2pi)^2. Implications for my tests:

(1) variance for Gaussian P(k): my results had been too high by a factor of 2pi, so now they will be too high by even more than that. Oops. Although, given the formalism you presented, my expectations in section 4.1 may have been wrong. I need to read more carefully to see whether that's true.

(2) variance for flat P(k): was right, will now be too high (unless my expectations were also wrong).

(3) P(k) itself: new results will be too high. I think.

Hmmm.

barnabytprowe commented 11 years ago

Hi Rachel,

I've pushed a modified version of the lensing_engine.tex document, which completes my derivation. In the end, I get that you were right all along simply to divide by the grid spacing d! So, given that I hadn't been keeping properly up-to-date with the flat P(k) case and didn't realise it was now working, this is something encouraging. This doesn't yet fix the discrepancy of 2 pi between case 1) and 2) above taken at face value. I'm at something of a loss to explain that, but this process was very useful for me, and so maybe I'll take a look back at Section 4.1 to see if I see anything that might explain it.

rmandelb commented 11 years ago

Okay, I really need to sit and digest this, but I'm happy that we didn't need a 2pi in there, as adding one would make 2/3 of our cases worse!


rmandelb commented 11 years ago

BTW, for case (1), the variance for a Gaussian P(k), I was making the predictions by integrating P(k) k dk (essentially a circle in k-space) whereas I should have been integrating over the square grid, which is how I found agreement for case (2), the flat P(k). So let me redo the predictions; it could be that the Gaussian case is actually working okay, which would be a relief! (Going to a talk now, but hopefully I can do this tonight.)

rmandelb commented 11 years ago

Okay, never mind. I just updated the doc again. I was completely wrong about the Gaussian P(k) giving the wrong variance (because... I plugged numbers into the wrong formula - sigh). However, I am now confused as to why the results agree with the prediction when integrating within a circle in k-space, since when I integrate within a square I get something that differs by a factor of 4. Maybe I'm just doing that calculation wrong, too!
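For reference, a brute-force comparison of the two integration regions for a flat P(k) (an editorial aside, not from the thread): the square grid exceeds the inscribed circle by 4/pi, roughly 27%, rather than a factor of 4, which may help narrow down where the discrepancy enters.

```python
import numpy as np

# Hypothetical cross-check: flat P(k) = P0 integrated over the square
# [-kmax, kmax]^2 in k-space versus the inscribed circle |k| <= kmax.
P0, kmax, n = 1.0, 1.0, 2001
k = np.linspace(-kmax, kmax, n)
kx, ky = np.meshgrid(k, k)
dA = (k[1] - k[0]) ** 2
square = P0 * dA * kx.size
circle = P0 * dA * np.count_nonzero(kx**2 + ky**2 <= kmax**2)
# The square exceeds the inscribed circle by 4/pi ~ 1.27, not by 4:
assert abs(square / circle - 4.0 / np.pi) < 0.01
```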

rmandelb commented 11 years ago

@barnabytprowe , at some point you might find it useful to check out the new version of the lensing engine document on this branch. In particular, I started the rearrangements we discussed, and while I did not yet go through your section to check consistency of notation (that's my next piece of homework!), I did do that for the sections that I wrote, including some updates to the test results in sections 5 and 6.

barnabytprowe commented 11 years ago

Excellent, I'll take a look!

rmandelb commented 11 years ago

I should note that I got to work on this some more today (which I didn't think would happen). There are some updates to my test results, so I've basically reached the end of what I can do (I think). I also started going through your text and equations with an eye towards cleaning up notation and making the document into a form that could go onto master, but I got stuck after your equation 18. My question is in the doc, and I was wondering if you could clarify what's going on here? I have a feeling I've done something stupid but I can't figure out what. Must sleep...

barnabytprowe commented 11 years ago

The short answer is because of equation (15) having the 2pi on the RHS, and the fact that there is a summation on either side of (18). The long answer, and a thorough check, would require a better derivation than I have in my notebook, but would involve starting out by writing the integrals and considering the Dirac comb as done to get (15)... Do you want me to put that in?

rmandelb commented 11 years ago

Hmmm, no. Let me try working it out and I'll get back to you if I can't.


barnabytprowe commented 11 years ago

There is, of course, a chance I'm wrong!

rmandelb commented 11 years ago

Well I really hope not because with your calculations, everything has come together nicely… :) But I will check it out carefully.


barnabytprowe commented 11 years ago

Hi Rachel, I've taken a good read through and I find this very encouraging! I have a couple of edits to make to my parts of the document, mostly to formalize the tone; I'll do them ASAP, but they're nothing major.

As for the slight ~3-5% deficits you are seeing in overall power, I think I have an explanation for our code, but I am not sure if the same logic applies to the SHT. It might. Here goes, in terms of our code:

For the last point, I'm doing a thought experiment. Imagine a shear at the edge of a grid realization: in a periodic realization this shear is in fact covariant with its neighbours on the opposite side of the grid, since it is periodic. The field overall is therefore more constrained by this covariance than if k_min were smaller. Thought about another way, the larger the overall grid region in real space the smaller the number of inappropriately correlated edge pixels like this relative to the total area.

I wonder if there is an analogue for eq(25) in spherical harmonic transforms, there must be!

barnabytprowe commented 11 years ago

(minor edit above, not sure if it makes it clearer...)

rmandelb commented 11 years ago

Hi Barney -

Thanks for checking this over.

First and most trivially, regarding fixing up / formalizing the writing, you are welcome to do so if you want, but otherwise I was going to do it as I follow through the equations. OTOH if you do it then probably it will be done sooner and then we can ask others to check this out too. So it's up to you.

Second, let me make sure I understand your argument regarding the variance / PS. You say: "You find that the binning in l/k (I'll just say Fourier) space empirically seems to be responsible for the deficits in variance and overall power: changing the grid spacing in real space is fine." That's right. If I change L and therefore dk=2pi/L, the PS/variance results change, whereas if I change dtheta and therefore the total k range spanned by the grid, the PS/variance results are unchanged.

Regarding your point about the way to think about this for a periodic grid: that thought experiment makes sense to me. However, I think I've gotten something confused about the math in Eq. 30: that's the predicted shear correlation function in the discrete, finitely-sampled case, and if I've understood you properly, the variance we get from that should just be xi with n=m=0 (yes?), which is variance=(1/L)^2 Sum(P[q,p]) where the summation is over pixels in our Fourier-space grid. So - correct me if I am wrong, but I think one way to explicitly confirm that our working explanation for the few-percent deficits in the power is correct would be for me to simply do this sum over the array of power values that gets generated within the code, and see how the results compare to the ones we're actually finding? I just did that for the test in section 5.3 (truncated flat P(k)) and the results I'm getting are 4% higher than what we expect from Eq. 10 (but the variances I actually find are ~4-5% lower than the prediction from Eq 10). Have I gotten something flipped around? I guess I must have.

In any case, this physical explanation for why the real-space grid size / Fourier space grid spacing is leading to lower variance doesn't explain the variance for the Gaussian P(k) case in section 5.2, does it? Hmmm.

As for the SHT, it's not clear to me that we need this explanation to carry over. Looking at the figures, the SHT results are above those from GalSim, and once we account for the effect of using too few ell values in the PS estimation (which seems to suppress the results and change the slope), the SHT results might actually be okay, or at least closer to okay than the GalSim results.
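The (1/L)^2 * Sum(P) check described above can be made concrete with a toy, self-contained version (a hypothetical grid size and a toy Gaussian P(k), not the actual GalSim test):

```python
import numpy as np

# Toy version of the discrete variance prediction discussed above:
# var = (1/L)^2 * sum of P over the Fourier-space grid, DC mode zeroed.
N, dx = 100, 1.0
L = N * dx
k = 2 * np.pi * np.fft.fftfreq(N, d=dx)
kx, ky = np.meshgrid(k, k)
P = np.exp(-(kx**2 + ky**2))      # toy Gaussian P(k)
P[0, 0] = 0.0                     # P(|k| = 0) set to zero, as in the code
var_discrete = P.sum() / L**2     # note 1/L^2 = dk^2 / (2*pi)^2
# Continuum analogue: (1/(2*pi)^2) * integral of P(k) d^2k = 1/(4*pi)
var_continuum = 1.0 / (4.0 * np.pi)
assert abs(var_discrete / var_continuum - 1.0) < 0.02
```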

barnabytprowe commented 11 years ago

Hi Rachel, made my very minor edits on this document. Only one thing to mention is 60601af, in which I changed your $k_x$, $k_y$ in the first Section to $k_1$, $k_2$ to match my usage: hope that's OK! It seemed like the lesser set of changes...

rmandelb commented 11 years ago

Thank you! Will take a look. No worries about the notation change, it makes more sense to change it in my section since I use that notation rather less than you did. Did you have any further thoughts about the final few percent discrepancy?


barnabytprowe commented 11 years ago

Did you have any further thoughts about the final few percent discrepancy?

Nothing compelling I'm afraid...

Regarding your point about the way to think about this for a periodic grid: that thought experiment makes sense to me. However, I think I've gotten something confused about the math in Eq. 30: that's the predicted shear correlation function in the discrete, finitely-sampled case, and if I've understood you properly, the variance we get from that should just be xi with n=m=0 (yes?), which is variance=(1/L)^2 Sum(P[q,p]) where the summation is over pixels in our Fourier-space grid. So - correct me if I am wrong, but I think one way to explicitly confirm that our working explanation for the few-percent deficits in the power is correct would be for me to simply do this sum over the array of power values that gets generated within the code, and see how the results compare to the ones we're actually finding? I just did that for the test in section 5.3 (truncated flat P(k)) and the results I'm getting are 4% higher than what we expect from Eq. 10 (but the variances I actually find are ~4-5% lower than the prediction from Eq 10). Have I gotten something flipped around? I guess I must have.

Alas I can't see where!

In any case, this physical explanation for why the real-space grid size / Fourier space grid spacing is leading to lower variance doesn't explain the variance for the Gaussian P(k) case in section 5.2, does it? Hmmm.

No, and I'm still thinking... something must explain what we're seeing, we're so close..!

rmandelb commented 11 years ago

Well, one possibility would be to ask someone else to read this over carefully now that we think we have a coherent / consistent framework. We've both been staring at it for a while, after all, and that can make it hard to spot the obvious.

reikonakajima commented 11 years ago

I've been meaning to look it over for a while. It may not be till this weekend that I get to it.

barnabytprowe commented 11 years ago

Something did occur to me this morning in fact which might make for a minor change... Are we doing the right thing in equation 10? In the code, we only set the P(|k|=0) mode to zero, rather than P(k, 0) = P(0, k) = 0 for all k.

I think what we want to in fact be integrating over is the shaded region shown on the LHS of this rather hastily drawn diagram: [attached hand-drawn sketch of the two k-space integration regions]

...whereas currently we're integrating over the shaded region on the RHS. This will in fact make the predicted variances a little larger, helping (perhaps) with the results in Section 5.2.

In Section 5.3, however, it will make the discrepancy worse. I'm really confused about my physical intuition here! Certainly the test case has a very sharp edge in Fourier space so we expect it to have a very large support in real space, and thus be susceptible to Fourier-finite-grid effects. But the maths is telling us that the variance should increase whereas my intuition says it should decrease due to these periodic overlaps... I don't get it!
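To make the two zeroing conventions explicit in code (a hypothetical 8x8 flat-power illustration, not GalSim's actual arrays):

```python
import numpy as np

# Two zeroing conventions for a hypothetical 8x8 flat power grid:
# DC-only (what the code currently does) versus zeroing the full
# kx = 0 and ky = 0 axes, i.e. P(k, 0) = P(0, k) = 0 for all k.
N = 8
P = np.ones((N, N))

dc_only = P.copy()
dc_only[0, 0] = 0.0            # only the |k| = 0 mode removed

axes_zeroed = P.copy()
axes_zeroed[0, :] = 0.0        # P(0, k) = 0 for all k
axes_zeroed[:, 0] = 0.0        # P(k, 0) = 0 for all k

assert dc_only.sum() == N * N - 1        # 63 modes keep power
assert axes_zeroed.sum() == (N - 1)**2   # only 49 modes keep power
```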

rmandelb commented 11 years ago


Oooh, good point. I believe you are correct. The left side of the picture represents what we do in the code, the right side of the picture is what eq 10 is doing. So when I did the test that used the sum over the actual power array, that could be one reason why the results were higher than the variances predicted from Eq 10. I will need to check numerically whether the difference between the predictions and the results goes away if I account for this.

However, regarding section 5.3 (for which the results go in the other direction): I just realized a mistake in eq 38: if we say P(k>klim)=0, which is what I did for the power function given to GalSim, that's not the same as what that equation has done, which is to set P to zero if kx or ky is separately > klim! Given that power function, essentially I'm saying the power is zero outside of a circle in k-space, so the variance should be (P_0 / pi^2) [pi klim^2 / 4 - kmin^2], i.e., subtracting the little square near k=0 from the area of the circle with radius klim. Given that we have klim = 10 kmin, the predicted variance is [25 pi - 1] P_0 kmin^2 / pi^2. Here the [25 pi - 1] should be compared with the 81 in equation 40 - i.e., the 81 in equation 40 should, with this fix, be 25 pi - 1 ≈ 77.5, which is about 4-5% lower and therefore completely consistent with our empirical results. I think this new calculation accounts for both your suggestion here (regarding proper treatment of our P(0)=0 condition) and for my silliness in equation 38, so section 5.3 is fine.
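A quick arithmetic check of the prefactors quoted above (an editorial addition):

```python
import numpy as np

# Circular-cutoff prefactor with klim = 10*kmin: [25*pi - 1],
# compared against the 81 from the square-region calculation in eq. 40.
prefactor_circle = 25.0 * np.pi - 1.0
prefactor_square = 81.0
assert abs(prefactor_circle - 77.54) < 0.01
# A few percent below 81, consistent with the empirical deficit:
assert 0.94 < prefactor_circle / prefactor_square < 0.96
```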

So between what you sent and my realization just now, sections 5.2 and 5.3 might be totally fixed. That would make me very happy.

However, I don't think that explains section 6, does it? For this we might need to appeal to finite-gridding effects?

Perhaps this is completely stupid, but here's what I keep getting tripped up on: We say that our minimum k is a grid spacing dk = 2pi/L, and our maximum k is maxk = 2pi/dx. For the purpose of relating this to an integral over a continuous power function, we had been setting limits in Fourier space that are dk (lower) and maxk (upper), with the k=0 mode set to zero. (Let's ignore for a moment the issue of setting the power to zero at k=0 vs. kx=ky=0 illustrated in your picture, and think about the 1d problem.) Should we instead be thinking of those as k ranges that are being set to zero, i.e., setting the k=0 mode to zero really means saying P=0 for k in the range -dk/2 < k < dk/2? And the power for the k=dk mode really is being interpreted as the power for dk/2 < k < 3dk/2? I guess this only matters if P varies substantially within k ranges of size dk?

barnabytprowe commented 11 years ago

Perhaps this is completely stupid, but here's what I keep getting tripped up on: We say that our minimum k is a grid spacing dk = 2pi/L, and our maximum k is maxk = 2pi/dx. For the purpose of relating this to an integral over a continuous power function, we had been setting limits in Fourier space that are dk (lower) and maxk (upper), with the k=0 mode set to zero. (Let's ignore for a moment the issue of setting the power to zero at k=0 vs. kx=ky=0 illustrated in your picture, and think about the 1d problem.) Should we instead be thinking of those as k ranges that are being set to zero, i.e., setting the k=0 mode to zero really means saying P=0 for k in the range -dk/2 < k < dk/2? And the power for the k=dk mode really is being interpreted as the power for dk/2 < k < 3dk/2? I guess this only matters if P varies substantially within k ranges of size dk?

You might be right about that for all these cases! This would fit with my interpretation of the first expression in the RHS of equation 26: we are approximating the function as a sum of finite impulses of width dk in 1D (area dk^2 in 2D). This would also help towards explaining why any discrepancies we saw were reducing as kmin got smaller...

I guess this only matters if P varies substantially within k ranges of size dk?

If P varies substantially between k and k + dk then we are in trouble anyway, since our sampled approximation to P(k) will be badly aliased and hence the real space functions will overlap badly...

No, I think this ultimately comes down to being a convention choice about what part of the impulse is marked by the delta function sample (kind of like crude quadrature): currently we're implicitly using the convention that the delta function marks the start of the impulse, so that the P(k=0) impulse sets the area in the curve up to kmin. We maybe ought to therefore be integrating up to kmax+kmin in the analytic variance integrals like equation 10. Actually, you'd get the same spatial area (between kmin/2 and kmax+kmin/2) if you chose the other natural convention you're suggesting of placing the sample at the centre of the impulse, too...

However, I don't think that explains section 6, does it? For this we might need to appeal to finite-gridding effects?

Still, yes...

rmandelb commented 11 years ago

No, I think this ultimately comes down to being a convention choice about what part of the impulse is marked by the delta function sample (kind of like crude quadrature): currently we're implicitly using the convention that the delta function marks the start of the impulse, so that the P(k=0) impulse sets the area in the curve up to kmin. We maybe ought to therefore be integrating up to kmax+kmin in the analytic variance integrals like equation 10. Actually, you'd get the same spatial area (between kmin/2 and kmax+kmin/2) if you chose the other natural convention you're suggesting of placing the sample at the centre of the impulse, too...

Hmm, really? I thought in the first case the area is (kmax+kmin)^2 - kmin^2 = kmax^2 + 2 kmax kmin = kmax^2 (1 + 2 kmin/kmax), and in the second case it's (kmax+kmin/2)^2 - (kmin/2)^2 = kmax^2 + kmax kmin = kmax^2 (1 + kmin/kmax).

For our typical grid setup, kmin/kmax=0.01 so I guess this difference is only 1%. But still it's not quite identical.
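A two-line check of those areas, using the kmin/kmax = 0.01 ratio quoted above (an editorial addition):

```python
import numpy as np

# The two integration-limit conventions discussed above, at kmin/kmax = 0.01.
kmax, kmin = 1.0, 0.01
start_of_impulse = (kmax + kmin)**2 - kmin**2          # = kmax^2 (1 + 2 kmin/kmax)
centre_of_impulse = (kmax + kmin/2)**2 - (kmin/2)**2   # = kmax^2 (1 + kmin/kmax)
assert np.isclose(start_of_impulse, kmax**2 * (1 + 2*kmin/kmax))
assert np.isclose(centre_of_impulse, kmax**2 * (1 + kmin/kmax))
# The two conventions differ by about 1% here, so not quite identical:
assert abs(start_of_impulse / centre_of_impulse - 1.0) < 0.011
```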

It may indeed be a matter of convention, but the second choice (which we haven't been implicitly using) kind of makes more sense to me.

Of course, I think that if you're in the regime where this really matters then you're probably kind of screwed when it comes to doing something truly interesting with shear power spectra, no?

rmandelb commented 11 years ago

Just wanted to note that I just pushed an updated document with the results of the above discussion in the theory section, and in the variance tests. They are now both completely consistent with some finite-gridding effects.

Still haven't done the other stuff I said I'd do (cleanup, unit tests). When we talk tomorrow we can discuss the rest of the work on this issue, but I'm pretty happy with these calculations now.

rmandelb commented 11 years ago

@barnabytprowe : Per our discussion yesterday, I've done the following here:

As we discussed, I'd like to make a pull request to get others looking at this, though I still haven't finished going over the equations. Please let me know if you think this branch is ready for a PR.

We would still need a subsequent issue to validate the interpolation of shears to a non-gridded position, but since I've switched to a focused "do things for GREAT3 only" mode, I don't intend to do that here.

barnabytprowe commented 11 years ago

Hi Rachel, sorry this morning has been hectic but I'll take a quick look over this branch this pm. I fully expect it to be worthy of a PR, but will make edits first if I spot anything...

rmandelb commented 11 years ago

No worries, it's not a huge rush.

barnabytprowe commented 11 years ago

Hi Rachel,

I've pushed a couple of extremely minor edits to the document; I think I'm done there now. I also made a very minor edit to test_lensing.py, which was to move the hard-coded klim = 0.00175 value up to a single setting at the top of the script. I'm aware there are other magic numbers embedded in the code, but that one seemed like a sensible one to move outside, as it also gets used within a function scope, and as your comment pointed out, this would get missed if only editing the test that calls that function.

I had one other thing to mention, which was with regard to the discrepancy in g1 and g2 variances pointed out at the end of Section 5.2. I believe this is because we are only setting P_E = P(k), and P_B = 0. If we were to split the P(k) equally between E and B modes, e.g. P_E = 0.5 P(k) and P_B = 0.5 P(k), we would see equal variances in the two components.

The reason for this was referred to obliquely by Joe some point much earlier in this thread I think... It's unavoidable, and because of the following lines in the code that are inherent in the E/B decomposition:

        #Now convert from E,B to g1,g2  still in fourier space
        g1_k = self._cos*E_k - self._sin*B_k
        g2_k = self._sin*E_k + self._cos*B_k

where self._cos, self._sin are returned by

    def _generate_spin_weightings(self):
        # Internal function to generate the cosine and sine spin weightings
        # for the current array set up
        C = np.zeros((self.nx, self.ny/2+1))
        S = np.zeros((self.nx, self.ny/2+1))
        kx = self.kx
        ky = self.ky
        TwoPsi = 2*np.arctan2(1.0*self.ky, 1.0*self.kx)
        C[kx, ky] = np.cos(TwoPsi)
        S[kx, ky] = np.sin(TwoPsi)
        C[-kx, ky] = -np.cos(TwoPsi)
        S[-kx, ky] = np.sin(TwoPsi)
        return C, S

Basically there are higher k-values along the diagonals (TwoPsi = 90 degrees) than along the vertical/horizontals, so the sine terms in g1_k and g2_k get greater weighting and hence a boost to g2_k if B_k=0.

Does that make sense?
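For what it's worth, the geometric claim can be checked numerically with a standalone sketch (an editorial addition, not GalSim code):

```python
import numpy as np

# On a square k-grid the largest |k| values sit along the diagonals, where
# 2*psi ~ 90 degrees, so the sin(2*psi) weights carry more total weight than
# the cos(2*psi) weights when summed over the whole grid.
N = 64
k = np.fft.fftfreq(N) * N               # integer wavenumbers on the grid
kx, ky = np.meshgrid(k, k, indexing='ij')
two_psi = 2.0 * np.arctan2(ky, kx)
cos_w, sin_w = np.cos(two_psi), np.sin(two_psi)
# With a flat P_E and P_B = 0, Var(g1) tracks sum(cos^2) and Var(g2)
# tracks sum(sin^2), so g2 gets the boost described above:
assert (sin_w**2).sum() > (cos_w**2).sum()
```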

rmandelb commented 11 years ago

> I've pushed a couple of extremely minor edits to the document; I think I'm done there now.

Looks good!

> I also made a very minor edit to test_lensing.py, which was to move the hard-coded klim = 0.00175 value up to a single setting at the top of the script. I'm aware there are other magic numbers embedded in the code, but that one seemed like a sensible one to move outside, as it also gets used within a function scope, and as your comment pointed out, this would get missed if only editing the test that calls that function.

Yes, very good point, that makes a lot more sense.

> I had one other thing to mention, which was with regard to the discrepancy in g1 and g2 variances pointed out at the end of Section 5.2. I believe this is because we are only setting P_E = P(k), and P_B = 0. If we were to split the P(k) equally between E and B modes, e.g. P_E = 0.5 P(k) and P_B = 0.5 P(k), we would see equal variances in the two components.

Yes. My complaint in that section was poorly spelled out: I had found Joe's argument compelling and mainly wanted to quantify how important the effect was in practice. So I have updated again to clarify that it's a feature, not a bug, and also verified explicitly that if you have equal E and B power then it goes away.

I've taken your message to mean that those are the last points you suggest before making a pull request, so I'm going to make one now.

rmandelb commented 11 years ago

Closing #248.