numerical issue for some realizations of atmosphere

esheldon commented 5 months ago

I'm seeing large offsets in the photon x, y values, as well as the occasional nan, for some realizations of an atmosphere. The offsets can be very high, like 10 million pixels.

Below are example atmospheric parameters that produce the problem.

Using robust statistics on the output photons indicates that most photons land in a reasonable location, with a set of outliers

I can also provide a full example imsim config that produces this set of atm parameters

{'L0': [17.174188473465833,
        17.174188473465833,
        17.174188473465833,
        17.174188473465833,
        17.174188473465833,
        17.174188473465833],
 'altitude': [1.4302596307983027,
              5.348763607590827,
              11.208179050816556,
              18.24311343973453,
              24.467319796805825,
              30.175959028659584],
 'direction': [coord.Angle(5.023031335581778, coord.radians),
               coord.Angle(5.4138391122102645, coord.radians),
               coord.Angle(5.213667959962739, coord.radians),
               coord.Angle(5.043782770080258, coord.radians),
               coord.Angle(5.363918865113014, coord.radians),
               coord.Angle(5.075185823976301, coord.radians)],
 'r0_500': 0.08161434310901314,
 'r0_weights': [5.531557367962777e-13,
                4.3652382380659045e-14,
                1.326839454952542e-13,
                4.126186570683441e-14,
                2.344518516479255e-14,
                2.5838659853736628e-15],
 'rng': galsim.BaseDeviate('1778941255 1431167650 1282975060 ... 4184287197 2656677445 2301135708'),
 'screen_scale': 0.1,
 'screen_size': 800.0,
 'speed': array([10.94986395, 17.84175431, 37.29034581, 42.9768206 , 19.97968586,
       10.84201688])}

rmjarvis commented 5 months ago

I think I need more details to reproduce this. I tried to turn this set of kwargs into a unit test, but it doesn't fail yet. Can you try to adjust this based on other details of your run until it does fail? Then Josh or I can have a better shot at diagnosing and fixing it.

My first attempt is on branch '#1280' here: https://github.com/GalSim-developers/GalSim/blob/%231280/tests/test_phase_psf.py#L1557

rmjarvis commented 5 months ago

Note: If the issue is just the rng, the repr isn't sufficient to serialize it, since the repr string is really long, and that usually gets annoying to see long rngs all over the place. You can get a full evalable string for the rng seed with rng.serialize().

esheldon commented 5 months ago

I can't get this to fail even either, even with the full serialized rng.

It's complex because the same atmosphere gets used for all the objects in the image, and there are many different star SEDs being used. I see the problems for some fraction of the objects.

I could give you something that will reproduce using an imsim run. I would give you the input data. You would also need to use my hacked up imsim so it will printout offset statistics as well as when it hits a nan.

rmjarvis commented 5 months ago

Maybe you could print out the repr of the object whose drawImage call produces the nans? And also any other kwargs being sent to drawImage. That should have enough information to reproduce it I think.

esheldon commented 5 months ago

I'll try that. I'll also need to repr the photon_ops in that case, since that is how the PSF is implemented

esheldon commented 5 months ago

It is not easy to get a perfect reproduction. There must be some internal settings from the imsim run that affect the results.

Also note I had to turn off sensor, which made the run too slow. It is not that slow in the imsim run so, again, there must be some settings I'm not aware of.

Some objects from imsim do not have a full repr and cannot be pickled, so I tried to reconstruct things by hand when necessary.

I've attached a tarball that has data as well as a file test.py that can be run with pytest -vv test.py and will see failures of the tests for nan and large offsets.

I wasn't sure what an acceptable range for offsets should be. In this example the max offset I saw was 600_000 pixels (120_000 arcseconds). I have a check right now at 10_000 pixels, but we should adjust it.

nan-example.tar.gz

esheldon commented 5 months ago

I'm aware that this can't be incorporated into the galsim test suite due to use of imsim, but it wasn't obvious to me how to reconstruct the photon_ops without using imsim code

rmjarvis commented 5 months ago

Thanks. I'll work on minimizing the example to make a unit test from it.

rmjarvis commented 5 months ago

So the first thing I'm noticing here is that you have RubinDiffractionOptics() in the photon_ops list. So that explains why you are getting large offsets. You should expect large offsets from the diffraction spikes. And I suspect that the NaNs are also due to that. If I remove that item from the photon_ops, the NaNs go away too. I'll try to track down where the NaNs first show up. But I think this is most likely an imsim bug, not a GalSim one. (I won't close this until I confirm though.)

rmjarvis commented 5 months ago

@jmeyers314 Josh, I tracked this down to a batoid function. Specifically, refract in trace.py. But it happens in the C layer, so I'm passing the baton to you now. One of the ray vectors has the following values before the _batoid.refract call:

x,y,z,vx,vy,vz,t,wave,flux,vig,fail = 1.4136048029448713 -1.26631532296121 0.5702413527436976 -0.1632631352399052 0.34430459097839733 0.5688448339442601 17.841282831343328 5.356478398425508e-07 0.3783200003476515 True False

After refract:

x,y,z,vx,vy,vz,t,wave,flux,vig,fail = 1.619778407831917 -1.701113531367937 -0.20811303793564317 nan nan nan 16.578452709992995 5.356478398425508e-07 0.3783200003476515 True False

More details:

surface =  Sphere(-13.36)
ct =  CoordTransform(CoordSys(array([0.        , 0.        , 4.34206865]), array([[1., 0., 0.], [0., 1., 0.], [0., 0., 1.]])), CoordSys(array([0.        , 0.        , 4.40206865]), array([[1., 0., 0.], [0., 1., 0.], [0., 0., 1.]])))
m1 =  SellmeierMedium((0.69618302, 0.407925877, 0.897464057, 0.00467926519, 0.0135122244, 97.9323636))
m2 =  ConstMedium(1.0)
rv =  RayVector(array([0.06317926, 0.05853737, 0.0911232 , ..., 0.0587624 , 0.05450346,
       0.09584311]), array([-0.03250948, -0.03895119, -0.0603161 , ..., -0.03323284,
       -0.03971611, -0.03526725]), array([0.00079655, 0.00078003, 0.00188413, ..., 0.00071907, 0.00071758,
       0.0016456 ]), array([ 0.05435319,  0.08560945, -0.12966972, ...,  0.08319322,
        0.1107581 , -0.15863831]), array([-0.10171135, -0.0591399 ,  0.08145143, ..., -0.09638325,
       -0.05337458, -0.08124123]), array([0.67436141, 0.67677561, 0.66551091, ..., 0.67180015, 0.66982595,
       0.66125685]), array([17.64212032, 17.64272894, 17.59542174, ..., 17.64550459,
       17.64643467, 17.60539304]), array([5.12374699e-07, 5.37952130e-07, 4.67085877e-07, ...,
       4.96827704e-07, 4.16832107e-07, 5.44127040e-07]), array([0.37832, 0.37832, 0.37832, ..., 0.37832, 0.37832, 0.37832]), array([False, False, False, ..., False, False, False]), array([False, False, False, ..., False, False, False]), CoordSys(array([0.        , 0.        , 4.34206865]), array([[1., 0., 0.], [0., 1., 0.], [0., 0., 1.]])))

So it looks like the diffraction sent one ray out past the edge of the focal plane. It has vv large x,y,z values and it's marked as vignetted. That should be fine, but something in here is turning vx,vy,vz into nans. Hopefully that will be enough for you to see what might be wrong.

jmeyers314 commented 5 months ago

Thanks, I'll take a look. Note though that if it's marked vignetted, it shouldn't end up in the final image (flux should be set to 0 inside ImSim somewhere).

rmjarvis commented 5 months ago

Probably so. I think Erin was doing an empirical centroid with them just took the mean of the x and y values (which had gotten turned into NaNs subsequent to this step). So Erin, probably if you exclude the flux=0 photons, the centroids will be finite. But they still aren't doing what you want, since the centroid of the diffraction spikes are going to be quite noisy.

jmeyers314 commented 5 months ago

Looks like this particular refraction is actually a case of total-internal-reflection, which I haven't properly addressed in batoid. In lieu of actually reversing the order of the trace, I'll add code to batoid to mark such cases as failed for now.

esheldon commented 5 months ago

All the im.photons.flux are set to nan in all cases

esheldon commented 5 months ago

Sorry, not all cases actually. A significant fraction.

esheldon commented 5 months ago

About 0.3% are not set to nan

esheldon commented 5 months ago

Does galsim set the flux to nan when the photons are off the stamp? It looks like I messed up the stamp bounds/center and the object location is off the stamp

rmjarvis commented 5 months ago

In the test script you sent, at least for the first object that fails the nan test, there is only one photon with NaNs (in x, y, dxdz, dydz, pupil_u, and pupil_v), and its flux is set to 0. There are other photons with 0 flux as well, so probably they are also vignetted, but didn't have the NaN problem. I don't know where the flux would be getting set to NaN. There certainly isn't any place where I would do that on purpose. But it's possible there is a calculation somewhere that would propagate NaN from one of these values into the flux.

esheldon commented 5 months ago

The atmospheric PSF run (all else the same) has a separate population of large offsets not seen for the gaussian PSF maxoffs

esheldon commented 5 months ago

(sorry the units are pixels not arcsec)

rmjarvis commented 5 months ago

Did you use diffraction spikes with the Gaussian PSF? If not, then this is expected.

esheldon commented 5 months ago

Yes, the only difference between these two is replacing the atmosphere with a gaussian

rmjarvis commented 5 months ago

Are these the maximum offset of all the photons? Or just of the non-vignetted ones? (I.e. with flux > 0) I could imagine that the atmosphere might tend to put a few photons outside the visible part of the aperture more often than a simple PSF like Gaussian would. These might then do something weird in the diffraction code, but get marked as flux=0.

esheldon commented 5 months ago

These big offsets are vignetted, they have offsets of a thousand meters.

rmjarvis commented 5 months ago

OK. Then I don't think it's something we need to worry about. I don't think it's a bug. Just some photons going through the Rubin optics end up reflecting or diffracting to a place that ends up completely off the focal plane. I think it's probably not surprising that this is more likely when you have a realistic atmosphere than when you have a Gaussian for that part.

esheldon commented 5 months ago

What's the physics that connects the PSF of the atmosphere to deflection angle of the optical system?

rmjarvis commented 5 months ago

The pupil plane u,v values are set in the atmospheric code. Those positions impact what happens with the diffraction and subsequent Rubin optics. I'm not sure where they get set (if they do get set at all) when the PSF is just a Gaussian. But it seems plausible to me at least that having realistic pupil plan positions would trigger the kinds of optics that lead to scattered light more often than whatever the Gaussian does with them.

Josh should weigh in on this, since he understands this code (much) better than I do.

jmeyers314 commented 5 months ago

I guess I'm a little surprised that the atmospheric deflections are so large. Though I can't remember the order of atmosphere, optics, diffraction spikes photon ops now. The diffraction in particular I think can throw photons on wildly different trajectories and if that happens before the optics then I can imagine running into situations where the photons end up with extreme angles. The 2nd kick might do that a bit too.

On Tue, Mar 26, 2024 at 12:24 PM Mike Jarvis @.***> wrote:

The pupil plane u,v values are set in the atmospheric code. Those positions impact what happens with the diffraction and subsequent Rubin optics. I'm not sure where they get set (if they do get set at all) when the PSF is just a Gaussian. But it seems plausible to me at least that having realistic pupil plan positions would trigger the kinds of optics that lead to scattered light more often than whatever the Gaussian does with them.

Josh should weigh in on this, since he understands this code (much) better than I do.

— Reply to this email directly, view it on GitHub https://github.com/GalSim-developers/GalSim/issues/1280#issuecomment-2021293996, or unsubscribe https://github.com/notifications/unsubscribe-auth/AA33HNO2Z6FORIA6ZF5HLBLY2HDP7AVCNFSM6AAAAABFETKG4WVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDAMRRGI4TGOJZGY . You are receiving this because you were mentioned.Message ID: @.***>

rmjarvis commented 5 months ago

Surprised enough that it might be a bug? Either here in the AtmosphericPSF code or in imsim.

GalSim-developers / GalSim

numerical issue for some realizations of atmosphere #1280