lsst / rubin_sim

Scheduler, survey strategy analysis, and other simulation tools for Rubin Observatory.
https://rubin-sim.lsst.io
GNU General Public License v3.0
41 stars 37 forks source link

Getting NaN's in solar magnitudes. #370

Closed rmjarvis closed 11 months ago

rmjarvis commented 1 year ago

In some cases, we are getting NaNs in sky_model spectra.

Here is a test script that reproduces the error, at least on USDF and other linux machines. (It seems to pass on Macs, at least in one case.)

import numpy as np
from rubin_sim import skybrightness
sky_model = skybrightness.SkyModel()

ra, dec = 60.49055482066166, -38.16437996317624
mjd = 60143.421569611106

sky_model.set_ra_dec_mjd(ra, dec, mjd, degrees=True)
wave, spec = sky_model.return_wave_spec()

print(spec[0])
assert not any(np.isnan(spec[0]))

I tracked down the issue to the calculation of solar_mag for B, G, and R bands. They come out as nan. I believe the root problem is this snippet in SED.resample_sed():

            if wavelen[0] > wavelen_grid[0] or wavelen[-1] < wavelen_grid[-1]:
                f = interpolate.interp1d(wavelen, flux, bounds_error=False, fill_value=numpy.NaN)
                flux_grid = f(wavelen_grid)
            else:
                flux_grid = numpy.interp(wavelen_grid, wavelen, flux)

Somehow wavelen_grid[-1] is coming out 1.e-10 larger than wavelen[-1], so the first branch is triggered. Then you instruct interp1d to fill the values outside the range with NaN (!!!). The result of this is then passed to np.trapz to compute the integral, which makes that NaN, and things propagate from there. Filling with NaN just seems like an error. I can't think why that would ever be the right thing to do here. You should presumably either fill with 0 or shrink the grid to match the input wavelen array or something like that. Putting NaNs into something you're planning to do real calculations with is asking for trouble.

cwwalter commented 1 year ago

Just as an informational point I personally do get the same error using weekly _39 on MacOS Intel. (And we all see it on USDF with latest weeklies).

yoachim commented 11 months ago

I've always been a little confused by the Sed resampling methods since they are doing interpolations where I would normally expect flux-conserving re-binning.

Anyway, I just put in a 1-line fix that should prevent using the problem if branch.

rhiannonlynne commented 11 months ago

If it's helpful in terms of background, there has been some discussion about this behavior in the past -- https://community.lsst.org/t/producing-lsst-mags-from-partial-sed/6918/2 but the short answer there was Nans were more obvious to people than numbers which just happened to drop to 0. However, I think the use of phot_utils functions in catalogs (which was a big part of the motivation toward the current behavior) is likely past, and we should shift from making things have ways to "try to be fast" and more "try to be more flexible".

rmjarvis commented 11 months ago

Given that your own code then calls trapz on the full result of this function, I don't think NaNs can ever be the correct thing to do here. For NaNs to be appropriate you are requiring downstream code to be NaN-aware and avoid using them.

cwwalter commented 11 months ago

I'm a bit confused, so was wondering: #375 was just merged. Does it really address this, or is our issue still to be fixed?

yoachim commented 11 months ago

375 should solve the issue of the sky model throwing NaNs (at least the example case works now). We'll probably do a deeper fix in phot_utils to not throw NaNs in general later.

cwwalter commented 11 months ago

Thanks. Can you make a new minor release so that this gets into the weeklies?

rhiannonlynne commented 11 months ago

Yes, I'll do that now.

rhiannonlynne commented 11 months ago

OPSIM-1087 for future work.

rhiannonlynne commented 11 months ago

But also -- https://github.com/lsst/rubin_sim/releases/tag/1.3.3 I'll mark this issue closed.