Proof of concept: Initialise all models as ThreeFlavor

JostMigenda commented 6 months ago

For demonstration purposes, following the discussion in #309.

The changes are fairly small and well contained.

For most models (presn and ccsn), we already had code to map the columns in the input files (usually NU_E, NU_E_BAR and NU_X, i.e. a “OnePointFiveFlavor scheme”) to the TwoFlavor scheme; I’ve simply updated these existing mappings. With this, all tests run by pytest -m 'not snowglobes' succeed.

Of course, the FlavorTransformations are currently hardcoded to the TwoFlavor scheme; so any code that uses these transformations (mainly via SupernovaModel.get_transformed_flux) cannot currently switch to ThreeFlavor scheme. This is already changing in #308; so for the purposes of this proof of concept, I’ve marked those as xfails.

jpkneller commented 6 months ago

Jost, some models distinguish NU_X and NU_X_BAR.

JostMigenda commented 6 months ago

@jpkneller: Good catch, thanks! Fixed now.

@Sheshuk:

I wouldn't call this compact:

Already for CCSN models you have to make changes in:

PinchedModel

GarchingArchiveModel

Kuroda_2020

Fornax_2019 (several places)

Fornax_2021

Fornax_2022

Yes, though that is a fundamental fact of SNEWPY’s current implementation: It is not possible to make such a change solely in the SupernovaModel abstract base class, because __init__ and get_initial_spectra both need to be overridden in child classes. (Changing get_flux solely in the base class is possible; but as discussed in #309, that would leave a situation where a user would receive a mix of ThreeFlavor and TwoFlavor fluxes from a single model; so to avoid those, we’d again need to modify the __init__ and/or get_initial_spectra functions of all these child classes.)

Of course it's mostly repeating code, with slight variations. And of course it's there because the Fornax_2019 code itself is quite large. But with these changes it becomes even larger.

The reason for the repetition in my Fornax_2019 changes is that the Fornax_2019 code itself already was repetitive there; and I wanted to keep the changes minimal and obvious for this proof of concept. In the new commit, I’ve made a slightly larger change so that the self._flavorkeys dict is only defined once; after these changes, Fornax_2019.__init__ is actually four lines shorter than before—and the total PR is five lines longer (of which four lines are xfails for the tests).

Regarding testing I agree that testing is important to ensure we can trust any changes we make. Right now, we already have tests that verify that our models can be initialised and that get_initial_spectra returns something in the expected format; those tests all succeed for this branch. ✅ We also have integration tests, which check the calculated event rates for a few scenarios. Unfortunately, these include calls to get_transformed_spectra; so they cannot run on our two draft PRs (#324 and this one). ⚠️

In principle, I think this combination of tests would normally be sufficient; in practice, while the integration tests don’t run, some replacement would be highly desirable. I’ve therefore spent the evening extending the existing initialisation tests to

initialise each model/progenitor with the snewpy v1.5 release
get_initial_spectra (i.e. TwoFlavor) and write those to file
initialise each model/progenitor with the branch from this PR here
run get_initial_spectra (with ThreeFlavor)
read the TwoFlavor file and compare numerical values of TwoFlavor.NUX with ThreeFlavor.NU{MU,TAU}, and of TwoFlavor.NU_XBAR with ThreeFlavor.NU{MU,TAU}_BAR

The result: In all cases the numerical values are identical. ✅

The test code is a bit hacky, unfortunately; but if you want, I can write it up and share it on Tuesday. (I’m on leave Fri/Mon.) But to be clear: This is intended as a one-off test. My suggestion is that we transform all models into ThreeFlavor at initialisation and don’t use TwoFlavor at all, so keeping a test for converting back to TwoFlavor would not make sense—just like we haven’t had any tests that convert models from TwoFlavor back to “OnePointFiveFlavor” to test that we’re adding the NU_X_BAR component correctly.

In summary

I agree that this is doable and not too hard. But what is the benefit of this approach? What do we gain? Does it make the code cleaner, easier to maintain or to test?

In my opinion, yes. Cleaner, because it would allow us to completely delete most TwoFlavor code, including special cases for handling NU_X(_BAR). (Maybe even all TwoFlavor code; but as you wrote earlier, maybe for some FlavorTransformations it’s easier to write them down in TwoFlavor and convert to ThreeFlavor with a matrix internally.) Easier to maintain, because we can be certain that after model initialisation everything is (at least) ThreeFlavor. If we leave the models TwoFlavor after initialisation and only transform to ThreeFlavor at a later point, we’d always have to think about whether a model might still be TwoFlavor in the part of the code we’re currently editing, and whether we should use NU_X or NU_MU/TAU … Easier to test, because—see above—we could delete a bunch of TwoFlavor-specific code; so that would simplify our testing, too.

SNEWS2 / snewpy

Proof of concept: Initialise all models as ThreeFlavor #335

I wouldn't call this compact:

In summary