Open jpkneller opened 3 months ago
(edited first post, just adding the syntax highlight)
Thanks for bringing it up! I wanted to separate the Flavor changes from #308 and to have a discussion on separate PR, but discussing it in an issue is more productive.
My thoughts on the suggested solution:
I don't think that the problem with flavors equality is what we need to solve. Most probably the user wouldn't compare the flavors directly, but provide flavors as argument to access the desired flavor, like flux[Flavor.NU_E]
- so maybe all checks can be done there, keeping the Flavor schemes cleaner.
Conversion of NU_X
to be NU_MU+NU_TAU
etc can be done via conversion matrices, allowing to easily convert object of one flavor scheme to another.
As the flavor classes are used practically everywhere throughout the code, I suggest we summarize their desired behavior, before we discuss actual implementation.
Now we have several Flavor schemes. Whether they're Enums, or subranges or a single Enum, or general classes.
flavor
attribute, which lists the available flavors)I'm not sure if a every FlavorTransformation
needs to know it (trivial example - NoTransformation
class).
Also I don't think the SupernovaModel
should keep it's output flavor scheme - it can just write it in the calculated flux.
These Flavor schemes are to be accessed by user in two main cases:
for flavor in flux:
plot_flavor(flux.energy, flux[flavor], label=flavor.to_tex())
prob_em = prob_matrix[Flavor.NU_E, Flavor.NU_MU]
flux_nux = flux[Flavor.NU_X]
and
flux_2f = conversion_2f3f @ flux_3f #using an external conversion matrix
flux_oscillated = transformation_matrix @ flux_unoscillated
We don't have any problems with using the wrong flavor scheme as long as it is stored in the object containers (and not just a global for flavor in Flavor
).
❗ Requirement:
container class must provide an __iter__
method which returns its flavor list.
Here we want to avoid accessing the wrong flavor. So if user provides a wrong flavor we can either
❗ Requirement:
container class makes a check/conversion inside __getitem__
method
We need to check if the flavor schemes of the objects are compatible - similar as above.
❗ Requirement:
container class makes a check/conversion inside used method (like TransformationMatrix.__matmul__
method in this example)
I'm leaning towards making the conversion in limited circumstances when the flavor is being used as a key and the numerical value is immaterial. The circumstance where the conversion should be made is A) when there are equivalent keys in the different schemes (e.g. NU_E_BAR since it is common to all of them) and B) when the conversion is unambiguous e.g. the user passed in the flavor NU_MU from the ThreeFlavor scheme but the container uses TwoFlavor so the conversion is NU_MU -> NU_X. I haven't thought about whether this solution means a lot of new code but if it does, raise the exception.
Before I look at the proposed implementation in #324, I wanted to write up my thinking at a very high level.
The key feature of snewpy is providing access to lots of different supernova models (and flavor transformations) with a single consistent interface:
Now, whether any particular simulation distinguished between NU_MU and NU_TAU or lumped them both together as NU_X is also an implementation detail. Users should not have to worry about that; and anytime some user code works for a TwoFlavor scheme but not for a ThreeFlavor scheme (or vice versa), I would consider that to be a serious bug.
In contrast, whether or not a simulation (or flavour transformation) includes sterile neutrinos is an actual physics difference; it is not an implementation detail. Ideally, we should still try to find a solution that ensures that code written for Two/ThreeFlavor schemes also works for FourFlavor schemes (and vice versa); but if that is not possible in some edge cases (or only with significant tradeoffs), I would be open to having some API differences here.
I mostly agree about the final user perspective.
Except for this statement:
anytime some user code works for a TwoFlavor scheme but not for a ThreeFlavor scheme (or vice versa), I would consider that to be a serious bug
I think that's too extreme.
In some cases the user expects to have a TwoFlavor
case, and writes code specifically for that. I agree that this should not be dictated by the implementation of the model though. So I think the best solution is to allow user to be declarative - if you want to work in ThreeFlavor basis, you explicitly convert the flux to that basis, and don't care about any implementation details.
In my solution in https://github.com/SNEWS2/snewpy/pull/324 it is as simple as:
flux = model.get_flux(...)>>ThreeFlavor
#now we know flux has 3 flavors
Note that it is somewhat similar to using astropy.units
: when you get, say, an energy, no need to check its unit, just request to convert it:
energy = get_energy(...)<<u.GeV
#now we know energy is in GeV
Also there is another side: the implementation of flavor schemes should make our lives as developers easy - by making the code flexible and easier to extend.
So in summary I think:
model.get_flux
output the flux in ThreeFlavor
by default - but the scheme will depend on the applied FlavorTransformation
anyway, leading to steriles[Note: I’m deliberately focussing on TwoFlavor vs. ThreeFlavor here. If models/transformations involve sterile neutrinos, that is a clear physical difference and I think it’s reasonable to require the user to explicitly keep track of that. The conversion matrices look like a good approach in that case. My point here is about the most common case, without steriles, and the best user interface for that.]
The TwoFlavor case an approximation that’s useful to save computing resources when running SN simulations; but it’s not physically meaningful. And of course different papers/codes have different conventions on whether NU_X means (a) NU_MU or NU_TAU, (b) the sum of both, or (c) the sum of those two and their respective antineutrinos. That alone clearly indicates to me that we all think in three flavours by default (and then map our thinking onto two flavours only if forced to).
SNOwGLoBES already enforces a ThreeFlavor scheme, with separate interaction channels (and associated cross section, etc. files) for NU_MU and NU_TAU.
In some cases the user expects to have a
TwoFlavor
case, and writes code specifically for that.
Users may “expect” a TwoFlavor case because that was the convention in some particular model paper, sure. But in that same sense, they might “expect” a “OnePointFiveFlavor” scheme (NU_E, NU_E_BAR and NU_X only), too. Or they might expect NU_X to mean either (b) or (c), while we use (a). So we will almost invariably break some initial user expectation.
The best thing we can do to help users recover from breaking these initial expectations is to provide a consistent interface. And as explained above, the ThreeFlavor scheme
If the TwoFlavor case is all we have, that’s one thing; but once we have a ThreeFlavor case, I see no reason why users would explicitly want to go back to the TwoFlavor case; just like there was no one who ever requested to go back to the OnePointFiveFlavor case.
I think there are several good reasons not to support the TwoFlavor case:
The TwoFlavor case is a very common source of confusion or bugs:
FlavorTransformation.ipynb
, I get questions about why transformations look like they don’t conserve neutrino number. It’s exactly because of this weighting factor.If we can convert all existing models to ThreeFlavor during initialisation (see “Note on implementation” below for how to do that), then we would not have to handle the 2to3 (or 3to2) case in conversion matrices. The code to generate conversion matrices to/from sterile schemes could become much simpler
“There should be one-- and preferably only one --obvious way to do it.” — If we give users the choice to convert back to a TwoFlavor scheme, they will have some extra cognitive load of having to decide whether to use that. Without a clear answer of when to use what option, choice can be harmful.
I'm strongly opposed to change the code in the existing models to use three flavors scheme, as it is a lot of changes to the code that already works and is tested. Let's use conversion matrices instead.
I agree with your point about wanting to minimize changes to the existing code; but am drawing a different conclusion from that: Generally, a best practice when using external data is to clean it up immediately after ingesting it, so all code that needs to understand details of the data format is in one place, and all downstream code is isolated from those details.
My preference would therefore be to convert to ThreeFlavours in the Model.__init__
function, right as we read data from the file. We would only need to change the mapping of which flavour corresponds to which column in the input file; all the code to read from the file and calculate the desired quantities would be unaffected. (This is also where we already did the OnePointFiveFlavor to TwoFlavor conversion.)
By applying this as soon as we read from the file, we only need to change one location and are guaranteed to have consistent data throughout our code. In contrast, applying the conversion matrix would have to happen at a later stage (get_flux
, or maybe get_initial_spectra
); but then e.g. self.luminosity
(which is set in __init__
) would remain hardcoded to be TwoFlavor.
In general I agree that the TwoFlavor scheme is misleading, and should be discouraged, at least in the usage examples. However imposing a restriction means less freedom also for the developer side.
My preference would therefore be to convert to ThreeFlavours in the
Model.__init__
function
The inner data representation is different for each model (loader), so that means writing different conversion code in each of those. While this can be done, I think this is a waste of time and effort.
We have code that is working and tested, and we can operate on its result, to get the desired behavior: like make conversion in get_flux
method before returning the result.
It's the same for the final user, but we use standard conversion matrices on the same objects (flux.Containers) instead of implementing them for each model on whatever inner representation they have.
While most of them can and should be switched to ThreeFlavor, as in https://github.com/SNEWS2/snewpy/pull/308, in some cases it can be more natural to use TwoFlavor (I'm no expert in this though). I think having the freedom to define the flavor transformation in its easiest form and then just multiply by conversion matrix is much better than restrict the definition to ThreeFlavor case, and write large matrices by hand.
Right now we have the freedom to define any kind of neutrino basis, and conversion matrices, which might become useful to write implementation pretty close to the formulas in the papers, which means much easier implementation and cross-check.
Sorry for joining the discussion late, I've been out of the office the past couple of days.
"Generally, a best practice when using external data is to clean it up immediately after ingesting it, so all code that needs to understand details of the data format is in one place, and all downstream code is isolated from those details. My preference would therefore be to convert to ThreeFlavours in the Model.init function"
This is my preference also and I did that in #308 for the supernova models which inherit from PinchedModel. The init function for each model (e.g. Nakazato_2013) loads the data then the parent class (PinchedModel) checks whether turns the input data into three flavors. I split it this way to avoid duplicating the code to convert to three flavors. However I was lazy and didn't do the TwoFlavor>>ThreeFlavor conversion for either the Fornax models or the pre-supernova models so these require TwoFlavor >> ThreeFlavor conversion code in their get_initial_spectra functions. This can be easily fixed by copying the relevant block from PinchedModel. I would prefer not to add a method to SupernovaModel so that it remains an ABC.
I dislike TwoFlavor schemes for exactly the reasons indicated, you always have to check exactly what is meant by it. I don't have a strong preference for >> over << but I am reluctant to include the conversion from TwoFlavor to ThreeFlavor because, it is not unambiguous what this means: "In the face of ambiguity, refuse the temptation to guess." My preference is that if the user wants to do this conversion, they should specify it themselves. Even ThreeFlavor>>FourFlavor makes me shiver. The only conversion matrix we require for SNEWPY2 is FourFlavor>>ThreeFlavor which I buried inside neutrino.py and called Remove_Steriles.
We have code that is working and tested, and we can operate on its result, to get the desired behavior: like make conversion in
get_flux
method before returning the result.
Doing the conversion only in get_flux
is not sufficient; then users would still get the old format from get_initial_spectra
and get_transformed_spectra
and self.luminosity
. But those three don’t contain instances of the NeutrinoFlux
class, so we wouldn’t be able to use the conversion matrices as is.
I’m currently working on a demo showing how the changes to the __init__
s would look like. I’ll be on annual leave Fri and Mon; I hope I can get this ready before then (if I don’t get pulled aside by that medical imaging project I work on as well).
Even ThreeFlavor>>FourFlavor makes me shiver.
I don’t have an issue with this: Every ThreeFlavor flux implicitly has zero sterile neutrinos; so explicitly adding this zeroed-out flux component feels very straightforward and intuitive to me. In this case, there are no conflicting conventions causing confusion.
Yes, we need to add the TwoFlavor>>ThreeFlavor conversion in the init functions for Fornax and preSN models. As I said, I was lazy (the init functions for them are big and messy).
I think this is rather extreme.
This can be easily fixed by copying the relevant block from PinchedModel
IMO copying code is bad in general - it makes the software really hard to maintain and change. And classes are good - they allow to put the logic in a single place, and reuse it everywhere. This is exactly what the conversion matrices do.
Copying code N times might be an easy fix, but it makes N times more probable for mistakes (and N times more time to check everything each time you need to touch it).
Even ThreeFlavor>>FourFlavor makes me shiver
If the model didn't have sterile neutrinos from the start, it will provide a ThreeFlavor output, and then you'll have to convert it to ThreeFlavor>>FourFlavor if you add the steriles in the FlavorTransformation. This requires conversion (as trivial as it is) - because it moves our arrays to different flavor basis. Do you suggest to manually add missing rows and columns instead of using matrix with defined and tested behaviour?
Also if we add more sterile flavors - will we, again, do more conversions by hand?
Doing the conversion only in
get_flux
is not sufficient; then users would still get the old format fromget_initial_spectra
andget_transformed_spectra
andself.luminosity
. But those three don’t contain instances of theNeutrinoFlux
class, so we wouldn’t be able to use the conversion matrices as is.
self.luminosity
is not part of SupernovaModel
interface, so you shouldn't be using this in user code.
As for get_initial_spectra
and get_transformed_spectra
- maybe then it's a good case to switch to using flux.Container
s there as well? I didn't switch it there for backward compatibility, but now we're talking about v2.0
self.luminosity
is used in the demo notebooks for (almost?) all models; so if people shouldn’t be using that, we’ve done a very bad job communicating that … 😉
More importantly: #335 is the promised proof-of-concept PR to show how the __init__
s would change. It looks fairly clean to me; and thanks to the PinchedModel
base class, there is fairly little need for duplicating code.
self.luminosity
is used in the demo notebooks for (almost?) all models; so if people shouldn’t be using that, we’ve done a very bad job communicating that … 😉
It is defined only in the PinchedModel. Most of our models inherit from it, but not all. If it is part of the user interface, it should have been part of the SupernovaModel class.
This is a problem with our interface in general: there is no definition of the parts which are exposed to the user, and that makes our requirements of "backward compatibility" especially painful.
self.luminosity should not be part of the SupernovaModel class but I'm okay with it being part of PinchedModel.
"This is a problem with our interface in general: there is no definition of the parts which are exposed to the user, and that makes our requirements of "backward compatibility" especially painful."
There's not much we can do about this - each model has different data. We can make it easier to find out what data from the model is available to the user.
self.luminosity
is actually available in all three Fornax models as well; so it is available for all ccsn models. Given that, and our usage in all demo notebooks, I think it’s effectively official.
(But I agree—sentences like “I think it’s effectively official” indicate that we messed up and haven’t been as clear as we should’ve.)
@jpkneller
self.luminosity should not be part of the SupernovaModel class
Why?
There's not much we can do about this - each model has different data. We can make it easier to find out what data from the model is available to the user.
We can and should define the user interface only in the base class. That's what the abstract base classes are for: describe the stuff which must be implemented, and anything else is implementation detail and might change from model to model.
This way:
1) User knows exactly what to expect from the model, without having to check the implementation of individual models.
2) Developer who adds models knows exactly what functionality needs to be implemented.
When I added the presupernova models I had no idea about the luminosity
, because I am inheriting the SupernovaModel
base class, and consider that everything I need is there. Why should I look into impolementation of PinchedModel
or Fornax
for that?
@JostMigenda
self.luminosity
is actually available in all three Fornax models as well; so it is available for all ccsn models. Given that, and our usage in all demo notebooks, I think it’s effectively official.
It doesn't make sense. If it's part of the interface, it should be in the base class - and tests. If it's not in the base class, but you manually add it in the same way in all other classes - it's the implementation detail, which should not be used in the user code.
The problem is that the "official" interface is defined in your head, and not some documented place, so when I propose a solution, you reject it because it contradicts with something you have in your mind.
We are a distributed team, and in order to work effectively we should try to communicate and define the requirements. Otherwise it becomes a mess.
We need to define the interface in base class and tests. Examples do not count: I don't have to check all the examples when developing, it's what the tests are for.
I am OK with either adding luminosity to base class, or removing it from the examples. I am not OK with considering this a normal situation, it must be changed.
@Sheshuk Honestly, until I double-checked yesterday evening, I didn’t realize either that self.luminosity
was consistent across all models. So this is simply a case of snewpy having grown organically without us ever sitting down to discuss something; there was no deliberate obfuscation. (Though I understand that it’s frustrating either way.)
I agree with you that the interface should be obvious from the base class. I think having the luminosity available easily is useful; so I’d err on the side of adding it to the base class.
(Maybe it’s worth writing a default implementation based on get_initial_spectra
in the base class, and then allowing child classes to overwrite it for e.g. PinchedModel
, where it’s more efficient to get the luminosity directly from the file? But that’s really not necessary now and can wait until we need to add it for a future model that doesn’t have a more efficient way to get the luminosity.)
"self.luminosity should not be part of the SupernovaModel class" "Developer who adds models knows exactly what functionality needs to be implemented."
It is not required that a model have a data member called luminosity and there will be cases where a model does not e.g. if the spectra are given by an analytic formula. If I remember correctly, there was a version of AnalyticModel which worked this way. The only thing required of a model class is that it has a function to return an initial and a transformed neutrino spectrum given a time, energy, and flavor prescription. Requiring a model have a data member called luminosity eliminates regions of model space. Even the requirement of get_initialspectra is too much because we have models for Type Ia and Pair-Instability supernovae where the flavor transformation has already been applied and there is no initial spectra to transform. I have already-transformed models like this for core-collapse too. This makes them harder to use with the rest of SNEWPY and we have to hack the generate* functions to work with them.
self.luminosity is used in the notebooks which allow users to look at the data in the models - slightly manipulated if we have transformed it from 1.5 or two flavors to three flavors. Thus it appears the only reason to require self.luminosity be a data member is because SNEWPY has notebooks that plot it. In principle, each model provides different data - that is why SNEWPY is so useful because it doesn't force groups to provide specific data in a specific format. Luminosity is the only data that is common to most (all?) models so it's the only one that even possibly be specified in SupernovaModel - but, again, if we do this then we eliminate parts of the space of models. Which other data elements are provided by a model have greater variance. What do we do? Specify all the data elements for a model together in the specific model, or split one of them off? I understand specifying the data members in the specific model class makes it more difficult for a user to know what data is available but I am very disinclined to force all models into the same shape - and it may not even be possible.
I've been thinking about the problem Andrey identified with the new TwoFlavor, ThreeFlavor and FourFalvor classes we want to introduce in version 2.0. The issues is with comparing `flavors' across the cases with different numbers i.e. how do you compare NU_E_BAR from TwoFlavor with NU_E_BAR from ThreeFlavor or compare NU_X from TwoFlavor with NU_MU from FourFlavor, and return True in both cases even though the numerical values of the enumerations can be different. The solution I came up with is to overload the comparison operators for the *Flavor classes. To reduce the size of the code I re-assigned the numerical values and added a couple of extra short methods. Overloading the comparison operators allows a user to compare flavors from different schemes but it doesn't solve the problem that the numerical values are different. What do people think? Is this is a problem we need to fix? Anyway, the solution is below. Now this code returns True for all cases
Here's the relevant parts of neutrino.py