SNEWS2 / snewpy

A Python package for working with supernova neutrinos
https://snewpy.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Sheshuk/refactor fluence #152

Closed Sheshuk closed 2 years ago

Sheshuk commented 2 years ago

Making the generate_fluence and generate_time_series functions work faster:

Sheshuk commented 2 years ago

Performance comparison

Created using

%%timeit
generate_fluence('models/Bollig_2016/s11.2c','Bollig_2016','NoTransformation',d=10)

and

%%timeit
generate_time_series('models/Bollig_2016/s11.2c','Bollig_2016','NoTransformation',d=10)
| branch | fluence | time_series |
| --- | --- | --- |
| main (4630b45892952ff478e6e479a6444f2894862220) | 19.5 s ± 108 ms | 5.67 s ± 24.9 ms |
| Sheshuk/refactor_fluence (https://github.com/SNEWS2/snewpy/pull/152/commits/d9eebf637bf52bee89e32f5fd137c70e893bfa3b) | 1.54 s ± 19 ms | 906 ms ± 7.7 ms |
Sheshuk commented 2 years ago

Results comparison

I compared the files generated at the previous step, fluence_new.txt and fluence_main.txt,

with this script:

import numpy as np
import matplotlib.pyplot as plt

f0 = np.loadtxt('fluence_main.txt')
f1 = np.loadtxt('fluence_new.txt')
# compute the relative difference
df = (f0 - f1) / f0
# skip the first row (E ~ 0) and the first column (energy)
df = df[1:, 1:]
E = f0[1:, 0] * 1e3
# plot the relative difference vs energy
plt.plot(E, df)
plt.xlabel('E, MeV')
plt.ylabel(r'$(\Phi_1-\Phi_0)/\Phi_0$')
plt.show()

(Image: relative difference vs. energy; lines correspond to different flavors.) The relative difference is below 3e-4. It can be explained by two facts:

Overall I think that now the result should be slightly more accurate.

Sheshuk commented 2 years ago

@evanoconnor @jpkneller can one of you, as the people who authored the initial to_snowglobes routines, please cross-check my refactoring? This branch is based on #138, but it is a separate change, so I decided to keep it in a separate PR.

JostMigenda commented 2 years ago

I’ve started to look at this PR today, after the request at today’s telecon. A couple of comments:

Sheshuk commented 2 years ago

Thanks, Jost!

The Flux class appears to be designed to work with an arbitrary number of dimensions; but I’m not sure when we would use any dimensions other than flavour × energy × time? If we don’t actually use this, I wonder whether less general code might be easier to understand?

It's done for the cases where we integrate over energy or time, or select a specific flavour: that way we still have the same Flux object. Having separate objects for flavor×time, flavor×energy, time×energy and flavor×time×energy would mean a lot of repeated code.

I agree that general code is not very transparent, but to me it seems a lesser evil (as long as the user interface is clear) than having a zoo of classes doing the same thing. What do you think?

Also I will add unit tests to make sure all the functions work as intended.

I don’t think I feel comfortable reviewing the whole refactoring; Jim or Evan are probably better able to do that. However, I can see that there are a couple of elements to this PR that are independent of the main refactoring (e.g. binning in the .glb file, letting t be a vector, moving some repeated code into functions); maybe it would help to split those out into separate PRs? I’d be happy to review those quickly, so others have an easier time reviewing this PR!

I totally agree, and will try to split this PR.

JostMigenda commented 2 years ago

Completely agree that having a zoo of classes would be terrible!

On the other hand, having a single class that could have an arbitrary number of axes in an arbitrary order (or with an arbitrary string as name) has a few downsides:

Think about the problem very naïvely: What if a Flux were simply a 2D array (hard-coded to be time × energy)? Integrating over one dimension would still return a 2D array, which simply has length 1 along that axis; and a collection of Fluxes for different flavours would be a dict[Flavor, Flux]. We’d lose the extensibility to add extra axes for other quantities (though it sounds like we don’t expect to use others beyond flavor, time and energy, anyway), and may need a few extra square brackets to access elements of a Flux if we don’t squeeze() it down to 1D after integration; but I think we’d gain a lot of clarity (and could delete some of the helper functions that are needed to make the code very general). Unless there are any implementation issues I’m missing right now, I think that would be a fairly good trade-off to make. What do you think?

Sheshuk commented 2 years ago

Think about the problem very naïvely: What if a Flux were simply a 2D array (hard-coded to be time × energy)? Integrating over one dimension would still return a 2D array, which simply has length 1 along that axis; and a collection of Fluxes for different flavours would be a dict[Flavor, Flux].

This would have two additional downsides:

  1. You'll need to pass the values for the dimensions along with this 2D array; they're needed almost everywhere you use the flux. In my solution they are packed inside the Flux object.
  2. Using dict[Flavor, Flux] requires looping over the flavors again and again, while with flavor as an array dimension, like Flavor × time × Energy, it's easy and natural to apply the flavor transformations as a matrix operation. This would improve the
  • performance: vector operations instead of python looping,
  • flexibility: several flavor transformations can be applied sequentially.
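A minimal numpy sketch of the matrix-operation idea (the shapes, names and the uniform transformation matrix are hypothetical, not SNEWPY's actual API): with flavor as a leading axis, a single einsum transforms every (time, energy) bin at once.

```python
import numpy as np

n_flavors, n_time, n_energy = 4, 100, 50
rng = np.random.default_rng(0)

# hypothetical initial flux with axes (flavor, time, energy)
flux = rng.random((n_flavors, n_time, n_energy))

# hypothetical transformation matrix P[i, j] = P(nu_j -> nu_i);
# columns sum to 1, so the total flux is conserved
P = np.full((n_flavors, n_flavors), 1.0 / n_flavors)

# one einsum applies the transformation to all bins at once,
# instead of looping over flavors in Python
transformed = np.einsum('ij,jte->ite', P, flux)
```

Applying a second transformation is just another einsum on the same array, which is what makes sequential transformations natural in this layout.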

If you have specific concerns about clarity, please point me to the unclear places, so we can have a less general discussion.

If your concern is about how to test this general code, I plan to use the hypothesis package, which does a great job of generating test suites given a general rule.
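For illustration, a tiny property-based test with hypothesis (the property itself is a hypothetical stand-in for an invariant a general Flux class might obey, e.g. that the total is independent of which axis you reduce first):

```python
import numpy as np
from hypothesis import given
from hypothesis import strategies as st

# hypothesis generates many input lists; the test asserts the invariant for each
@given(st.lists(st.floats(min_value=0.0, max_value=1e3),
                min_size=1, max_size=20))
def test_total_is_axis_order_independent(values):
    arr = np.array(values).reshape(1, -1)
    # reducing over axis 0 then summing must match reducing over axis 1 then summing
    assert np.isclose(arr.sum(axis=0).sum(), arr.sum(axis=1).sum())

test_total_is_axis_order_independent()
```

Calling the decorated function runs it against a whole generated suite of inputs and raises on the first counterexample.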

JostMigenda commented 2 years ago

Think about the problem very naïvely: What if a Flux were simply a 2D array (hard-coded to be time × energy)? Integrating over one dimension would still return a 2D array, which simply has length 1 along that axis; and a collection of Fluxes for different flavours would be a dict[Flavor, Flux].

This would have two additional downsides:

  1. You'll need to pass the values for the dimensions along with this 2D array - they're needed almost everywhere you use the flux. And in my solution they are packed inside the Flux object

Sorry, I should have worded that better: Flux would still be a class; but self.array would be guaranteed to be a 2D array. So we’d still get the same encapsulation; we’d just be able to cut most of the code that’s necessary to support an arbitrary number of dimensions.

  2. Using dict[Flavor, Flux] requires looping over the flavors again and again. While having flavor as an array dimension, like Flavor * time * Energy, it's easy and natural to apply the flavor transformations as a matrix operation. This would improve the
  • performance: vector operations instead of python looping,

It is a loop over 4 flavours while we’d still benefit from vector operations on the (potentially) thousands of entries in the array. So we should still get almost all the performance benefit. If merging the flavours into a single array were responsible for most of the performance improvement achieved in this PR then yes, I’d bite the bullet and accept it. But if we can get almost all the benefit without it, then I would prefer to sacrifice a little bit of performance for much clearer code.

  • flexibility: several flavor transformations can be applied sequentially

Wouldn’t it be more efficient to combine all transformations first and then apply the resulting matrix to the data, rather than apply each transformation to the large array individually?
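A quick numpy sketch of that point (hypothetical shapes; this assumes the transformations are plain flavor × flavor matrices, which is exactly the assumption questioned later in the thread): composing the small matrices first touches the large array only once and gives the same result.

```python
import numpy as np

rng = np.random.default_rng(1)
flux = rng.random((4, 1000, 200))   # hypothetical (flavor, time, energy) array
P1 = rng.random((4, 4))             # first transformation matrix
P2 = rng.random((4, 4))             # second transformation matrix

# apply each transformation to the large array individually
step_by_step = np.einsum('ij,jte->ite', P2, np.einsum('ij,jte->ite', P1, flux))

# combine the small 4x4 matrices first, then touch the large array once
combined = np.einsum('ij,jte->ite', P2 @ P1, flux)
```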

Agree that a single line would be nicer; but having one line per flavour is still fairly short and very easy to understand.

If you have specific concerns about clarity, please point me to the unclear places, so we can have a less general discussion.

I have two concrete concerns:

  1. A flux, a fluence and a collection of fluxes in different flavours are physically very different things. Trying to fit them all into one class shifts the burden onto developers to keep in mind which of these things any given object actually is and what methods in the class can actually be applied to it.
  2. Supporting an arbitrary number of dimensions, when we plan to use only time and energy (and perhaps flavour—see 1.), makes the code more complex than it needs to be.

So, in practice I would suggest the following changes:

  1. Create a separate FluxCollection class that combines multiple Flux objects corresponding to different neutrino flavours.
    • Under the hood, this would be a dict[Flavor, Flux] (or perhaps some other data type that has better performance characteristics while keeping the conceptual separation between an individual flux and a collection of fluxes)
    • Move the to_snowglobes() and _save_to_snowglobes() functions from Flux to this new FluxCollection, since they are only applicable to objects that contain fluxes for all flavours
    • Optionally, move the few lines of code quoted above from base.py to a __matmul__() function in this class, so higher-level code can apply a FlavorTransformation in a single line as you suggest above.
  2. The array underlying the Flux class should be fixed to 2D (time × energy)
    • Treating a fluence as a flux with exactly one time bin is reasonably obvious on the physics side; so I think the mental overhead mentioned in issue 1 is negligible in this case.
    • Instead of having arbitrary strings as axis labels (and being unsure which of them actually exist in a given Flux object), we could have a simple two-element IntEnum. In addition to the usual advantages of enums over strings for this use case, that would also remove the need for the three get_axis* helper functions.
    • This would also remove the need for the Flux.squeeze() function.
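A rough sketch of what suggestions 1 and 2 could look like together (all class names, the two-element axis enum, and the dict-shaped transformation are hypothetical illustrations, not a proposed final API):

```python
from enum import IntEnum
import numpy as np

class Axis(IntEnum):       # the suggested two-element IntEnum of axis labels
    TIME = 0
    ENERGY = 1

class Flux:
    """Hypothetical sketch: the underlying array is fixed to 2D (time x energy)."""
    def __init__(self, array):
        self.array = np.asarray(array)
        assert self.array.ndim == 2

    def integrate_time(self):
        # a fluence is just a flux with exactly one time bin
        return Flux(self.array.sum(axis=Axis.TIME, keepdims=True))

class FluxCollection(dict):
    """Hypothetical dict[Flavor, Flux] with __matmul__ for transformations."""
    def __matmul__(self, transformation):
        # transformation maps out-flavor -> {in-flavor: weight}
        return FluxCollection({
            out: Flux(sum(w * self[inp].array for inp, w in weights.items()))
            for out, weights in transformation.items()
        })

# toy usage: a transformation that swaps the two flavors
fluxes = FluxCollection({'nu_e': Flux(np.zeros((3, 5))),
                         'nu_x': Flux(np.ones((3, 5)))})
swap = {'nu_e': {'nu_x': 1.0}, 'nu_x': {'nu_e': 1.0}}
transformed = fluxes @ swap
```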

If your concern is about how to test this general code, I plan to use the hypothesis package, which does a great job of generating test suites given a general rule.

Oh, that looks very cool!

JostMigenda commented 2 years ago

Before we get too bogged down in details of the Flux class, I should add that the bulk of this PR looks really good to me!

Removing duplicate code from the get_time_series() and get_fluence() functions is really nice; and moving much of that (particularly the distance calculation) into a Flux class is conceptually cleaner. Letting t in get_initial_spectra() be a vector as well is both more consistent and a huge performance benefit, so very nice as well!

(It’s easy in these code discussions to focus on the bits we disagree with; so we should occasionally take time to highlight all the things we agree with and think are great! 🤝)

Sheshuk commented 2 years ago

I'm sorry to continue this discussion, but I can't agree with you on this.

Wouldn’t it be more efficient to combine all transformations first, and then apply the resulting matrix to the data? Rather than apply each transformations to the large array individually.

Nope: as you can see, our flavor transformations are (in the most general case) functions of energy and time. So it still means working with the same huge (flavor1 × flavor2 × energy × time) arrays.
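In numpy terms (hypothetical shapes): when the transition probabilities depend on time and energy, the transformation is itself a flavor × flavor × time × energy array, so pre-combining transformations gives no small-matrix shortcut. A single einsum still applies it to all bins at once:

```python
import numpy as np

rng = np.random.default_rng(2)
flux = rng.random((4, 50, 30))    # hypothetical (flavor, time, energy) array
# time- and energy-dependent transformation: P[i, j, t, e]
P = rng.random((4, 4, 50, 30))

# contract over the incoming-flavor index for every (t, e) bin simultaneously
transformed = np.einsum('ijte,jte->ite', P, flux)
```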

Agree that a single line would be nicer; but having one line per flavour is still fairly short and very easy to understand.

As physicists we use matrix multiplication to describe flavor transformation. I don't understand how treating the flavors separately would be simpler for understanding.

  1. A flux, a fluence and a collection of fluxes in different flavours are physically very different things

Okay, maybe "Flux" is not the best name for this class. It's an N-dimensional data table with definite axes. The physical meaning is kept in self.array.unit, so it's easy to determine how it should be treated (integrated, multiplied, etc.) to get physical values.

  2. Supporting an arbitrary number of dimensions, when we plan to use only time and energy (and perhaps flavour—see 1.), makes the code more complex than it needs to be.

I thought about additional dimensions: some models have angular dependence, progenitor mass and other additional parameters. In my solution all of these parameters can be extra dimensions, so we can do fast vectorized calculations instead of several Python loops if one wants to scan over these parameters. Also, the flavor transformations will work on an arbitrary array shape, because we can easily determine the "Flavor" axis, and all the remaining dimensions are treated automatically.
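numpy's ellipsis notation in einsum illustrates how this can work (hypothetical shapes; the extra progenitor-mass axis is just an example): the transformation names only the flavor axis, and any remaining dimensions are broadcast automatically.

```python
import numpy as np

rng = np.random.default_rng(3)
# hypothetical flux with an extra axis: (flavor, progenitor_mass, time, energy)
flux = rng.random((4, 7, 50, 30))
P = np.eye(4)   # trivial (identity) flavor transformation, for illustration

# '...' stands for all axes other than flavor, however many there are
transformed = np.einsum('ij,j...->i...', P, flux)
```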

I agree that the class interface can be improved (I'll get rid of 2 of the 3 get_axis_* methods). And I agree that it's important to keep things simple, but you're talking about dict[Flavor, array] vs. array simplicity, and I can't agree with that. If we stick to using Python loops, we'll eventually have to refactor everything again to optimize loops over other parameters, as I did here for the time axis.

jpkneller commented 2 years ago

Sorry I'm late to the discussion - I had a couple of important deadlines this week.

In hindsight, breaking the equations up by neutrino flavor was a mistake and it would have been better to keep the flavor as an array dimension. As the flavor transformations become more complex (3-flavour, time- and energy-dependent numerical calculations), I think we will find that breaking it up by flavor really hurts the performance, because oftentimes all the transition probabilities can be computed simultaneously. The solution we are considering is to add a get_probabilities method to the flavor_transformation classes that returns a matrix converting the array of initial flavor spectra into the array of flavor spectra at the detector. The 1/(4 pi d^2) is not included. We'll keep prob_ee etc. for backwards compatibility for now, but all those will do is pick out elements from the matrix.
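A sketch of what that interface could look like (this is only my reading of the proposal; the class name, signatures, and the flavor ordering with nu_e at index 0 are assumptions, not SNEWPY's actual API):

```python
import numpy as np

class NoTransformation:
    """Hypothetical sketch of the proposed get_probabilities interface."""

    def get_probabilities(self, t, E):
        # returns P[i, j, t, e]: probability of in-flavor j arriving as out-flavor i;
        # the 1/(4 pi d^2) factor is deliberately not included here
        n = 4
        P = np.zeros((n, n, len(t), len(E)))
        idx = np.arange(n)
        P[idx, idx, :, :] = 1.0   # identity: no flavor change
        return P

    def prob_ee(self, t, E):
        # backwards-compatible accessor that just picks an element of the matrix
        # (assumes nu_e is flavor index 0)
        return self.get_probabilities(t, E)[0, 0]

t = np.linspace(0, 1, 5)
E = np.linspace(1, 50, 10)
tr = NoTransformation()
```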

But I don't see a need to go beyond the three dimensions of flavor x energy x time. It is true a user may be looping over models but I don't like having a model number as an array dimension because it's meaningless. Finally, I'm wary of making the SNEWPY learning curve too steep if we make things too general. My take has always been that Python is useful because it's easy to do simple things quickly, not because it's efficient. If I want efficient code then I write it in C.

JostMigenda commented 2 years ago

If we stick to using python loops, we'll eventually have to refactor everything again, to optimize loops over other parameters - as I did here for the time.

The problem with this argument is that time is fundamentally different from other parameters: There are potentially thousands of time bins in a given model (12,380 for the Bollig_2016 s11.2 progenitor), but there are only ever four flavours (six, if we distinguish mu and tau at some point in the future).

The performance comparison in your post above is impressive (6–13× improvement). However, I did some more detailed benchmarking on generate_time_series('models/Bollig_2016/s11.2c', 'Bollig_2016', 'NoTransformation', d=10) and here’s how the remaining run time (after your changes) is spent:

| step | time |
| --- | --- |
| setup | 1.9 s |
| get_transformed_flux | 0.005 s |
| write tarball | 0.3 s |
| **Total** | **2.2 s** |

Almost 90% of the remaining running time is spent on setup (reading model data from the files, etc.); roughly 10% is spent writing the tarball; and the actual flux calculations (including the non-vectorised calculation of flavour components in base.py which you cite above) make up less than 0.3% of the total. Of course, the exact numbers may vary for different models, etc.; but even if it’s closer to 1%, it looks like we will not gain any noticeable benefit whatsoever from vectorising over the flavours (or from any other optimisation of the flux calculations for additional parameters).
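For reference, a generic way to produce this kind of breakdown with the standard library (the two functions here are placeholders, not SNEWPY code):

```python
import cProfile
import io
import pstats

def setup():
    # placeholder standing in for reading model data from files
    return sum(range(10**6))

def flux_calculation():
    # placeholder standing in for the actual flux calculation
    return sum(range(10**3))

# profile both steps and print per-function cumulative times
pr = cProfile.Profile()
pr.enable()
setup()
flux_calculation()
pr.disable()

out = io.StringIO()
pstats.Stats(pr, stream=out).sort_stats('cumulative').print_stats()
report = out.getvalue()
```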

Sheshuk commented 2 years ago

Sorry, I decided to step away from this task for a while to give it some more thought and stay constructive. However, I don't see a solution.

@JostMigenda you address only one of my arguments while completely ignoring the rest.

  1. The benchmark you made is very nice, but it is quite synthetic: it includes no flavor transformation. In this simple case we really get very little benefit from vectorizing; but, as I said, with more complicated flavour transformations we get multiplications of many ndarrays and thousands of prob_ee calls, and I don't see why on Earth this shouldn't be done via matrix multiplication. I also don't understand why you compare against the setup and output time. These are separate routines, and the workflow can change (e.g. if the user needs to do some calculations with the transformed flux, there is no need to write it out and read it back again). This benchmark basically shows that our setup and output routines could be better.

  2. I don't see any benefit at all in sticking to dicts instead of building a special class. Maybe using the built-in class gives some psychological comfort, but it smears the semantics of the code: a dict is a much more general-purpose object than any Flux (or whatever name is more appropriate) class with an exact physical meaning and operations. Again, if you doubt the correctness of my implementation, that's what the tests are for.

  3. Talking about complexity for the user: in the current setup, the details of how the flux is represented are hidden from users, as long as they stick to our generate_fluence methods etc. My refactoring just makes the internal implementation more decoupled, while we can have any ridiculously simple user interface we want. So I believe my changes don't make the SNEWPY learning curve steeper.

  4. I am grateful for the thorough feedback. But I don't think I can degrade the solution to fixed dimensions and dicts, as you suggest. Since this seems to be a real blocking point, I suggest closing this PR (or passing it to someone who can do things better).

JostMigenda commented 2 years ago

Thank you for the update, @Sheshuk. I’m sorry that this discussion has not made much progress and I understand your frustration. Perhaps it is better to pause this for now.

As noted earlier, there were significant changes in this PR that we all agree are good and useful; so we should at least include those. If you would prefer not to work on this for a while, I can put that on my to-do list. (Advance warning: I won’t be able to do much work on SNEWPY for the next several weeks, so probably wouldn’t get to this before March/April. If anyone else wants to look into this in the meantime, please go ahead!)


Regarding the Flux class itself, I would suggest tabling that for now and revisiting it once those more complex flavour transformations Jim mentioned are implemented and we can benchmark things and see where optimizations are needed. If the timeline works out, we could schedule some time at the hackathon this summer for this—perhaps discussing it in person (or at least on Zoom) is a more promising way to resolve this than in writing?