wtbarnes / fiasco

Python interface to the CHIANTI atomic database
http://fiasco.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

Speed up level populations calculation #26

Open wtbarnes opened 6 years ago

wtbarnes commented 6 years ago

The level populations calculation is extremely computationally expensive and scales poorly with the number of available transitions. This means that level populations for ions with a very large number of transitions and available energy levels (e.g. Fe IX, Fe XI) take a very long time to compute.

The level populations calculation is needed for essentially everything, so poor performance here impacts many other calculations. Maybe something like numba or cython could be used to speed up this calculation, or it could be parallelized using something like dask.
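
As a rough illustration of the numba route, here is a minimal sketch of a jitted kernel for one of the per-transition reductions in this calculation (`sum_rates_by_level` is a hypothetical helper, not part of fiasco):

```python
import numpy as np
from numba import njit

@njit(cache=True)
def sum_rates_by_level(upper_level, a_values, n_levels):
    # Hypothetical example: total radiative decay rate out of each level,
    # i.e. a groupby-style sum of A-values over the upper level of each
    # transition. A compiled loop avoids Python-level indexing overhead.
    total = np.zeros(n_levels)
    for i in range(upper_level.shape[0]):
        total[upper_level[i] - 1] += a_values[i]
    return total
```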

wtbarnes commented 5 years ago

Replace `linalg.solve` with SVD, as in the ionization equilibrium calculation. They are really the same equation.

An equivalent function is implemented in Dask, so this could potentially be parallelized as well.
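
For a homogeneous system C p = 0, SVD gives the solution directly: the populations are the right singular vector associated with the smallest singular value. A minimal numpy sketch (`populations_from_svd` is illustrative, not existing fiasco API; the Dask equivalent mentioned above would be a drop-in replacement for the SVD call):

```python
import numpy as np

def populations_from_svd(c_matrix):
    # Solve the homogeneous system C @ p = 0 for the level populations.
    # The (numerical) null space of C is spanned by the right singular
    # vector corresponding to the smallest singular value.
    _, _, vh = np.linalg.svd(c_matrix)
    pop = vh[-1]  # singular values are returned in descending order
    # Singular vectors are only determined up to a sign, so enforce
    # positivity and normalize the populations to sum to 1.
    pop = np.fabs(pop)
    return pop / pop.sum()
```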

kdere commented 5 years ago

Speeding things up would be very helpful. Right now I am trying to get Version 9 of the database and ChiantiPy out. There are significant changes in the way autoionization states are handled.

wtbarnes commented 5 years ago

After some profiling, it seems the main bottleneck here is computing the effective collision strengths from the scups data, specifically descaling all of the collision strengths.
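
For context, the descaling follows Burgess & Tully (1992), where each transition's scaled temperatures and collision strengths are mapped back to physical values. A minimal sketch for a single type 2 transition (names like `bt_x`, `bt_y`, `bt_c` are illustrative):

```python
import numpy as np
from scipy.interpolate import splev, splrep

def descale_type2(kT_dE, bt_x, bt_y, bt_c):
    # Burgess & Tully (1992) type 2 descaling: map the reduced temperature
    # kT/dE to the scaled variable x, then evaluate the spline fit of the
    # scaled collision strength. For type 2 transitions the spline value is
    # the effective collision strength directly.
    x = kT_dE / (kT_dE + bt_c)
    return splev(x, splrep(bt_x, bt_y))
```

Fitting and evaluating a spline like this per transition, in pure Python, is plausibly why this step dominates; batching the evaluations or caching the spline representations would be obvious places to start.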

dstansby commented 4 years ago

Just to keep a record, here's the profile I've just run on `level_populations`:


```
Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
   244                                               @needs_dataset('elvlc', 'wgfa', 'scups')
   245                                               @u.quantity_input
   246                                               @profile
   247                                               def level_populations(self,
   248                                                                     density: u.cm**(-3),
   249                                                                     include_protons=True) -> u.dimensionless_unscaled:
   250                                                   """
   251                                                   Compute energy level populations as a function of temperature and density
   252                                           
   253                                                   Parameters
   254                                                   ----------
   255                                                   density : `~astropy.units.Quantity`
   256                                                   include_protons : `bool`, optional
   257                                                       If True (default), include proton excitation and de-excitation rates
   258                                                   """
   259         1       2326.0   2326.0      0.0          level = self._elvlc['level']
   260         1       4734.0   4734.0      0.0          lower_level = self._scups['lower_level']
   261         1       3802.0   3802.0      0.0          upper_level = self._scups['upper_level']
   262         1        407.0    407.0      0.0          coeff_matrix = np.zeros(self.temperature.shape + (level.max(), level.max(),))/u.s
   263                                           
   264                                                   # Radiative decay out of current level
   265         3      10938.0   3646.0      0.0          coeff_matrix[:, level-1, level-1] -= vectorize_where_sum(
   266         2      18325.0   9162.5      0.0              self.transitions.upper_level, level, self.transitions.A.value) * self.transitions.A.unit
   267                                                   # Radiative decay into current level from upper levels
   268         2       8834.0   4417.0      0.0          coeff_matrix[:, self.transitions.lower_level-1, self.transitions.upper_level-1] += (
   269         1       5608.0   5608.0      0.0              self.transitions.A)
   270                                           
   271                                                   # Collisional--electrons
   272         1   13480735.0 13480735.0     17.2          ex_rate_e = self.electron_collision_excitation_rate()
   273         1   11718735.0 11718735.0     15.0          dex_rate_e = self.electron_collision_deexcitation_rate()
   274         3      17306.0   5768.7      0.0          ex_diagonal_e = vectorize_where_sum(
   275         2         15.0      7.5      0.0              lower_level, level, ex_rate_e.value.T, 0).T * ex_rate_e.unit
   276         3      30929.0  10309.7      0.0          dex_diagonal_e = vectorize_where_sum(
   277         2         14.0      7.0      0.0              upper_level, level, dex_rate_e.value.T, 0).T * dex_rate_e.unit
   278                                                   # Collisional--protons
   279         1       1401.0   1401.0      0.0          if include_protons and self._psplups is not None:
   280         1      14309.0  14309.0      0.0              lower_level_p = self._psplups['lower_level']
   281         1       6536.0   6536.0      0.0              upper_level_p = self._psplups['upper_level']
   282         1   52839740.0 52839740.0     67.5              pe_ratio = proton_electron_ratio(self.temperature, **self._dset_names)
   283         1        156.0    156.0      0.0              proton_density = np.outer(pe_ratio, density)[:, :, np.newaxis]
   284         1      19077.0  19077.0      0.0              ex_rate_p = self.proton_collision_excitation_rate()
   285         1      31654.0  31654.0      0.0              dex_rate_p = self.proton_collision_deexcitation_rate()
   286         3       5279.0   1759.7      0.0              ex_diagonal_p = vectorize_where_sum(
   287         2          7.0      3.5      0.0                  lower_level_p, level, ex_rate_p.value.T, 0).T * ex_rate_p.unit
   288         3       5977.0   1992.3      0.0              dex_diagonal_p = vectorize_where_sum(
   289         2          9.0      4.5      0.0                  upper_level_p, level, dex_rate_p.value.T, 0).T * dex_rate_p.unit
   290                                           
   291                                                   # Populate density dependent terms and solve matrix equation for each density value
   292         1         17.0     17.0      0.0          populations = np.zeros(self.temperature.shape + density.shape + (level.max(),))
   293         2         40.0     20.0      0.0          for i, d in enumerate(density):
   294         1        542.0    542.0      0.0              c_matrix = coeff_matrix.copy()
   295                                                       # Collisional excitation and de-excitation out of current state
   296         1        375.0    375.0      0.0              c_matrix[:, level-1, level-1] -= d*(ex_diagonal_e + dex_diagonal_e)
   297                                                       # De-excitation from upper states
   298         1       1169.0   1169.0      0.0              c_matrix[:, lower_level-1, upper_level-1] += d*dex_rate_e
   299                                                       # Excitation from lower states
   300         1       1392.0   1392.0      0.0              c_matrix[:, upper_level-1, lower_level-1] += d*ex_rate_e
   301                                                       # Same processes as above, but for protons
   302         1        966.0    966.0      0.0              if include_protons and self._psplups is not None:
   303         1         25.0     25.0      0.0                  d_p = proton_density[:, i, :]
   304         1        418.0    418.0      0.0                  c_matrix[:, level-1, level-1] -= d_p*(ex_diagonal_p + dex_diagonal_p)
   305         1        250.0    250.0      0.0                  c_matrix[:, lower_level_p-1, upper_level_p-1] += d_p * dex_rate_p
   306         1        524.0    524.0      0.0                  c_matrix[:, upper_level_p-1, lower_level_p-1] += d_p * ex_rate_p
   307                                                       # Invert matrix
   308         1      59286.0  59286.0      0.1              val, vec = np.linalg.eig(c_matrix.value)
   309                                                       # Eigenvectors with eigenvalues closest to zero are the solutions to the homogeneous
   310                                                       # system of linear equations
   311                                                       # NOTE: Sometimes eigenvalues may have complex component due to numerical stability.
   312                                                       # We will take only the real component as our rate matrix is purely real
   313         1         88.0     88.0      0.0              i_min = np.argmin(np.fabs(np.real(val)), axis=1)
   314         1        561.0    561.0      0.0              pop = np.take(np.real(vec), i_min, axis=2)[range(vec.shape[0]), :, range(vec.shape[0])]
   315                                                       # NOTE: The eigenvectors can only be determined up to a sign so we must enforce
   316                                                       # positivity
   317         1         18.0     18.0      0.0              np.fabs(pop, out=pop)
   318         1         40.0     40.0      0.0              np.divide(pop, pop.sum(axis=1)[:, np.newaxis], out=pop)
   319         1          7.0      7.0      0.0              populations[:, i, :] = pop
   320                                           
   321         1         59.0     59.0      0.0          return u.Quantity(populations)
```

wtbarnes commented 4 years ago

@dstansby which ion did you use when generating this profile?

wtbarnes commented 4 years ago

Currently, the matrix equation is solved using `np.linalg.eig`. We should look into using `np.linalg.eigh` instead: the matrix of terms is symmetric and Hermitian, so it should only have real eigenvalues. This should help with the performance concerns and simplify the code a bit.
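
If the matrix really is symmetric, the change is mechanical; a sketch of what the eigenvector extraction could look like (assuming `c_matrix` is real symmetric, which `eigh` requires):

```python
import numpy as np

def null_eigenvector(c_matrix):
    # np.linalg.eigh exploits symmetry and returns real eigenvalues, so
    # the np.real calls needed with np.linalg.eig go away. Eigenvalues are
    # sorted ascending by value, not magnitude, so we still search for the
    # one closest to zero.
    val, vec = np.linalg.eigh(c_matrix)
    i_min = np.argmin(np.fabs(val))
    pop = np.fabs(vec[:, i_min])  # eigenvectors fixed only up to a sign
    return pop / pop.sum()
```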

dstansby commented 1 year ago

Another route worth investigating is seeing how sparse the matrices are and, if they're quite sparse, using sparse matrix functions.
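
A quick way to check, and what a sparse version might look like with scipy (a sketch; `sparse_null_eigenvector` is illustrative and assumes the matrix is only numerically, not exactly, singular):

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import eigs

def sparse_null_eigenvector(c_matrix):
    # Measure how sparse the rate matrix actually is before committing
    # to a sparse representation.
    fill = np.count_nonzero(c_matrix) / c_matrix.size
    print(f"matrix fill fraction: {fill:.3%}")
    c_sparse = sparse.csr_matrix(c_matrix)
    # Shift-invert mode targets the eigenvalue closest to zero, whose
    # eigenvector gives the level populations. If the matrix were exactly
    # singular, a tiny nonzero sigma would be needed instead.
    val, vec = eigs(c_sparse, k=1, sigma=0)
    pop = np.fabs(np.real(vec[:, 0]))
    return pop / pop.sum()
```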