cosimoNigro commented 2 years ago

I think that some people are using our tutorial notebook for fitting with the Gammapy wrapper. https://agnpy.readthedocs.io/en/latest/tutorials/ssc_gammapy_fit.html

I don't think it is particularly pretty, especially because of the large cell of code defining the wrapper. There was some discussion in one of the developer calls of Gammapy if we were to include the wrapper in agnpy, or in Gammapy, as currently done for the NaimaSpectralModel.

I think it is ok to have the wrapper within agnpy, we will be introducing Gammapy as an additional dependency but I think it's also fine. Now we will have a centralised wrapper, and hopefully adding a line of code we can solve the issue encountered by @guillaumegrolleron and @jepcor97 when fitting and simulating with the latest Gammapy versions (see the comments in the wrappers).

I thought about having a single wrapper per each radiative processes but this implies:

re-writing the same code several times (one per each radiative processes);
users having to link all the different parameters between the two (or more) radiative processes they are considering and then define yet another Gammapy.SpectralModel summing them.

So I implemented a wrapper per each scenario / source type, I have written:

one with synchrotron + SSC for BL Lacs;
one with synchrotron + SSC + external Compton (on BLR and DT) for FSRQs (the EC on BLR computation can be switched off as it is time consuming).

Parameters are read from an instance of the emission region and target.

Many things are missing, still I wanted to open the review to start to discuss the design and the organisation of the wrapper. @jsitarek what do you think? Let me know what you think about the strategy, the naming scheme, the position of the new submodule.

I will also be happy to have the opinions of @QRemy and @atreyeeS, from the Gammapy side - if they have time.

Follows a script to cross-check the normal computation against the new wrappers implemented (we can incorporate this in the tests).

import numpy as np
import astropy.units as u
from astropy.coordinates import Distance
from agnpy.emission_regions import Blob
from agnpy.targets import SphericalShellBLR, RingDustTorus
from agnpy.synchrotron import Synchrotron
from agnpy.compton import SynchrotronSelfCompton, ExternalCompton
from agnpy.wrappers import SpectralModelSSC, SpectralModelEC
from agnpy.utils.plot import sed_x_label, sed_y_label
import matplotlib.pyplot as plt

# emission region and targets
# - SSC blob
spectrum_norm = 1e48 * u.Unit("erg")
spectrum_dict = {
    "type": "PowerLaw",
    "parameters": {
        "p": 2.8,
        "gamma_min": 1e2,
        "gamma_max": 1e7
    }
}
R_b = 1e16 * u.cm
B = 1 * u.G
z = Distance(1e27, unit=u.cm).z
delta_D = 10
Gamma = 10
ssc_blob = Blob(R_b, z, delta_D, Gamma, B, spectrum_norm, spectrum_dict)

# - EC blob
spectrum_norm = 6e42 * u.erg
parameters = {
    "p1": 2.0,
    "p2": 3.5,
    "gamma_b": 1e4,
    "gamma_min": 20,
    "gamma_max": 5e7,
}
spectrum_dict = {"type": "BrokenPowerLaw", "parameters": parameters}
R_b = 1e16 * u.cm
B = 0.56 * u.G
z = 1
delta_D = 40
Gamma = 40
ec_blob = Blob(R_b, z, delta_D, Gamma, B, spectrum_norm, spectrum_dict)

# - BLR definition
xi_line = 0.024
L_disk = 1e46 * u.Unit("erg s-1")
R_line = 1e17 * u.cm
blr = SphericalShellBLR(L_disk, xi_line, "Lyalpha", R_line)
# - dust torus definition
T_dt = 1e3 * u.K
csi_dt = 0.1
dt = RingDustTorus(L_disk, csi_dt, T_dt)

# distance
r = 1e16 * u.cm

# agnpy radiative processes
synch = Synchrotron(ssc_blob)
ssc = SynchrotronSelfCompton(ssc_blob)

synch_ec = Synchrotron(ec_blob)
ssc_ec = SynchrotronSelfCompton(ec_blob)
ec_blr = ExternalCompton(ec_blob, blr, r)
ec_dt = ExternalCompton(ec_blob, dt, r)

# radiative processes wrapped by Gammapy
ssc_model = SpectralModelSSC(ssc_blob)
ec_model = SpectralModelEC(ec_blob, blr, dt, r, ec_blr=False)

# SED computation
nu = np.logspace(9, 29, 100) * u.Hz
E = nu.to("eV", equivalencies=u.spectral())

# SSC model
sed_ssc_agnpy = synch.sed_flux(nu) + ssc.sed_flux(nu)
sed_ssc_gammapy = (E**2 * ssc_model(E)).to("erg cm-2 s-1")

# EC model
sed_ec_agnpy = (
    synch_ec.sed_flux(nu) + ssc_ec.sed_flux(nu) + ec_dt.sed_flux(nu)
)
sed_ec_gammapy = (E**2 * ec_model(E)).to("erg cm-2 s-1")

# SSC comparison
plt.loglog(nu, sed_ssc_agnpy, ls="-", label="SSC, agnpy")
plt.loglog(nu, sed_ssc_gammapy, ls="--", label="SSC, Gammapy wrapper")
plt.xlabel(sed_x_label)
plt.ylabel(sed_y_label)
plt.ylim([1e-13, 1e-9])
plt.legend()
plt.show()

# EC comparison
plt.loglog(nu, sed_ec_agnpy, ls="-", label="EC, agnpy")
plt.loglog(nu, sed_ec_gammapy, ls="--", label="EC, Gammapy wrapper")
plt.xlabel(sed_x_label)
plt.ylabel(sed_y_label)
plt.ylim([1e-18, 1e-14])
plt.legend()
plt.show()

Figure_1 Figure_2

AtreyeeS commented 2 years ago

This should be quite useful @cosimoNigro . One general comment here... Currently, gammapy FluxPointEstimator needs either an amplitude or norm parameter to be defined on the model. (I imagine that it would be interesting to fit reduced datasets directly with physical models and extract flux points). This is not the case here.

It is not a blocker, as the user can still multiply SpectralModelSSC by a normed model to the specific case of extracting flux points, but it is probably important to be correctly conveyed

AtreyeeS commented 2 years ago

Sorry my previous comment was not totally correct. As per the latest developments, users need to specify is_norm for scalable parameters (see discussions in https://github.com/gammapy/gammapy/issues/3690#issuecomment-1116218619)

So one option (to simplify things for the users) here maybe be to add a norm parameter which must be kept frozen, but can be used internally during flux point estimation...

codecov[bot] commented 2 years ago

Codecov Report

Merging #119 (d9e3a79) into master (86a28d4) will increase coverage by 0.43%. The diff coverage is 98.70%.

@@            Coverage Diff             @@
##           master     #119      +/-   ##
==========================================
+ Coverage   96.99%   97.43%   +0.43%     
==========================================
  Files          30       38       +8     
  Lines        2233     2997     +764     
==========================================
+ Hits         2166     2920     +754     
- Misses         67       77      +10

Flag	Coverage Δ
unittests	`97.43% <98.70%> (+0.43%)`	:arrow_up:

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files	Coverage Δ
setup.py	`0.00% <ø> (ø)`
agnpy/fit/models.py	`87.50% <87.50%> (ø)`
agnpy/fit/gammapy_wrapper.py	`95.39% <95.39%> (ø)`
agnpy/tests/test_wrappers.py	`99.20% <99.20%> (ø)`
agnpy/__init__.py	`100.00% <100.00%> (ø)`
agnpy/emission_regions/blob.py	`94.28% <100.00%> (+0.05%)`	:arrow_up:
agnpy/fit/__init__.py	`100.00% <100.00%> (ø)`
agnpy/fit/core.py	`100.00% <100.00%> (ø)`
agnpy/fit/data.py	`100.00% <100.00%> (ø)`
agnpy/fit/sherpa_wrapper.py	`100.00% <100.00%> (ø)`
... and 4 more

:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more

cosimoNigro commented 2 years ago

Thanks for your comments @AtreyeeS, now k_e, the normalisation of the electron distribution has is_norm=True when converted to a gammapy.modeling.Parameter. The flux or SED should scale linearly with the normalisation of the radiating electrons, as it is just a multiplicative factor in all the SED calculations.

cosimoNigro commented 2 years ago

Will add some tests later.

AtreyeeS commented 2 years ago

The flux or SED should scale linearly with the normalisation of the radiating electrons, as it is just a multiplicative factor in all the SED calculations.

I think that while the synchrotron and EC flux goes as k_e, the SSC goes as k_e^2 (because its the same population of electrons, so one factor of k_e from the electrons, and one from the sync photons). You can see the same in the fig below:

Unknown

cosimoNigro commented 2 years ago

Yes, you are totally right! Thank you! I will then introduce a general amplitude multiplying the SED and that would be the norm of the model. Would that be ok?

AtreyeeS commented 2 years ago

@cosimoNigro : In gammapy, we tend to use amplitude as a dimensional parameter, and norm as a scaling, so maybe norm is better? And document it well that the norm must be frozen during the fit. I don't know if it is possible to hide it from the users in any way, but if you are separating parameters as blob_parameters and jet_parameters as @QRemy suggested, then at least the norm should not show up in either of those

cosimoNigro commented 2 years ago

Ok, so I added the norm, and I put it a the bottom of the parameter list. It does appear if you call all the parameters together (don't know how to hide it).

from agnpy.emission_regions import Blob
from agnpy.wrappers import SynchrotronSelfComptonSpectralModel

blob = Blob()
ssc_model = SynchrotronSelfComptonSpectralModel(blob)

print(ssc_model.parameters.to_table())

  type      name     value    unit   error      min       max    frozen is_norm link
-------- --------- ---------- ---- --------- --------- --------- ------ ------- ----
spectral       k_e 9.2748e+06 cm-3 0.000e+00 1.000e+00 1.000e+09  False   False     
spectral         p 2.8000e+00      0.000e+00 1.000e+00 5.000e+00  False   False     
spectral gamma_min 1.0000e+02      0.000e+00 1.000e+00 1.000e+03   True   False     
spectral gamma_max 1.0000e+07      0.000e+00 1.000e+04 1.000e+08   True   False     
spectral         z 6.9000e-02      0.000e+00 1.000e-03 1.000e+01   True   False     
spectral       d_L 9.9203e+26   cm 0.000e+00 1.368e+25 3.271e+29   True   False     
spectral   delta_D 1.0000e+01      0.000e+00 1.000e+00 1.000e+02  False   False     
spectral         B 1.0000e+00    G 0.000e+00 1.000e-04 1.000e+03  False   False     
spectral       R_b 1.0000e+16   cm 0.000e+00 1.000e+12 1.000e+18  False   False     
spectral      norm 1.0000e+00      0.000e+00 1.000e-01 1.000e+01   True    True

As @AtreyeeS said, it does not appear in the "grouped" parameters

print(ssc_model.spectral_parameters.to_table())

  type      name     value    unit   error      min       max    frozen is_norm link
-------- --------- ---------- ---- --------- --------- --------- ------ ------- ----
spectral       k_e 9.2748e+06 cm-3 0.000e+00 1.000e+00 1.000e+09  False   False     
spectral         p 2.8000e+00      0.000e+00 1.000e+00 5.000e+00  False   False     
spectral gamma_min 1.0000e+02      0.000e+00 1.000e+00 1.000e+03   True   False     
spectral gamma_max 1.0000e+07      0.000e+00 1.000e+04 1.000e+08   True   False

print(ssc_model.spectral_parameters.to_table())

  type     name    value    unit   error      min       max    frozen is_norm link
-------- ------- ---------- ---- --------- --------- --------- ------ ------- ----
spectral       z 6.9000e-02      0.000e+00 1.000e-03 1.000e+01   True   False     
spectral     d_L 9.9203e+26   cm 0.000e+00 1.368e+25 3.271e+29   True   False     
spectral delta_D 1.0000e+01      0.000e+00 1.000e+00 1.000e+02  False   False     
spectral       B 1.0000e+00    G 0.000e+00 1.000e-04 1.000e+03  False   False     
spectral     R_b 1.0000e+16   cm 0.000e+00 1.000e+12 1.000e+18  False   False

I integrated the computation of the SED with the Gammapy wrappers in the tests, I find quite some discrepancy at the highest energies. I expect exactly identical results as I am using the same functions. Will try to dig deeper later.

gammapy_ssc_wrapper

gammapy_ec_wrapper

cosimoNigro commented 2 years ago

Tests fail because the dev version of Gammapy is needed. Will fix it.

cosimoNigro commented 2 years ago

Ok, now it's complaining about the astropy version... :face_exhaling:

cosimoNigro commented 2 years ago

Ok, very well, the tests were failing because Gammapy - that I am now installing - uses astropy 5.0 that has dropped support for python 3.7. Just removed python 3.7 from the tests.

cosimoNigro commented 2 years ago

Will take time to make a final couple of improvements in the EC wrapper and then merge. In particular, the emission of the thermal components (disk and DT) is missing, also I would like to improve how the target for scattering is defined.

cosimoNigro commented 2 years ago

I completed the integration of the wrapper, but when I try to run the fit on the Mrk421 SED it takes 20-30 minutes now :scream:, against the 2-3 minutes it was taking before (Gammapy 0.18.2). Any idea why now it is taking so long compared to the notebooks with Gamampy 0.18.2? Is it because I am now using several FluxPointsDatasets?

Here a script to run the fit, it is also available in experiments/check_gammapy_fit.py

# import numpy, astropy and matplotlib for basic functionalities
import time
import pkg_resources
import astropy.units as u
from astropy.table import Table
import matplotlib.pyplot as plt

# import agnpy classes
from agnpy.emission_regions import Blob
from agnpy.wrappers import SynchrotronSelfComptonSpectralModel

# import gammapy classes
from gammapy.modeling.models import (SPECTRAL_MODEL_REGISTRY, SkyModel)
from gammapy.estimators import FluxPoints
from gammapy.datasets import Datasets, FluxPointsDataset
from gammapy.modeling import Fit

# IMPORTANT: add the new custom model to the registry of spectral models recognised by gammapy
SPECTRAL_MODEL_REGISTRY.append(SynchrotronSelfComptonSpectralModel)

# total energy content of the electron distribution
spectrum_norm = 5e46 * u.Unit("erg") 
# dictionary describing the electron distribution
spectrum_dict = {
    "type": "BrokenPowerLaw", 
    "parameters": {
        "p1": 2.02,
        "p2": 3.43,
        "gamma_b": 9e4,
        "gamma_min": 500,
        "gamma_max": 1e6
    }
}
R_b = 1.5e16 * u.cm
B = 0.1 * u.G
z = 0.0308
delta_D = 20
Gamma = 20
# emission region
blob = Blob(R_b, z, delta_D, Gamma, B, spectrum_norm, spectrum_dict)

# model
ssc_model = SynchrotronSelfComptonSpectralModel(blob)
ssc_model.parameters["gamma_b"].min = 1e3
ssc_model.parameters["gamma_b"].max = 5e5

print("loading flux points...")
# load the MWL flux points
datasets = Datasets()
flux_points = {}

sed_path = pkg_resources.resource_filename("agnpy", "data/mwl_seds/Mrk421_2011.ecsv")
table = Table.read(sed_path)
table = table.group_by("instrument")

# do not use frequency point below 1e11 Hz, affected by non-blazar emission
E_min_fit = (1e11 * u.Hz).to("eV", equivalencies=u.spectral())

for group in table.groups:
    name = group["instrument"][0]

    data = FluxPoints.from_table(group, sed_type="e2dnde", format="gadf-sed")
    dataset = FluxPointsDataset(data=data, name=name)

    flux_points.update({name: data})
    dataset.mask_fit = dataset.data.energy_ref > E_min_fit
    datasets.append(dataset)

# load the SSC model in the datasets
model = SkyModel(spectral_model = ssc_model, name="Mrk421")
datasets.models = [model]

# plot the starting model and the flux points
fig, ax = plt.subplots(figsize=(10, 6))

for key in flux_points.keys():
    flux_points[key].plot(ax=ax, label=key)

model.spectral_model.plot(energy_bounds=[1e-6, 1e14] * u.eV, energy_power=2, label="model")
plt.legend(ncol=4)
plt.xlim([1e-6, 1e14])
plt.show()
fig.savefig("initial_model.png")

# define the fitter and run it
fitter = Fit()
start = time.time()
print("fitting...")
results = fitter.run(datasets)
end = time.time()
delta_t = end - start
print(f"execution time {delta_t:.2f} s")

print(results)
print(model.spectral_model.parameters.to_table())

# plot the final model and the flux points
fig, ax = plt.subplots(figsize=(10, 6))

for key in flux_points.keys():
    flux_points[key].plot(ax=ax, label=key)

model.spectral_model.plot(energy_bounds=[1e-6, 1e14] * u.eV, energy_power=2, label="model")
plt.legend(ncol=4)
plt.xlim([1e-6, 1e14])
plt.show()
fig.savefig("final_model.png")

The other difference with the tutorials notebooks with Gammapy 0.18.2 is that I was increasing the errors with the systematics, but I don't think it's the reason it is taking so much now. I remember having reasonable times also without adding systematics. Additionally I am using a starting model very close to the SED points.

Any ideas @QRemy, @AtreyeeS, @adonath?

adonath commented 2 years ago

Is it because I am now using several FluxPointsDatasets?

Theoretically yes. At some point the Python loop in Datasets.stat_sum() could become the bottleneck, but I doubt this the case here. The integrations performed within the loop should still be the dominant part of the computation.

Can you compare the number of fit iterations for both cases? Is it the same? My strong guess here is no. It seems the convergence changed here...

adonath commented 2 years ago

The total fit on my M1 machine takes 12 min with 2354 iterations. The number of iterations seems rather high to me...

QRemy commented 2 years ago

The 1-d joint fit of FluxPointsDatasets is not covered by gammapy-benchmark so I don't have the history of evaluation time to compare. Could you try to run a profiler to see what takes most of the time, for example :

pip install pyinstrument
pyinstrument -r html -o check_gammapy_fit.html check_gammapy_fit.py

cosimoNigro commented 2 years ago

Could you try to run a profiler to see what takes most of the time, for example

Done, the most time-consuming tasks seem to be those related with optimisation and flux points. Screenshot from 2022-06-20 16-21-51

full .html here.

AtreyeeS commented 2 years ago

Is it because I am now using several FluxPointsDatasets?

Maybe it can contribute a part of the reason. I tried a very simple test - fit the hawk crab flux points supplied within gammapy data

as one FluxPointsDataset
broken into 10 FluxPointsDataset

The fit results do not change, but the time for fitting increased by a factor of 7, you can see more here: https://github.com/AtreyeeS/test_notebooks/blob/master/test_joint_fit.ipynb

QRemy commented 2 years ago

from agnpy.utils.math import log
%timeit log(np.ones(10))
%timeit [log(np.ones(1)) for k in range(10)]

Arrays are just more efficient... Here, the increase in time is proportional to the number of calls to the log function, so indeed more datasets with less flux points per dataset can explain the difference. The FluxPointsDatasets should probably concatenate the data internally to run as fast as a single FluxPointDataset.

adonath commented 2 years ago

@cosimoNigro, @QRemy and @AtreyeeS I agree it seems the increase in computation time here seems to come from splitting the datasets. However all things considered I think it's still the right way to go:

This case is special in the sense that it only includes flux point datasets and a rather high number of them. I guess once one adds a single MapDataset for Fermi-LAT or a IACT it will dominate the computation time anyway...
Looking at the example, the problem is not the computation time, but rather the model is dominated by systematics. The fit runs >2000 iterations but still does not fit all data points, especially at the low energy end. So the model is well constrained but does not fit the data well.
So it's rather important to add weights and possibly energy scale systematics. I think the latter can only be achieved in a clean way by splitting into datasets and assigning an energy scale parameter to each of them.

The FluxPointsDatasets should probably concatenate the data internally to run as fast as a single FluxPointDataset.

I'm not sure it is worth it. From the Gammapy side we could rather work on:

Support for non-contiguous energy axes, such that flux points can be merged again
Support for parallelization of datasets to improve performance.

But those are bigger development projects...

AtreyeeS commented 2 years ago

Regarding developments on the gammapy side, I tend to agree with @adonath . The best support to provide would be support for non-contiguous energy axis, but that might take some time, and needs separate discussions.

But if something can be done within agnpy to speed up the fitting, it would be good. I dont think its very practical for users to have to spend 30 mins (or even more) to perform a fit. In such a scenario, the utility is severely hampered - and I guess it would be much better to use sherpa or other backends in such cases. Or, if it is possible to have lesser number of datasets somehow. At this point, this splitting by instrument provides a clean methodology, but is maybe not very interesting for modelling purposes

the difference at lower energies probably comes from physical considerations (emission from the unresolved radio core, black body radiation from the host galaxy at optical energies, etc). I am not sure burying these in systematics is the best idea - it might be interesting for users to probe the significance of the Black body radiation for ex
most analysis for agnpy will involve a large number of flux points. I would argue that the fitting with a map dataset would be a more special (albeit more exciting) use case for the present. In that case, the recommendation should be to do as per Axel's suggestions

review-notebook-app[bot] commented 2 years ago

Check out this pull request on

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

adonath commented 2 years ago

@cosimoNigro Were you able to resolve the convergence issues by changing to the log scaling again?

For Gammapy I opened a feature request as a a reminder https://github.com/gammapy/gammapy/issues/4027

cosimoNigro commented 2 years ago

Yes @adonath, thanks for asking! I will send to the Gammapy devs the link to the docs with the updated notebooks, once I finish everything. The speed of the fit is now 3 minutes, compared to the 2 minutes I had with Gammapy 0.18.2, so everything looks good.

cosimoNigro commented 2 years ago

Ok, I am done with this. I have introduced wrappers both for Gammapy and sherpa models. I have also added some helper functions to load the SED data in a sherpa.data.Data1D object or in a list of gammapy.datasets.FluxPointsDatasets. Proper tests have been added for the new classes / functions added. I can now obtain the same results from the two wrappers for the two sources considered, Mrk421 and PKS1510-089.

I will make a new release after I merge.

cosimoNigro / agnpy

Incorporating the Gammapy wrapper in agnpy #119

Codecov Report