Fitting non X-ray data - Githubissues

fjebaker / SpectralFitting.jl

✨🛰 Fast and flexible spectral fitting in Julia.

https://fjebaker.github.io/SpectralFitting.jl

MIT License

9 stars 5 forks source link

Fitting non X-ray data #70

Open phajy opened 10 months ago

phajy commented 10 months ago

This issue is to start a discussion thread about fitting non X-ray datasets that might be injective, or have gaps in them. The specific example I have in mind is fitting radio to gamma ray data for a blazar.

My currently preferred solution is as follows.

Create a user-defined, logarithmically spaced energy grid with $n$ elements.
Create a corresponding "flux" (or integrated flux; I'm not being careful with units here) grid that has $n-1$ elements initialised to zero.
Read in a data file that has the fluxes as a function of energy and populate the appropriate flux bins. This means the flux array might be sparse with quite a few zeros.
Use an equivalent of data_mask to ignore the bins with no fluxes.

This should then work with the standard fitting routes, allowing XSPEC models to be used without further modification.

fjebaker commented 10 months ago

So is that rebin the data into the user defined energy grid?

fjebaker commented 10 months ago

Or how about a mix of both: user defined energy grid, and then wherever there is data we also add bins with $(x_i - \delta x, x_i + \delta x)$ for the data. Since there is no response folding we have a lot of freedom with how the energy grid is defined.

phajy commented 10 months ago

The idea was to define a grid and then for each data point set the flux in the appropriate bin to that value. If two point fall in the same bin, it will be set to the latest point in the file (so there is some user thought required in setting up a suitable grid with sufficient resolution).

Here's a snippet of code that also does some unit conversion of this particular file.

file = CSV.File(@__DIR__() * "/../data/monotonic_data.csv"; comment="#")

energy_grid = 10 .^ collect(range(-9, 8, 100))
int_f_E = zeros(length(energy_grid) - 1)

for (i, ν) in enumerate(file.nu)
    energy = 4.14E-18 * ν
    index = searchsortedlast(energy_grid, energy)
    if ((index >= 1) && (index < length(int_f_E)-1))
        νF_ν = file.nuFnu[i] * 1.0E-15
        F_ν = νF_ν / ν
        f_E = 1.51E26 * F_ν / energy
        ΔE = energy_grid[index+1] - energy_grid[index]
        int_f_E[index] = f_E * ΔE
    end
end

phajy commented 10 months ago

Or how about a mix of both: user defined energy grid, and then wherever there is data we also add bins with (xi−δx,xi+δx) for the data. Since there is no response folding we have a lot of freedom with how the energy grid is defined.

Yes, that would work too. However, I also realised that I was interested in plotting the model over the entire energy range, not just at the point where there are data value. Although, obviously for fitting we only need to consider the bins that have data.

It might also be generally convenient to be able to switch certain data points on or off, e.g., if I wanted to ignore the infrared data points when fitting. Having a data_mask would allow that. But I need to delve into the code a bit to see how data_mask gets propagated through to making the objective and domain.

fjebaker commented 10 months ago

So after the model is fit, you can plot it over any energy range. It's harder to do with instrument responses, but very straight forward with simpler datasets.

The one thing that I am wary of is that the XSPEC models are obviously integrated over the bin width, so we need to be careful not to have irregularly-spaced bins, as that could invalidate how we fit the model to the data.

The data mask is just a boolean for each domain point (channel) and is currently used to select only those channels for which there is data before calculating the fit statistic.