cta-observatory / pyirf

Python IRF builder
https://pyirf.readthedocs.io/en/stable/
MIT License

Interpolation of IRFs #124

Closed jsitarek closed 2 years ago

jsitarek commented 3 years ago

I started working on code to interpolate IRFs between different observation conditions (namely zenith and azimuth for the moment) in the context of having it for LST. Since we do not yet have a grid of Zd and Az MC simulations for LST, I used for testing the CTAN files at zd = 20 and 40 deg with N and S pointing, available here: https://www.cta-observatory.org/science/cta-performance/#1472563157332-1ef9e83d-426c This is just the start of the work. For the moment I only read in the effective areas at different energies and different offsets from the camera centre, then perform a simple linear interpolation of log Aeff w.r.t. cos zenith and azimuth (the macro is written so as to easily allow changing the interpolation parameters or adding more of them), and plot the resulting Aeff for the requested (cos Zd, Az) parameters (see attached plot: aeff_interpolation_early_example).
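A minimal sketch of this kind of interpolation (not the actual interpol_irf.py code; the grid points and Aeff values below are made up for illustration):

```python
import numpy as np
from scipy.interpolate import griddata

# hypothetical grid of MC observation conditions: (cos(zd), azimuth / deg)
grid_points = np.array([
    [np.cos(np.radians(20)),   0.0],
    [np.cos(np.radians(20)), 180.0],
    [np.cos(np.radians(40)),   0.0],
    [np.cos(np.radians(40)), 180.0],
])
# effective areas (m^2) at the grid nodes for one energy/offset bin (made-up numbers)
aeff_grid = np.array([1.0e5, 1.1e5, 8.0e4, 9.0e4])


def interpolate_aeff(cos_zd, az_deg):
    """Linear interpolation of log10(Aeff) at the requested (cos zd, az)."""
    log_aeff = griddata(grid_points, np.log10(aeff_grid),
                        [(cos_zd, az_deg)], method="linear")
    return 10 ** log_aeff[0]


print(interpolate_aeff(np.cos(np.radians(30)), 90.0))
```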

things to be done:

left out for the future:

There are also some issues. It seems to me that the detailed format of the IRFs (e.g. the content and naming inside the FITS files) is not specified. I did not find this information in the IRF requirements draft, and the prod3 IRFs that I'm using for testing and the IRFs produced with calculate_eventdisplay_irfs.py have different-looking FITS files: they contain different fields, and even when the same information is included (like effective area) it is named differently, EFFECTIVE_AREA vs EFFECTIVE AREA.

I can open a new branch and start pushing the code in. Any preference for where to put the file (I called it interpol_irf.py)? Should it go in the main pyirf dir or in the irf subdir?

kosack commented 3 years ago

If I understand right, you are describing here the tool that goes from a global set of IRFs (an IRF for each zenith, azimuth, and possibly other parameters like NSB, etc.) and creates a marginalized IRF, averaged over those dimensions and weighted by the distribution of a particular observation parameter during a run, correct?
So the input is, for example, the zenith and azimuth distribution (event or time weighted?) plus all of the IRFs, and the output is a single marginalized IRF that can be used by science tools.

For the format of the "global IRFs" (the input), the only format I know of is https://gamma-astro-data-formats.readthedocs.io/en/latest/irfs/full_enclosure/index.html, which so far is only for already interpolated IRFs (marginalized to 2D). So we could just use the same format, but split each bin into a different file or HDU with a well-defined structure or naming scheme (could even be CALDB). At the very least, the header keywords of each file should contain the definition of which bin the IRF corresponds to, such that it is possible to read all headers and build a table that shows which bin coordinate each file corresponds to. That's the simplest solution. Then you would perhaps read it by first reading the table of bin locations and then loading the corresponding IRFs into memory, e.g. for effective area something like:

zenith_deg   azimuth_deg   ...   irf_file
20.0         180.0         ...   "./zen20/az180/aeff.fits"
40.0         180.0         ...   "./zen40/az180/aeff.fits"
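A minimal sketch of that bookkeeping (the file paths, column names and the "EFFECTIVE AREA" extension name are placeholders, not a fixed format):

```python
from astropy.table import QTable, Table

# index of the global IRF grid: one row per (zenith, azimuth, ...) bin
index = Table(rows=[
    (20.0, 180.0, "./zen20/az180/aeff.fits"),
    (40.0, 180.0, "./zen40/az180/aeff.fits"),
], names=("zenith_deg", "azimuth_deg", "irf_file"))

# load the effective-area table of every grid node into memory
irfs = [
    dict(zenith=row["zenith_deg"], azimuth=row["azimuth_deg"],
         aeff=QTable.read(row["irf_file"], hdu="EFFECTIVE AREA"))
    for row in index
]
```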

In that way, the format of the IRFs is identical whether or not they are marginalized (other than some additional keywords identifying the bin location in observation phase space), so tools that can read and display an IRF will work. And the rest is just bookkeeping.

This is not just a problem for IRFs, but for all other lookup tables or machine-learning models we use that are generated for multiple bins in phase space (e.g. ImPACT reconstruction, etc). So having a nice flexible scheme for this is useful.

jsitarek commented 3 years ago

Hi @kosack

The idea is more or less what you say now, however for the moment I was thinking more of averaging the parameters over the whole run and interpolating between the IRF grid (since, at least in the near future, the changes within a single run will be much smaller than the grid separation). We can of course update it later to use averaging instead of interpolation.

Thanks for the link. For the collection area it says <<The recommended EXTNAME keyword is “EFFECTIVE AREA”.>>, so probably the name in calculate_eventdisplay_irfs.py (and whatever scripts use it later on) should be updated. If I find more differences I will report them (probably as a separate issue).

About passing the arguments: I'm now working more or less on what you suggest. I want to pass such a table via a JSON config file, and I think I even found a way to pass the arguments (and read them from the data) the same way, let's see...

kosack commented 3 years ago

The idea is more or less what you say now, however for the moment I was thinking more of averaging the parameters over the whole run and interpolating between the IRF grid (since, at least in the near future, the changes within a single run will be much smaller than the grid separation). We can of course update it later to use averaging instead of interpolation.

Yes, that is in fact what I meant, except that I suggest allowing a weighted average using the distribution of the parameter in time over a run, not just the mean. That is more flexible, since you can then generate an IRF for situations where one parameter changes rapidly in only one part of a run (e.g. azimuth near zenith). What is usually done is to make a histogram of the time spent at each zenith and azimuth in a run and use that to weight the global IRFs in the average. The main reason I mention this is that the length of a run is not defined, and could be hours.
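A minimal sketch of such a time-weighted average, assuming the global IRFs have already been loaded into a single array (shapes and names below are illustrative):

```python
import numpy as np


def average_aeff(aeff_grid, time_in_node):
    """
    aeff_grid: array of shape (n_grid_nodes, n_energy_bins) with the effective
        areas of the global IRFs.
    time_in_node: time the run spent in each grid node, e.g. from a histogram
        of the pointing (zenith, azimuth) during the run.
    """
    weights = np.asarray(time_in_node, dtype=float)
    weights /= weights.sum()
    # weighted average over the grid axis; the energy binning is unchanged
    return np.average(aeff_grid, axis=0, weights=weights)
```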

moralejo commented 3 years ago

So the input is, for example, the zenith and azimuth distribution (event or time weighted?) plus all of the IRFs, and the output is a single marginalized IRF that can be used by science tools.

I think it has to be time weighted; if one does it event weighted it would be some sort of "double-counting", since the number of events will itself depend on the effective area.

This is not just a problem for IRFs, but for all other lookup tables or machine-learning models we use that are generated for multiple bins in phase space (e.g. ImPACT reconstruction, etc). So having a nice flexible scheme for this is useful.

That is true, but for the simpler "RF-style" approach we rely on now, we can just grow the forests with a training sample that has a continuous distribution of e.g. zenith and azimuth, and pass those as input parameters to the RF. The MC test samples for the IRF computation can then have discrete values of the same quantities.

kosack commented 3 years ago

Thanks for the link. For the collection area it says <<The recommended EXTNAME keyword is “EFFECTIVE AREA”.>>, so probably the name in calculate_eventdisplay_irfs.py (and whatever scripts use it later on) should be updated. If I find more differences I will report them (probably as a separate issue).

Yes, we should check that what PyIRF does is compatible. It's easy to do, since gammapy and ctools both support this format: we can just try loading one file and if it fails, there is a problem. I believe that with gammapy, at least, the extname is configurable, though.

jsitarek commented 3 years ago

Hi @kosack and @moralejo, I agree that in general we need this averaging over time, however it is a separate problem from this interpolation. If we have short runs and an IRF grid that is coarse, we need interpolation; if it is the other way around, with longer runs and a fine IRF grid, we need averaging. You cannot do averaging with a coarse IRF grid because it will basically just use the closest IRFs (which are still far from correct), so I think we need two separate macros. For the moment I have the whole grid of IRFs constructed from 4 MC settings (zd = 20, 40 deg and Az = 0, 180 deg), so I think we are very much in the coarse-IRF case :-), so let me first work on this. Nevertheless, once we have some fine-grid IRF MCs produced, I can also look into making such an IRF-averaging script.

moralejo commented 3 years ago

The idea is more or less what you say now, however for the moment I was thinking more of averaging the parameters over the whole run and interpolating between the IRF grid (since, at least in the near future, the changes within a single run will be much smaller than the grid separation). We can of course update it later to use averaging instead of interpolation.

Probably for a start we can define that each LST1 run (as long as we wobble every 20') is a "Good Time Interval", and then what you say is correct. But it may make sense to already plan for an input consisting of effective time vs (Alt, Az), rather than averaged values, and an interpolation algorithm that makes the best use of the available MC grid.

maxnoe commented 3 years ago

Sorry, I had already typed a long answer but it was lost when I accidentally closed a tab; let me try to answer some things quickly first and then add some more general thoughts.

Yes, we should check that what PyIRF does is compatible.

We do this already. Each HDU we write out is checked to be readable by gammapy and with the schema validator I wrote for the GADF FITS format. A ctools-based test is on my todo list, but the ctools python API is not the most user friendly...

saving the IRFs to a new fits file

Saving IRFs to FITS is already implemented. We here just need to provide new functions that do the interpolation from many IRFS at certain grid points to a new IRF at an input point.

computing all the other IRFs (migration matrix, PSF, not sure if we need to do it also for background rates)

For point-like IRFs, you need only effective area and the energy migration. For full-enclosure you additionally need the PSF and for some use cases (most importantly purely simulation based studies) you need the background.

In the end we should provide interpolation solutions for all of them, but starting with the simpler use case of point-like IRFs, as needed now for LST, would be a good starting point.

finding a clever way of defining and passing to the macro the table of input FITS files together with their corresponding parameters

For implementing the interpolation scheme, start with a function that gets the grid of IRFs, probably as an astropy table, and the point for which it should be interpolated.

IO and a command line tool to do this should come second and can build on the function.
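A rough sketch of such a function (the names and the table layout are illustrative, not pyirf's actual API):

```python
import numpy as np
from scipy.interpolate import LinearNDInterpolator


def interpolate_effective_area(grid_table, target_point):
    """
    grid_table: astropy Table with one row per MC grid node and columns
        "grid_point"     -- node parameters, e.g. (cos zd, az), shape (n_params,)
        "effective_area" -- Aeff per (energy, offset) bin, shape (n_e, n_off)
    target_point: parameters of the requested observation, length n_params.
    """
    points = np.array([np.asarray(row["grid_point"]) for row in grid_table])
    values = np.array([np.asarray(row["effective_area"]) for row in grid_table])
    # linear interpolation applied independently to every (energy, offset) bin
    interp = LinearNDInterpolator(points, values)
    return interp(np.atleast_2d(target_point))[0]
```

Interpolating log(Aeff) instead, as done above for the prod3 test, would just mean taking the log of the values before building the interpolator and exponentiating the result afterwards.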

It seems to me that the detailed format of the IRFs (e.g. the content and naming inside the FITS files) is not specified. I did not find this information in the IRF requirements draft, and the prod3 IRFs that I'm using for testing and the IRFs produced with calculate_eventdisplay_irfs.py have different-looking FITS files: they contain different fields, and even when the same information is included (like effective area) it is named differently, EFFECTIVE_AREA vs EFFECTIVE AREA.

We are following the GADF specification for pyirf. The documentation is here: https://gamma-astro-data-formats.readthedocs.io/en/latest/

The prod3b files were released before the last iteration of the standard was published; they differ in some aspects.

The specification uses the header keywords HDUCLAS<N> to specify the kind of HDU you are looking at and only sometimes recommends a specific EXTNAME. Science tools, however, should not rely on EXTNAMEs but on these HDUCLAS<N> keywords.

For example, the point-like effective area must have

HDUCLAS1 = 'RESPONSE'
HDUCLAS2 = 'EFF_AREA'
HDUCLAS3 = 'POINT-LIKE' (or 'FULL-ENCLOSURE' for full-enclosure IRFs)
HDUCLAS4 = 'AEFF_2D'
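A small sketch of selecting an HDU via these keywords rather than its EXTNAME (the file name below is a placeholder):

```python
from astropy.io import fits


def find_hdu(hdulist, **hduclas):
    """Return the first HDU whose header matches all given HDUCLASn values."""
    for hdu in hdulist:
        if all(hdu.header.get(key.upper()) == value for key, value in hduclas.items()):
            return hdu
    raise KeyError(f"no HDU matching {hduclas}")


with fits.open("irfs.fits") as hdulist:
    aeff_hdu = find_hdu(hdulist, hduclas1="RESPONSE", hduclas2="EFF_AREA",
                        hduclas4="AEFF_2D")
```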

Note that CTA is currently evaluating the format and will redefine / add specifications in the future, and pyirf will then probably need to adapt to the new CTA / GADF format.

But that should not be a concern for this interpolation task, since the interpolation happens on the in-memory representation of the IRFs, which can then use the functions in pyirf.io to store them to FITS in the correct format.

Decoupling the functionality from the IO makes it then possible to even support different formats / versions of the spec.

Any preference for where to put the file (I called it interpol_irf.py)? Should it go in the main pyirf dir or in the irf subdir?

Depending on how much code it is, I would say either create a file pyirf/interpolation.py or a submodule pyirf/interpolation. As I said before, this should not be a "macro" or "script" or "exe" but, like the rest of pyirf, a set of functions to perform these tasks so that pipelines can use them.

Since the input and output are both GADF IRF files, we could think about directly providing a command line tool, but we should start with the library functions actually doing the interpolation on numpy arrays / astropy tables.

moralejo commented 3 years ago

If we have short runs and an IRF grid that is coarse, we need interpolation; if it is the other way around, with longer runs and a fine IRF grid, we need averaging. You cannot do averaging with a coarse IRF grid because it will basically just use the closest IRFs (which are still far from correct), so I think we need two separate macros.

Imagine the case in which the whole zenith distribution is between 20 and 40. You can just divide the distribution into "closer to 20" and "closer to 40", from which you get two weights to average the IRFs. It is not more complicated than the interpolation, I think, basically the same: instead of the "distance from the mean zenith to 20 and to 40" used for the interpolation, you would use the "fraction of the distribution closer to 20 and the fraction closer to 40". The advantage is that the method can be written such that with a finer grid it would give a more precise result.
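A minimal sketch of computing those two weights, assuming a per-run array of zenith samples with equal effective time per sample:

```python
import numpy as np


def nearest_node_weights(zenith_samples, grid_zeniths=(20.0, 40.0)):
    """Fraction of the run's effective time spent closest to each grid zenith."""
    zenith_samples = np.asarray(zenith_samples)
    nearest = np.argmin(
        np.abs(zenith_samples[:, np.newaxis] - np.asarray(grid_zeniths)), axis=1
    )
    return np.bincount(nearest, minlength=len(grid_zeniths)) / len(zenith_samples)


# weights to average the zd=20 and zd=40 IRFs for a run drifting from 22 to 38 deg
w20, w40 = nearest_node_weights([22.0, 25.0, 29.0, 33.0, 38.0])  # -> 0.6, 0.4
```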

maxnoe commented 3 years ago

One thing that I want to highlight is that effective area is the easiest case, as there are no normalization constraints.

However, for the PSF and the energy migration we are talking about probability densities, and these properties must still hold after the interpolation.

So a first step here would probably be to write a validation tool that can test whether a PSF / energy migration is correctly normalized, to make sure we keep these properties in the interpolation.
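A possible sketch of such a check for the energy dispersion, assuming a GADF-like array with axes (true energy, migration, FoV offset) where the dispersion is a density in the migration variable:

```python
import numpy as np


def edisp_is_normalized(edisp, migration_bins, atol=1e-3):
    """True if the dispersion integrates to ~1 over migration in every filled bin."""
    dmu = np.diff(migration_bins)                          # migration bin widths
    norm = np.sum(edisp * dmu[np.newaxis, :, np.newaxis], axis=1)
    # bins without statistics may legitimately be all zero
    return bool(np.all(np.isclose(norm, 1.0, atol=atol) | np.isclose(norm, 0.0, atol=atol)))
```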

jsitarek commented 3 years ago

Hi,

I pushed what I have made so far to my fork (https://github.com/cta-observatory/pyirf/compare/master...jsitarek:interpolate_irfs). For the moment everything is in a single macro, but I will of course extract the actual calculation into a separate class.

@moralejo it is of course possible to have such a run, but I still think most cases will be different. I'm afraid we will not escape having two different approaches, one for interpolation and one for averaging. Either way, as I wrote before, let me first do the interpolation one (keeping the averaging in mind) and then, when we actually have something to test it on, do the averaging.

@maxnoe I can adapt the code to check for those HDUCLAS tags, but it is cumbersome to loop over all the keys to check which one matches. If we need to agree on the names anyway, instead of agreeing on a bunch of HDUCLAS values it would be easier to just agree on one EXTNAME. And yes, I'm aware that Aeff is the simplest one (that's why I started from it ;-)) and that for the migration matrix and PSF there is a normalization issue. Again, I would like to have some reasonable data to test it on to check what really works best, but my first idea was to try simple interpolation and then renormalize what we get.

maxnoe commented 3 years ago

@jsitarek I have no problem with using specific EXTNAMEs for files we create. But the standard says that tools should not rely on EXTNAMEs, so we should not enforce them when reading.

Since we do not yet have reading implemented, we can make a simple function that loads the IRFs from the HDUs into nice data structures.
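One possible shape for such a loading function, using the GADF AEFF_2D column names and returning a plain dict of quantities for now (selecting by EXTNAME here only for brevity; one could combine it with an HDUCLAS lookup as sketched above):

```python
from astropy.table import QTable


def read_aeff2d(path, hdu="EFFECTIVE AREA"):
    """Read a GADF AEFF_2D HDU into plain arrays/quantities."""
    table = QTable.read(path, hdu=hdu)
    row = table[0]  # GADF stores the whole IRF in a single table row
    return {
        "energy_lo": row["ENERG_LO"],
        "energy_hi": row["ENERG_HI"],
        "theta_lo": row["THETA_LO"],
        "theta_hi": row["THETA_HI"],
        "effective_area": row["EFFAREA"],
    }
```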

This is also one of the open issues with respect to gammapy. At the moment we use quantities / arrays; gammapy uses classes with more functionality, but as of the last few gammapy versions they had unresolved issues with respect to the GADF format.

jsitarek commented 3 years ago

@moralejo @kosack - one more thing that came to my mind concerning IRF averaging instead of interpolation and long runs. The procedure of averaging the IRFs over the observation time is perfectly fine if we talk about the spectrum, or the average flux during the run. But if you need averaging it means that the IRFs change considerably during the run, and this means that if the user makes a light curve with binning finer than a single run they can produce wrong LCs (or bin-wise SEDs). I guess this must have been discussed in the past, so what is the idea there? If the end user just starts from DL3 files and all the analysis is run-wise, this means that the runs provided cannot be too long. Another possibility is to provide a tool that cuts runs into smaller pieces and recomputes the IRFs (but those would have to be recomputed from the IRF library; the IRF information in the run would not be sufficient), or, instead of a single IRF, to provide an array of IRFs that describes the IRF evolution in time (the last one I think is a bad idea because it makes things complicated, and it boils down to having smaller runs with individual IRFs, just joined together).

About the IRF reading: what I've understood from Abelardo and Ruben is that this interpolation of IRFs is rather urgently needed, so waiting for a new implementation of IRF reading into a dedicated class could, I think, take too long. So I would propose to keep the current scheme of keeping them as arrays in the interpolation code, and then once we have a new class I can update this part of the code (the IRF class will either way have to be translated back into arrays). Is that fine with you @maxnoe?

rlopezcoto commented 3 years ago

Hi Julian, thanks for this very much needed implementation. A couple of comments on your latest one:

@moralejo @kosack - one more thing that came to my mind concerning IRF averaging instead of interpolation and long runs. The procedure of averaging the IRFs over the observation time is perfectly fine if we talk about the spectrum, or the average flux during the run. But if you need averaging it means that the IRFs change considerably during the run, and this means that if the user makes a light curve with binning finer than a single run they can produce wrong LCs (or bin-wise SEDs). I guess this must have been discussed in the past, so what is the idea there? If the end user just starts from DL3 files and all the analysis is run-wise, this means that the runs provided cannot be too long. Another possibility is to provide a tool that cuts runs into smaller pieces and recomputes the IRFs (but those would have to be recomputed from the IRF library; the IRF information in the run would not be sufficient), or, instead of a single IRF, to provide an array of IRFs that describes the IRF evolution in time (the last one I think is a bad idea because it makes things complicated, and it boils down to having smaller runs with individual IRFs, just joined together).

I think that here it would be a good idea to define what a "significant change" during the run is, to evaluate the validity of the interpolated and averaged IRFs. We could either select this based on the initial and final parameters given as input (Zd, Az, ...) or, better, calculate whether the difference between the interpolated IRFs for the initial and final parameters is significant enough to require producing different IRFs for the same run.

About the IRF reading: what I've understood from Abelardo and Ruben is that this interpolation of IRFs is rather urgently needed, so waiting for a new implementation of IRF reading into a dedicated class could, I think, take too long. So I would propose to keep the current scheme of keeping them as arrays in the interpolation code, and then once we have a new class I can update this part of the code (the IRF class will either way have to be translated back into arrays). Is that fine with you @maxnoe?

You are right @jsitarek, this is something really needed at the moment to properly evaluate the performance of LST, and it would be great if it can be implemented as soon as possible, as you have already rushed to do. If in the meantime @maxnoe can shed some light on a quick way of loading them, or even better work on a new class to read and load the IRFs, that would be great.

moralejo commented 3 years ago

The procedure of averaging the IRFs over the observation time is perfectly fine if we talk about the spectrum, or the average flux during the run. But if you need averaging it means that the IRFs change considerably during the run, and this means that if the user makes a light curve with binning finer than a single run they can produce wrong LCs (or bin-wise SEDs).

I don't know how that will be done. @kosack: is it a requirement that the IRFs during each good time interval are stable (with whatever tolerance) also from the point of view of e.g. zenith- and azimuth-dependent variations? Or, alternatively, is the plan to add zenith and azimuth axes to the IRFs?

moralejo commented 3 years ago
  • [ ] propagating uncertainties (??)

Related to this: probably not urgently needed for now. We are going to reuse the MC in different observations of the same source (e.g. consecutive nights, similar range of zenith and azimuth), so even if you calculate the uncertainties in the IRFs they cannot be used in a straightforward way in later analysis due to the strong correlations. Actually, I do not think the post-DL3 analysis of real data currently uses those uncertainties at all. With large enough MC productions those statistical uncertainties should anyway become negligible w.r.t. the systematic ones related to MC-data mismatch (e.g. the energy scale), and also relative to the statistical uncertainties of the real data.

kosack commented 3 years ago

I don't know how that will be done. @kosack: is it a requirement that the IRFs during each good time interval are stable (with whatever tolerance) also from the point of view of e.g. zenith- and azimuth-dependent variations? Or, alternatively, is the plan to add zenith and azimuth axes to the IRFs?

I think we are mixing two IRFs here... 1) the global IRFs that get interpolated/averaged to produce 2) the IRFs attached to each GTI, which to start with will be marginalized into 3D (FOV coordinates + energy). For the GTI IRFs, yes, the requirement is that they are stable (within systematic requirements) during the GTI. If we find that we cannot easily make things stable within a GTI, then we just have to add another dimension or break the GTIs into smaller chunks. Remember that "runs" are an outdated concept here - the length of the run should not matter; that is why we have GTIs.

Anyhow, I wanted to point out that it is very good to start producing functions to do this interpolation/averaging, since it is needed not just for LST, but also urgently for a more realistic data challenge for the PHYS group, one where sources actually move in alt/az.

kosack commented 3 years ago

I think that here it would be a good idea to define what a "significant change" during the run is, to evaluate the validity of the interpolated and averaged IRFs. We could either select this based on the initial and final parameters given as input (Zd, Az, ...) or, better, calculate whether the difference between the interpolated IRFs for the initial and final parameters is significant enough to require producing different IRFs for the same run.

This issue is not really the place to discuss that level of detail: this is already defined by the calibration and systematic requirements, there have been a lot of discussions on this, and there are even documents from the ACE group on how to determine when to switch IRFs, so maybe let's not get too worried right now - with simple simulations and basic LST1 data, it's not a critical issue. For now we can just say GTI == run (unless there are clouds or something) and use a single IRF per run. Later we can get fancier.

I suggest just starting simple as you do now: make a branch and open a draft PR so we can discuss the implementation. To start, I think simple averaging will be fine. Later we can extend it to interpolation, which will likely be needed, especially as we have very coarse bins in zenith, azimuth, and offset with the current Prod5 sims.

moralejo commented 3 years ago

I don't know how that will be done. @kosack: is it a requirement that the IRFs during each good time interval are stable (with whatever tolerance) also from the point of view of e.g. zenith- and azimuth-dependent variations? Or, alternatively, is the plan to add zenith and azimuth axes to the IRFs?

I think we are mixing two IRFs here... 1) the global IRFs that get interpolated/averaged to produce 2) the IRFs attached to each GTI, which to start with will be marginalized into 3D (FOV coordinates + energy). For the GTI IRFs, yes, the requirement is that they are stable (within systematic requirements) during the GTI.

I think there is no mixing. I am assuming we start from a grid of IRFs in e.g. zenith and azimuth (this is what you call 1, right?). Then the IRF for a given GTI, (2), is obtained through interpolation. If the GTI is short enough that the Zd/Az variations within it are negligible (IRF-wise), then we are in the situation where an interpolation to the average Az, Zd of the GTI is good enough.

If we find that we cannot easily make things stable within a GTI, then we just have to add another dimension or break the GTIs into smaller chunks. Remember that "runs" are an outdated concept here - the length of the run should not matter; that is why we have GTIs.

Ok. I think the extra dimension would be a good option (for a given pointing in RA, Dec, the hour angle of that point would be better than having Az, Zd axes). In that case (or also with the alternative short-GTI approach) we would never need the "IRF averaging with the (Az, Zd) t_eff distribution", because each slice of the new axis would be obtained by interpolation.

kosack commented 3 years ago

I think there is no mixing. I am assuming we start from a grid of IRFs in e.g. zenith and azimuth (this is what you call 1, right?). Then the IRF for a given GTI, (2), is obtained through interpolation. If the GTI is short enough that the Zd/Az variations within it are negligible (IRF-wise), then we are in the situation where an interpolation to the average Az, Zd of the GTI is good enough.

Exactly

For the extra dimensions / shorter GTIs, that's something for a future study in the ASWG. I should note that a single GTI per 28-minute run with a 3D averaged IRF has so far been fine for e.g. the HESS GPS (analyzed with Gammapy or ctools), so I would guess it will be sufficient for CTA for the first few years at least, or at least until the systematic requirements of CTA are finalized. The coordinates of the spatial part of the IRF (and the corresponding DET coordinates of the event list) are another long-standing question (with lots of argument): alt/az vs ra(ha)/dec. These should be added to the ASWG task list.

jsitarek commented 3 years ago

Hi @kosack ,

The variations of the IRFs within a run depend on how fast Zd/Az change during it. In MAGIC we do some observations at very large zenith angles of sources that culminate at low zenith, and for those the IRFs would change considerably every minute or so. Nevertheless, I agree that, at least for the time being and for the bulk of the sources, a single GTI of 20-30 min would work reasonably well if you do not get too close to the threshold.

Julian

jsitarek commented 3 years ago

I made a PR #125 with the IRF interpolation of Aeff and migration matrix

maxnoe commented 2 years ago

I am closing this in favor of #161, since this was about the initial implementation