javierrico / gLike

General-purpose ROOT-based joint likelihood maximization code
GNU General Public License v3.0
7 stars 11 forks source link

Defining input formats for ROOT and FITS data #25

Closed cosimoNigro closed 4 years ago

cosimoNigro commented 4 years ago

When PR #24 will be approved, gLike will be able to work with ROOT and FITS input.

A problem remains though and it is the fact that the test ROOT files already contain IactEventListIrf objects, rather than plain ROOT objects. This makes impossible to read the input test files from a constructor within the IactEventListIrf class itself (or from some Load method).

We should work to establish ROOT and FITS input formats, and in the ROOT case they should contain plain ROOT objects from which the attributes of the IactEventListIrf class can be initialised (e.g. TNtuples for fOnSample and fOffSample, TH1F for fHAeff). I will take care of the FITS format but I'd need @javierrico and @dkerszberg to help me decide what to do with the ROOT format. @dkerszberg, how many gLikeInputs files are you currently using for your scientific projects? Would it be a big effort to convert their content to plain ROOT objects, in case we decide to adopt such input format for gLike?

javierrico commented 4 years ago

Now that I think about it, your premise doesn't seem to be correct, does it? It is quite typical to have Copy constructors for many classes. If you had it for the IactEventListIrf, problem solved, right? maybe I am missing something here...

cosimoNigro commented 4 years ago

I was not aware of copy constructors and will take a look. The reason I believe it is better to have plain ROOT objects in input is that there might be users interested in using gLike with their own ROOT data. At the moment - maybe I am wrong - the only way to produce the ROOT inputs of gLike is to use the glikeInputs.cc in the MARS software. Correct?

javierrico commented 4 years ago

That is correct, but there is nothing preventing this potential user to use the IactEventListIRF (which is in gLike distribution) to create those output files. One potential problem that I see (maybe I am wrong) is that in principle a user could create a plain root file with, e.g., several effective area histograms, what would happen in that case?

dkerszberg commented 4 years ago

Maybe we can provide in scripts a simple macro creating a very simple dummy - fake - dataset using gLike structure (flat histo for effective area, N events with random energy etc) that could serve as template for users to create their own? (@cosimoNigro I have 15-20 in total, technically they can be modified, let's see if it is necessary or not)

cosimoNigro commented 4 years ago

Hi @javierrico and @dkerszberg,

I have no strong opinion against using ROOT files already containing the IactEventListIrf instance. As @dkerszberg suggested we can include a simple script showing how to generate mock ROOT glikeInputs.

A small update on the FITS side: the class writing the FITS glikeInputs starting from gammapy is almost done, I am waiting for the developer approval, see this PR. At the moment the developers prefer to include the code in the extras folder rather than in the main code, they feel it has to be discussed more how to include unbinned formats in gammapy. Anyhow, once the PR is approved, we can point our users to gammapy-extra.

Last, the gammapy developers and @javierrico suggested to store the whole migration matrix rather than the resolution and bias. And here the pain begins. The current interface of ROOTto FITS file (TFITSHDU class) does not support the multidimensional table that we currently use to store the migration matrix in FITS format - see my detailed explanation in the ROOT forum. A small patch to the TFITSHDU class seems to be needed. I think it is better solving the problem in ROOT rather than implementing a local patch in gLike(also we would need to introduce the CFITSIO dependency in gLike for that and I presume we want to rely only ROOT as much as possible).

I am afraid that to complete my PR #24 I would need to fix TFITSHDU in ROOT first.

javierrico commented 4 years ago

That's a real bummer, because even with the ROOT patch you are forcing a very new ROOT version, and many people are several versions behind. For instance I think MAGIC analysis tools are certified for root 5 and some do not even work in root 6... I know from Abelardo that they wanted to fix that, but I don't know how advanced they are with that... (maybe you can ask him). Is there no other alternative format for the migration matrix that can be read with current root version(s)?

cosimoNigro commented 4 years ago

That's a real bummer, because even with the ROOT patch you are forcing a very new ROOT version, and many people are several versions behind.

If we learn how to ship the code properly along with the ROOT version, e.g. with anaconda, then it would not a problem.

For instance I think MAGIC analysis tools are certified for root 5 and some do not even work in root 6...

MAGIC analysis tools work only with ROOT v5.34.21 and v5.34.36 and that's it. This is making increasingly difficult (or impossible) to install them on new machines with recent versions of g++

Is there no other alternative format for the migration matrix that can be read with current root version(s)?

No, because the migration matrix for the unbinned spectral data is just the same migration matrix for the binned spectral data (but with a very fine binning). Therefore if we want to make Iact1dUnbinnedLkl and Iact1dBinnedLkl read the data produced by gammapy (compatible with the gamma-astro-data-format specifications) we have to read this format of migration matrix!

Either we force a new ROOT version to be used (if I manage to make the patch in ROOT) or we make a local fix with the CFITSIO in gLike. Or we don't read FITS files at all.

javierrico commented 4 years ago

Ok, adding dependence on cfitsio just for this is excluded, and so is dropping the possibility of reading FITS files, so we need to go for your solution.

The ROOT version is in an environment variable, can you code the interface with FITS so that it depends on the version? e.g. if you are using root version < 6.XXX [your patch] then when trying to read a FITS file should issue an error or warning, and if version>=6.XXX the it would work fine. Does that sound reasonable?

cosimoNigro commented 4 years ago

Hi, sorry for being silent. Yes your solution is reasonable and I'll implement it. The patch fixing this issue in ROOT has been submitted https://github.com/root-project/root/pull/5099 Once it's accepted I will finish with PR #24.

cosimoNigro commented 4 years ago

Dear @javierrico and @dkerszberg,

the ROOT patch has been approved and PR #24 finished: now gLike's IactEventListIrf can read the FITS spectral data produced by gammapy. This is what gammapy produces reducing DL3 data to unbinned spectral data (ON and OFF event lists, effective area and migration matrix): gammapy_fits_file this is instead what gLike can now store in a IactEventListIrf object (I added a function to make a plot similar to gammapy's one) gLike_fits_data

The development in PR #24 is in this branch of my fork.

Now the "bummer" of the ROOT version remains. The problem is that to read the format of the migration matrix I have added a new function (GetTabVarLengthVectorCell) in ROOT's TFITSHDU class. Therefore if we try to make gLike with an old ROOT version, we will produce a crash while compiling

Compiling src/IactEventListIrf.cc...
g++ -fPIC -I./include `root-config --cflags` -c src/IactEventListIrf.cc -o out/IactEventListIrf.o
src/IactEventListIrf.cc:356:38: error: no member named 'GetTabVarLengthVectorCell' in 'TFITSHDU'
    TArrayD *fChanArray = hduMatrix->GetTabVarLengthVectorCell(rownum, "F_CHAN");
                          ~~~~~~~~~  ^

This creates the situation that @javierrico did not want: users and developers are forced to use the latest ROOT version.

To solve this situation I propose to take some time to understand how to ship gLike with anaconda: root is already shipped with conda, so it should be possible for gLike as well. This will also ease installation for users and allow us to provide a ROOT version compiled with the CFITSIO libraries.

The release v00.08.00 (the current one) would be shipped with an old ROOT version (e.g. 6.16, 6.18), the release incorporating PR #24, will be shipped with ROOT version > 6.21 (the one including my patch on TFITSHDU).

Developers, though, will be forced to use the newest ROOT version, but I think this would be a minor problem for us.

Let me know what you think, if you have some alternative approach in mind.

cosimoNigro commented 4 years ago

I think the problem with the CFITSIO dependency in ROOT is already solved. As I said ROOT is shipped with conda. What I learned is that the version shipped is the latest one by default (by now the one including my patch in TFITSHDU). Not only that, but ROOT is compiled with most of its extensions (including the ones we need: Minuit and CFITSIO).

I assume you have anaconda / miniconda on your machine, if you don't here the instruction to install it. Just create a new environment to install root (it might mess-up with other packages you have installed with conda), better to do it in a clean environment

cosimo [/Users/cosimo]$ conda create --name root_cern 

then activate it, and install root

cosimo [/Users/cosimo]$ source activate root_cern
(root_cern) cosimo [/Users/cosimo]$ conda install -c conda-forge root

you wait a couple of minutes and then

(root_cern) cosimo [/Users/cosimo]$ root
   ------------------------------------------------------------------
  | Welcome to ROOT 6.20/04                        https://root.cern |
  | (c) 1995-2020, The ROOT Team; conception: R. Brun, F. Rademakers |
  | Built for macosx64 on Apr 20 2020, 14:58:00                      |
  | From tag , 1 April 2020                                          |
  | Try '.help', '.demo', '.license', '.credits', '.quit'/'.q'       |
   ------------------------------------------------------------------

root [0] 

Here is a ROOT version with the CFITSIO and Minuit extensions installed, with the patch in TFITSHDU to read the migration matrix already implemented. Remember to activate the environment each time you need to use ROOT. So all our problems are solved by recommending users and developers to install root with conda. This is also the most efficient way of installing ROOT I have ever seen!

I'll keep trying to ship glike itself with conda. I had some problems with installation that I hope PR #33 could solve.

javierrico commented 4 years ago

Ok then, I think you convinced all of as about the conda solution, we're getting modern... :P We need good documentation about that in the gitHub Also, is there a way to make the possibility of reading the FITS files an option that can be activated/deactivated at compilation? Then the Makefile could check the ROOT version and activate/deactivate the compilation of this optional module on the fly. I believe that this is technically feasible, right?

javierrico commented 4 years ago

By the way I try to use the conda ROOT distribution but I have failed, maybe you can help me fix it. What I did:

conda create --name root_cern [seemed to work]

source activate root_cern [seemed to work]

conda install -c conda-forge root [took a while, did a lot of stuff and seemed to work]

root [then my old v5.34.36 root version was launched]

/Users/jrico/anaconda/envs/root_cern/bin/root objc[2754]: Class RunStopper is implemented in both /Users/jrico/anaconda/envs/root_cern/lib/libCore.6.20.so (0x103e0a660) and /cern/root/root/lib/libCore.so (0x104b7c898). One of the two will be used. Which one is undefined. Fatal in : cannot load library dlopen(/Users/jrico/anaconda/envs/root_cern/lib/libCling.so, 5): Symbol not found: __ZN8TMemFileC1EPKcRKNS_14ZeroCopyView_tE Referenced from: /Users/jrico/anaconda/envs/root_cern/lib/libCling.so Expected in: flat namespace in /Users/jrico/anaconda/envs/root_cern/lib/libCling.so

export ROOTSYS="/Users/jrico/anaconda/envs/root_cern" export PATH=".:${ROOTSYS}/bin:${MARSSYS}:/usr/local/mysql/bin:/usr/texbin:/opt/local/bin:" DYLD_LIBRARY_PATH="${ROOTSYS}/lib:${MYSQL}/lib:${MARSSYS}:${GLIKESYS}:${MDMSYS}:${CCFITSDIR}/lib:${GIRFDIR}:${MFITSDIR}:${NEXTSYS}"

root dyld: Library not loaded: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libLAPACK.dylib Referenced from: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/vecLib Reason: Incompatible library version: vecLib requires version 1.0.0 or later, but libLAPACK.dylib provides version 0.0.0 Abort trap: 6

cosimoNigro commented 4 years ago

Hi @javierrico,

1)

Also, is there a way to make the possibility of reading the FITS files an option that can be activated/deactivated at compilation?

No because now TFITSHDU is imported in the IactEventListIrf.h class definition. https://github.com/cosimoNigro/gLike/blob/read_fits_inputs/include/IactEventListIrf.h#L17 It will not compile if you do not have the FITS extension. You should exclude that class at compilation to compile gLike without the FITS extension, but that of course will not make sense.

Also when filling the migration matrix in the IactEventListIrf.cc class implementation https://github.com/cosimoNigro/gLike/blob/read_fits_inputs/src/IactEventListIrf.cc#L356 this function is called, TFITSHDU->GetTabVarLengthVectorCell, that is in available only in the most recent ROOT version (>6.20)

TArrayD *fChanArray = hduMatrix->GetTabVarLengthVectorCell(rownum, "F_CHAN");

If your concern is backward compatibility I think we have to drop it.

2)

By the way I try to use the conda ROOT distribution but I have failed, maybe you can help me fix it. What I did:

Could you retry the installation without those variables declared in your bashrc (I mean the classical ROOTSYS, MARSSYS)? I have none of them in my bashrc and it works just fine. I think that they are set automatically by the conda environment

cosimo [/Users/cosimo/work]$ source activate root_cern
(root_cern) cosimo [/Users/cosimo/work]$ echo $ROOTSYS         
/Users/cosimo/software/miniconda3/envs/root_cern
(root_cern) cosimo [/Users/cosimo/work]$ echo $PATH            
/Users/cosimo/software/miniconda3/envs/root_cern/bin:/Users/cosimo/software/miniconda3/condabin:/Users/cosimo/software/miniconda3/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/opt/X11/bin
javierrico commented 4 years ago

Hi @cosimoNigro

Also, is there a way to make the possibility of reading the FITS files an option that can be activated/deactivated at compilation?

No because now TFITSHDU is imported in the IactEventListIrf.h class definition. https://github.com/cosimoNigro/gLike/blob/read_fits_inputs/include/IactEventListIrf.h#L17 It will not compile if you do not have the FITS extension. You should exclude that class at compilation to compile gLike without the FITS extension, but that of course will not make sense.

Couldn't this be done with preprocessor directives? With this you can effectively compile (slightly) different versions of the code depending e.g. on the value of a certain environment variable (e.g. the root version). This is done quite often in Mars (grep the *.cc files for "ROOT_VERSION_CODE" to see some examples) and I think we could use the same approach here, i.e. the pieces of code that need Root6 could be surrounded by

if ROOT_VERSION_CODE > ROOT_VERSION(6,20,00)

...

ifdef

directives

If your concern is backward compatibility I think we have to drop it.

more than backward compatibilities I am concern on making access to the code easier, by providing a way to compile with previous root versions for people not using any FITS funcionality That can only sum users...

javierrico commented 4 years ago

Hi @cosimoNigro

By the way I try to use the conda ROOT distribution but I have failed, maybe you can help me fix it. What I did:

Could you retry the installation without those variables declared in your bashrc (I mean the classical ROOTSYS, MARSSYS)? I have none of them in my bashrc and it works just fine. I think that they are set automatically by the conda environment

Yes, that worked, but I still do not like it much. Some users (like me) may want to try the conda thing but without dismantling their current ROOT implementation and they will all face this problem right? Is there a way to fix this other than forcing all potential users to remove ROOTSYS from their bashrc files?

cosimoNigro commented 4 years ago

Ok I did not know you could exclude sections of the code with the preprocessor directives, I guess you want me to implement it in PR #24. I think the functions using FITS files are more or less grouped.

As for the installation I'll try to install manually ROOT myself and see how to avoid the clash.

cosimoNigro commented 4 years ago

Ok, PR #24 isolates with preprocessor directives the section of codes using TFITSHDU. You can now build gLike also with an older ROOT version, giving up the support for FITS data.

I added a note in the README.md and created a documentation page for the data formats.

I think this closes this issues, ROOT and FITS formats are defined and documented.

As for the discussion of the ROOT installation with or without conda I think it can be continued elsewhere and it's not in the scope of the discussion on the input data.

cosimoNigro commented 4 years ago

I am closing this, as PR #24 was finally approved.