gammapy / gamma-cat

An open data collection and source catalog for gamma-ray astronomy
https://gamma-cat.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

Restructure and clean up of make.py #177

Open pdeiml opened 6 years ago

pdeiml commented 6 years ago

'make.py all' does a lot of things and needs some cleanup and restructuring. I start with a description of the new structure, followed by a description of the necessary cleanup.

The following list gives the future CLI commands of make.py with a description of each ('make.py all' should execute them in this order):

1) check-input Basically the same as 'make.py checks --step input'. The info.yaml files and their validation should be added to the InputData class in a clean way.

2) input-index-file Basically the same as 'make.py collection --step input-index' which creates the index file of all input data. See https://github.com/gammapy/gamma-cat/pull/175 for more details about that file.

3) catalog Basically the same as 'make.py catalog', which produces the catalog FITS, YAML and ECSV files.

4) check-catalog Basically the same as 'make.py checks --step catalog' which validates the three catalog files from step 3).

5) collection Basically the same as 'make.py collection --step [sed, lightcurve, lc]' which copies all data from the input folder to $GAMMACAT/docs/data.

6) check-collection Basically the same as 'make.py checks --step collection' which validates the data files of the collection which are produced in step 5).

7) collection-index-file Basically the same as 'make.py collection --step output-index' which creates the index file of all data in $GAMMACAT/docs/data/data.

8) webpage Basically 'make.py webpage' which is under development. See

Some of the steps above could be merged, but that can easily be done at the end.

Moreover, the way these steps work internally needs a cleanup. Put simply: steps 3) and 5) should start from the input index file instead of scanning the corresponding folder.
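To illustrate the idea of starting from the index file rather than scanning a folder, here is a minimal sketch. It assumes the input index is a JSON list of entries with `type` and `location` keys; that layout, and the names `load_index` and `select_entries`, are hypothetical and not taken from the actual file described in the PR.

```python
import json
from pathlib import Path


def load_index(index_path):
    """Read the input index file and return its list of entries.

    Assumption: the index is a JSON list of dicts (hypothetical layout).
    """
    with Path(index_path).open() as fh:
        return json.load(fh)


def select_entries(entries, kind):
    """Return the index entries of one resource type, e.g. 'sed'."""
    return [entry for entry in entries if entry.get("type") == kind]
```

A step such as 'collection' would then loop over `select_entries(load_index(path), "sed")` and only touch the files listed there, so it never has to walk the input folder itself.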

cdeil commented 6 years ago

How about adding the files from input/sources to the input index file as well (and then also to the output and output index)? That data is also needed to make the catalog, and if all data needed is available via the index files the scripts and processing will become simpler. I think I would start with that, and only then move on to rewrite the data processing code to always start with the resource index.

pdeiml commented 6 years ago

Do you want to copy the input/sources files to docs/data as well?

cdeil commented 6 years ago

> Do you want to copy the input/sources files to docs/data as well?

Yes. It should be scripted in the same way as SED and LC: there should be a class that processes it. It can start out mostly empty, i.e. making a copy, but then over time we'll add scripted fixes and additions before writing out the file, to make the output as uniform as possible.
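A rough sketch of what such a class could look like when it starts out as a plain copy. The class name `BasicSourceInfo` and its methods are made up for illustration and are not the existing SED or LC classes; the file is treated as plain text to keep the example dependency-free.

```python
from pathlib import Path


class BasicSourceInfo:
    """Hypothetical processor for one file from input/sources."""

    def __init__(self, text):
        self.text = text

    @classmethod
    def read(cls, path):
        """Load the input file as plain text."""
        return cls(Path(path).read_text())

    def process(self):
        """Hook for scripted fixes and additions; a no-op for now."""
        return self

    def write(self, path):
        """Write the (possibly fixed) content to the output location."""
        path = Path(path)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(self.text)
```

Usage would be `BasicSourceInfo.read(src).process().write(dst)`. Since `process` does nothing yet, this is just a copy, and fixes can later be added in that one place.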

pdeiml commented 6 years ago

After working on this a little, IMO points 1) and 2) of the procedure above should be swapped: once an input index file exists, it is easy to scan through its entries and check them against the schemas.
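As an illustration of why the check becomes simple once the index exists, here is a minimal sketch. The required keys in `REQUIRED_KEYS` are a stand-in for the real schemas, and both function names are hypothetical.

```python
# Stand-in schema: key name -> expected type (an assumption, not the real schema).
REQUIRED_KEYS = {"type": str, "location": str}


def check_entry(entry):
    """Return a list of problems found in one index entry."""
    errors = []
    for key, expected_type in REQUIRED_KEYS.items():
        if key not in entry:
            errors.append(f"missing key: {key}")
        elif not isinstance(entry[key], expected_type):
            errors.append(f"wrong type for {key}: {type(entry[key]).__name__}")
    return errors


def check_index(entries):
    """Map entry position to its error list, keeping only entries with problems."""
    report = {}
    for pos, entry in enumerate(entries):
        errors = check_entry(entry)
        if errors:
            report[pos] = errors
    return report
```

An empty dict returned by `check_index` means every entry passed.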

Hence, the new procedure is:

1) input-index-file Basically the same as 'make.py collection --step input-index' which creates the index file of all input data. See https://github.com/gammapy/gamma-cat/pull/175 for more details about that file.

2) check-input Basically the same as 'make.py checks --step input'. The info.yaml files and their validation should be added to the InputData class in a clean way.

3) catalog Basically the same as 'make.py catalog', which produces the catalog FITS, YAML and ECSV files.

4) check-catalog Basically the same as 'make.py checks --step catalog' which validates the three catalog files from step 3).

5) collection Basically the same as 'make.py collection --step [sed, lightcurve, lc]' which copies all data from the input folder to $GAMMACAT/docs/data.

6) check-collection Basically the same as 'make.py checks --step collection' which validates the data files of the collection which are produced in step 5).

7) collection-index-file Basically the same as 'make.py collection --step output-index' which creates the index file of all data in $GAMMACAT/docs/data/data.

8) webpage Basically 'make.py webpage' which is under development.
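The ordering above could be wired into the command line roughly like this. This is only a sketch: `argparse` is used here to keep it dependency-free, `run_step` is a placeholder, and none of it is taken from the actual make.py.

```python
import argparse

# Step names in the order 'make.py all' should run them.
STEPS = [
    "input-index-file",
    "check-input",
    "catalog",
    "check-catalog",
    "collection",
    "check-collection",
    "collection-index-file",
    "webpage",
]


def run_step(name):
    """Placeholder for the real work of one step."""
    print(f"running step: {name}")


def resolve_steps(command):
    """Return the steps for a command; 'all' expands to every step in order."""
    if command == "all":
        return list(STEPS)
    if command in STEPS:
        return [command]
    raise ValueError(f"unknown command: {command}")


def main(argv=None):
    parser = argparse.ArgumentParser(prog="make.py")
    parser.add_argument("command", choices=["all"] + STEPS)
    args = parser.parse_args(argv)
    for name in resolve_steps(args.command):
        run_step(name)
```

Calling `main(["all"])` runs every step in list order, while `main(["catalog"])` runs just that one. Merging steps later only means editing `STEPS`.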

pdeiml commented 6 years ago

First PRs handling points 1) and 2):

https://github.com/gammapy/gamma-cat/pull/211
https://github.com/gammapy/gamma-cat/pull/234