gammapy / gamma-cat

An open data collection and source catalog for gamma-ray astronomy
https://gamma-cat.readthedocs.io/
BSD 3-Clause "New" or "Revised" License
15 stars 17 forks source link

Restructure of folders in output/data #216

Closed pdeiml closed 6 years ago

pdeiml commented 6 years ago

This PR handles the ideas from #198.

The new structure in the output folder is similar to the one in the input folder. The only difference is the "missing" info.yaml files which are not necessary in the output folder.

@cdeil If you agree with the first commit, I will do a "update output" commit and fix the errors of travis CI which will occur afterwards

cdeil commented 6 years ago

@pdeiml - if you look at the diff here, it's clear that you're basically doing the same change in three places. This is hard to maintain. Can you please call a make_output_path function instead and call it three times?

pdeiml commented 6 years ago

You are right. I thought about a helper function like this:

def _make_output_path(self, meta, path): if meta['file_id'] != 1: path = path / '{}-{}-{}.ecsv'.format(....) else: path = path / '{}-{}.ecsv'

but this gets compilcated due to the difference in the file ending for lc, seds and datasets and due to the missing "ds" in the filename of the dataset.

The only possibility is to write this helper function only for the ecsv-files but this is not really a gain.

cdeil commented 6 years ago

@pdeiml - I can have a look and try to simplify the code that makes the output path this afternoon. Can't we remove most of that code and just use the input path, or only modify the input path slightly?

pdeiml commented 6 years ago

What do you mean by "that code"?

cdeil commented 6 years ago

What do you mean by "that code"?

The code in gammacat/collection.py that generates the path in the output folder.

cdeil commented 6 years ago

Turns out I didn't have time to work on this today to improve the code. Fixed the test. Merging now.

cdeil commented 6 years ago

Another issue:

$ ls -1 input/data/2016/2016ApJ...819..156B
info.yaml
tev-000049-1.yaml
tev-000049-10.yaml
tev-000049-11.yaml
tev-000049-2.yaml
tev-000049-3.yaml
tev-000049-4.yaml
tev-000049-5.yaml
tev-000049-6.yaml
tev-000049-7.yaml
tev-000049-8.yaml
tev-000049-9.yaml
tev-000049-lc-1.ecsv
tev-000049-sed-1.ecsv
tev-000049-sed-2.ecsv
tev-000049-sed-3.ecsv
tev-000049-sed-4.ecsv
tev-000049.yaml
$ ls -1 output/data/2016/2016ApJ...819..156B
tev-000049-lc-1.ecsv
tev-000049-sed-1.ecsv
tev-000049-sed-2.ecsv
tev-000049-sed-3.ecsv
tev-000049-sed-4.ecsv
tev-000049.yaml

I think the problem is that those YAML files are missing a file_id, and we don't have checks for that.

I'll try to fix everything in master now, but @pdeiml - if you could have another look tomorrow, that would be great!

cdeil commented 6 years ago

I think the output dataset index is also broken ...