Script for gain selection (R0 to R0G), for data taken in ADHF format …

moralejo commented 9 months ago

(i.e. for data taken from July 2023). Still being tested.

codecov[bot] commented 9 months ago

Codecov Report

Attention: 12 lines in your changes are missing coverage. Please review.

Comparison is base (050056a) 72.67% compared to head (39a5027) 72.87%. Report is 2 commits behind head on main.

Files	Patch %	Lines
lstchain/scripts/lstchain_r0_to_r0g.py	90.16%	12 Missing :warning:

Additional details and impacted files

```diff @@ Coverage Diff @@ ## main #1198 +/- ## ========================================== + Coverage 72.67% 72.87% +0.19% ========================================== Files 132 133 +1 Lines 13655 13797 +142 ========================================== + Hits 9924 10054 +130 - Misses 3731 3743 +12 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

moralejo commented 9 months ago

Note: the gain - reduced files seem to be readable and apparently the data inside is correct. But I get the following warning upon reading them with ctapipe_io_lst: WARNING: Unexpected extra padding at the end of the file. This padding may not be preserved when saving changes. [astropy.io.fits.header] @maxnoe : Is this a problem? Can we avoid the warning altogether?

Note: I saw this with the EvB5 file I tested, but not with the EvB6 file (not sure the reason is the different version, though)

morcuended commented 9 months ago

Will this gain selector also work with EVB6 data?

moralejo commented 9 months ago

Will this gain selector also work with EVB6 data?

~~It should, because the format is the same (ADHF)~~. Will try it.

moralejo commented 9 months ago

Will this gain selector also work with EVB6 data?

It should, because the format is the same (ADHF). Will try it.

No, it does not work... 'File' object has no attribute 'CameraConfig' It seems that in EvB6 data it is called CameraConfiguration

moralejo commented 9 months ago

No, it does not work... 'File' object has no attribute 'CameraConfig' It seems that in EvB6 data it is called CameraConfiguration

Fixed now, there are indeed a few differences I was unaware of between EvB5 and EvB6 ADHF data.

maxnoe commented 9 months ago

What is "ADHF" and where does this acronym come from?

moralejo commented 9 months ago

What is "ADHF" and where does this acronym come from?

I am afraid I made it up, sorry. The original wording (in the ELOGs) is "ADH-ZFW", butI am not sure either whether that is official or not.

maxnoe commented 9 months ago

Ah ok. There are three major combinations of software versions I think.

They can be distinguished by looking at the EXTNAME and the PBFHEAD header keywords in the corresponding FITS HDUs.

EVBv5 + "old" ZFW (ZFitsWriter compiled from svn CamerasToACTL repository).

This is the oldest combination and it uses the ProtoR1 protocol buffer definitions with the PBFHEAD keyword set to R1.Event etc.

In [6]: f = fits.open("/fefs/aswg/data/real/R0/20201120/LST-1.1.Run02962.0000.fits.fz")

In [7]: [hdu.name for hdu in f]
Out[7]: ['PRIMARY', 'CameraConfig', 'Events']

In [8]: f['CameraConfig'].header["PBFHEAD"]
Out[8]: 'R1.CameraConfiguration'

In [9]: f['Events'].header["PBFHEAD"]
Out[9]: 'R1.CameraEvent'

EVBv5 + ADH-ZFW ("new" ZFitsWriter compiled from git array-data-handler repository).

This combination was used since mid of last year. It uses the same data model, but with a slightly different PBFHEAD keyword: ProtoR1.Event etc.

In [10]: f = fits.open("/fefs/aswg/data/real/R0/20230811/LST-1.1.Run13880.0000.fits.fz")

In [11]: [hdu.name for hdu in f]
Out[11]: ['PRIMARY', 'CameraConfig', 'Events']

In [12]: f['CameraConfig'].header["PBFHEAD"]
Out[12]: 'ProtoR1.CameraConfiguration'

In [13]: f['Events'].header["PBFHEAD"]
Out[13]: 'ProtoR1.CameraEvent'

EVBv6 + ADH-ZFW

This is using the new official CTAO R1v1 format. The extname of the camera configuration changed, the TelescopeDataStream is new etc.:

In [14]: f = fits.open("/fefs/aswg/data/real/R0/20231215/LST-1.1.Run16120.0000.fits.fz")

In [15]: [hdu.name for hdu in f]
Out[15]: ['PRIMARY', 'DataStream', 'CameraConfiguration', 'Events']

In [16]: f['DataStream'].header["PBFHEAD"]
Out[16]: 'R1v1.TelescopeDataStream'

In [17]: f['CameraConfiguration'].header["PBFHEAD"]
Out[17]: 'CTAR1.CameraConfiguration'

In [18]: f['Events'].header["PBFHEAD"]
Out[18]: 'CTAR1.Event'

I notice the PBFHEAD is inconsistent... will report to dirk and julien

maxnoe commented 9 months ago

The best solution is to treat data with R1.Event and ProtoR1.Event identically to the existing tool and add new code for dealing with the EVBv6 data.

moralejo commented 9 months ago

The best solution is to treat data with R1.Event and ProtoR1.Event identically to the existing tool and add new code for dealing with the EVBv6 data.

I don't understand - what I do now is to check empirically whether I find "CameraConfig" or "CameraConfiguration" in the file - the latter seems to be the case for EvB6 data. Isn't that enough to distinguish the two cases we find since mid-2023? There is no need for the program to work for ~~EvB5 data~~ EvB5+old ZFW data, which will were already gain-selected with your C++ code.

maxnoe commented 9 months ago

There is no need for the program to work for EvB5 data, which will were already gain-selected with your C++ code.

What identifies the new data format most clearly is the content of PBFHEAD, if it starts with R1v1 or CTAR1, it's the new official data format written by EVBv6.

That the extname of the CameraConfiguration HDU changed is more accidental...

The presence of the TelescopeDataStream is also new.

maxnoe commented 9 months ago

There is no need for the program to work for EvB5 data, which will were already gain-selected with your C++ code.

I thought there is, as the C++ code also doesn't work with case 2: EVBv5 plus ADH-ZFW. Isn't that the case?

maxnoe commented 9 months ago

However, as said above, that would be relatively easy to fix, so maybe the easiest is to fix the C++ code for the EVBv5 + ADH-ZFW case and let the code here only deal with EVBv6 aka CTAR1 aka R1v1?

moralejo commented 8 months ago

There is no need for the program to work for EvB5 data, which will were already gain-selected with your C++ code.

I thought there is, as the C++ code also doesn't work with case 2: EVBv5 plus ADH-ZFW. Isn't that the case?

Sorry, you are right. I actually meant "There is no need for the program to work for EvB5 + old ZFW data, which will were already gain-selected with your C++ code".

moralejo commented 8 months ago

However, as said above, that would be relatively easy to fix, so maybe the easiest is to fix the C++ code for the EVBv5 + ADH-ZFW case and let the code here only deal with EVBv6 aka CTAR1 aka R1v1?

This script already works with both, using the "CameraConfig" vs. "CameraConfiguration" approach to identify what type of input data is being read in.

moralejo commented 8 months ago

About this warning: WARNING: Unexpected extra padding at the end of the file. This padding may not be preserved when saving changes. [astropy.io.fits.header]

Is there any way of getting rid of it?

maxnoe commented 8 months ago

Is this happening for data where no event was written maybe? protozfits doesn't properly create / close the files if no event was ever written.

moralejo commented 8 months ago

Is this happening for data where no event was written maybe? protozfits doesn't properly create / close the files if no event was ever written.

No, there were events written out. But this only happened with the EvB v5 file I tested (not for EvB6 data, CTAR1 aka R1v1, which are the ones for which we will use this script)

moralejo commented 8 months ago

@maxnoe can we approve and merge this? If we get it in the next release we can start gain-selecting the data as part of the onsite automatic processing (hence reduce the disk-filling rate)

maxnoe commented 8 months ago

@moralejo We can remove the EVBv5 parts here, right? We will handle all the EVBv5 data with the old, c++ based tool, correct?

It also would be good to test this script (all lines are untested as of now). I can create a small sample with the zfitsrecompressor. Do you have an suitable run number for this?

moralejo commented 8 months ago

@moralejo We can remove the EVBv5 parts here, right? We will handle all the EVBv5 data with the old, c++ based tool, correct?

Well, apparently it works, and I think it does not harm to have it available in python form, even if we are not now using it for those data.

It also would be good to test this script (all lines are untested as of now). I can create a small sample with the zfitsrecompressor. Do you have an suitable run number for this?

I used Run13631.0042 (20230708), but any run should do.

moralejo commented 8 months ago

@maxnoe @morcuended apologies... FYI: I just realized I had some changes not yet pushed (which were actually necessary to make the script work for EvB6). I will try to check your comments to the previous version and see if they still apply.

moralejo commented 8 months ago

@maxnoe @morcuended apologies... FYI: I just realized I had some changes not yet pushed (which were actually necessary to make the script work for EvB6). I will try to check your comments to the previous version and see if they still apply.

And I may have made something weird when merging the changes in main... should they appear explicitly (as they do) in this PR?

moralejo commented 8 months ago

Apart from having a small test sample, I suggest testing it over an entire night.

I agree, but I prefer to make a first release and try it on a full night.

morcuended commented 8 months ago

@maxnoe @morcuended apologies... FYI: I just realized I had some changes not yet pushed (which were actually necessary to make the script work for EvB6). I will try to check your comments to the previous version and see if they still apply.

And I may have made something weird when merging the changes in main... should they appear explicitly (as they do) in this PR?

I think they were merged instead of rebasing your branch with the main. I think it's not a problem anyway

maxnoe commented 8 months ago

They shouldn't appear like this in either case

moralejo commented 8 months ago

They shouldn't appear like this in either case

May have to do with me committing the change in December, but not pushing until today. And between the two I merged the changes in main.

maxnoe commented 8 months ago

I used Run13631.0042 (20230708), but any run should do.

That's an EVBv5 run. What would be a suitable EVBv6 run?

moralejo commented 8 months ago

I tested on Run16102.0010 from 20231214

maxnoe commented 8 months ago

I added now

├── R0
│   └── 20231218
│       ├── LST-1.1.Run16231.0000_first50.fits.fz
│       ├── LST-1.2.Run16231.0000_first50.fits.fz
│       ├── LST-1.3.Run16231.0000_first50.fits.fz
│       └── LST-1.4.Run16231.0000_first50.fits.fz

to the test data

maxnoe commented 8 months ago

Concerning the EVBv6 data, I think there were some where the flatfield events were not tagged correctly due to a misconfiguration and we need to use the flatfield heuristic to tag those.

We could fix the event type for those here?

morcuended commented 8 months ago

Concerning the EVBv6 data, I think there were some where the flatfield events were not tagged correctly due to a misconfiguration and we need to use the flatfield heuristic to tag those.

We could fix the event type for those here?

That was the night of 2023-11-10

moralejo commented 8 months ago

Concerning the EVBv6 data, I think there were some where the flatfield events were not tagged correctly due to a misconfiguration and we need to use the flatfield heuristic to tag those.

We could fix the event type for those here?

I followed the approach "touch as little as possible" that we have followed so far. This would be relatively easy (and anyway, if the heuristic was wrong the gain selection would also be wrong), but it would be confusing to have gain-selected data with fixed trigger tags, and others (all those we reduced in the past, with the C++ tool) in which the original trigger tag is kept.

maxnoe commented 8 months ago

Event type doesn't exist in evbv5 data

moralejo commented 8 months ago

Event type doesn't exist in evbv5 data

Ah ok, that would make it more clear (I think back then we discussed modifying the trigger type).

If we go for that, how would we do it? I guess we do not want to overwrite event_type when it is correct. Should it be a command-line option and then apply it on runs we positively know have issues with the tagging?

moralejo commented 8 months ago

I added now

├── R0
│   └── 20231218
│       ├── LST-1.1.Run16231.0000_first50.fits.fz
│       ├── LST-1.2.Run16231.0000_first50.fits.fz
│       ├── LST-1.3.Run16231.0000_first50.fits.fz
│       └── LST-1.4.Run16231.0000_first50.fits.fz

to the test data

Thanks, I will do the test, but we need also the Drive file and the RunSummary file for that night, can you please include them, in the same directory?

moralejo commented 8 months ago

Tests will fail, I know, I need to do further changes...

maxnoe commented 8 months ago

Thanks, I will do the test, but we need also the Drive file and the RunSummary file for that night, can you please include them, in the same directory?

No problem in principle. On the other hand I wonder why they are needed at all..

moralejo commented 8 months ago

Thanks, I will do the test, but we need also the Drive file and the RunSummary file for that night, can you please include them, in the same directory?

No problem in principle. On the other hand I wonder why they are needed at all..

I use the LST event source to go over the events and get the right event types (e.g. to be safe against "ucts jumps" - I don't really know if these can ever re-appear). So the Run summary is needed (to get the ref values for computing the dragon_time), and the Drive is not but, I think the event source requires it anyway.

maxnoe commented 8 months ago

The jumps part of the code is not used for evbv6 data, so no need for the run summary and you can pass pointing_information=False to the source to not load and interpolate pointing.

moralejo commented 8 months ago

The jumps part of the code is not used for evbv6 data, so no need for the run summary and you can pass pointing_information=False to the source to not load and interpolate pointing.

In principle this works also for EvB5 data with new ZFW... that is why we needed (just in case) the Run summary. But since we no longer need it for EvB5... I guess we can drop all that. Thanks, will do!

moralejo commented 7 months ago

~~... and tests now fail because of non-standard input filename "first50" combined with use of lstchain functions which expect standard name for a raw fits.fz file~~ :'-D Solved

cta-observatory / cta-lstchain

Script for gain selection (R0 to R0G), for data taken in ADHF format … #1198

Codecov Report