cta-observatory / ctapipe

Low-level data processing pipeline software for CTAO or similar arrays of Imaging Atmospheric Cherenkov Telescopes

https://ctapipe.readthedocs.org

BSD 3-Clause "New" or "Revised" License

64 stars 268 forks source link

Change R1- and DL0-waveforms shape to (n_channels, n_pixels, n_samples) #2529

Closed Hckjs closed 5 months ago

Hckjs commented 6 months ago

This PR basically adds three (somehow dependent) features:

1) Changing the R1 and DL0 waveforms shape to be always (n_channels, n_pixels, n_samples).

2) Implementing 1) in such a general way (and two lines of code) allows the ImageExtractors to handle not gain selected data (by passing selected_gain_channel=None)

3) To test these i've adapted the WaveformToymodel to be able to create 2 gain waveforms. All extractors are now tested for different camera types (including different number of gain channels)

Things that probably still need to be discussed:

I don't think it will be needed, but for now the TwoPassWindowSum does not support for 2 gain channels
There was already some discussion about splitting the DL1CameraContainer considering the two different ndims of image and peak_time
With the current API of the GainSelector it cannot distinguish between already gain selected data and data that originally consist of just one gain channel. We may pass something like camera.readout.n_channels?

Closes #1836

Thanks already to @TjarkMiener for the discussion and input!

ctao-dpps-sonarqube[bot] commented 6 months ago

Failed

5.48% Duplicated Lines (%) on New Code (is greater than 3.00%)

Analysis Details

3 Issues

0 Bugs
0 Vulnerabilities
3 Code Smells

Coverage and Duplications

96.30% Coverage (92.60% Estimated after merge)
5.48% Duplicated Code (0.80% Estimated after merge)

Project ID: cta-observatory_ctapipe_AY52EYhuvuGcMFidNyUs

View in SonarQube

maxnoe commented 6 months ago

I don't think it will be needed, but for now the TwoPassWindowSum does not support for 2 gain channels

I tend agree. At least for now we can leave this unsupported. Just make sure there is a nice error message in case someone tries to apply it to two-gain data

codecov[bot] commented 6 months ago

Codecov Report

Attention: Patch coverage is 95.37954% with 14 lines in your changes are missing coverage. Please review.

Project coverage is 92.67%. Comparing base (e2a848e) to head (cfad23c). Report is 20 commits behind head on main.

:exclamation: Current head cfad23c differs from pull request most recent head e37bb41. Consider uploading reports for the commit e37bb41 to get more accurate results

Files	Patch %	Lines
src/ctapipe/image/extractor.py	73.80%	11 Missing :warning:
src/ctapipe/calib/camera/calibrator.py	91.66%	1 Missing :warning:
src/ctapipe/image/invalid_pixels.py	95.23%	1 Missing :warning:
src/ctapipe/io/hdf5eventsource.py	66.66%	1 Missing :warning:

Additional details and impacted files

```diff @@ Coverage Diff @@ ## main #2529 +/- ## ========================================== + Coverage 92.66% 92.67% +0.01% ========================================== Files 232 231 -1 Lines 20220 20292 +72 ========================================== + Hits 18736 18805 +69 - Misses 1484 1487 +3 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

maxnoe commented 6 months ago

I guess with this we break the reading-in of R0, R1 and DL0 data written into HDF5 files? I.e. this is a data model change that requires changes in the compatibility table of the HDF5EventSource?

ctao-dpps-sonarqube[bot] commented 6 months ago

Failed

5.51% Duplicated Lines (%) on New Code (is greater than 3.00%)

Analysis Details

3 Issues

0 Bugs
0 Vulnerabilities
3 Code Smells

Coverage and Duplications

96.20% Coverage (92.60% Estimated after merge)
5.51% Duplicated Code (0.80% Estimated after merge)

Project ID: cta-observatory_ctapipe_AY52EYhuvuGcMFidNyUs

View in SonarQube

TjarkMiener commented 6 months ago

I guess with this we break the reading-in of R0, R1 and DL0 data written into HDF5 files? I.e. this is a data model change that requires changes in the compatibility table of the HDF5EventSource?

At the moment (lst-cluster down), I can only test it with the LST-1 HDF5 files of R1 calibration data, which comes with two gains. This is working fine for me with the HDF5EventSource. I only have an (unrelated) issue with the coordinate frame when filling the array pointing in the HDF5EventSource. I will have a look at the DL0 LST-1 ACADA data once the lst-cluster is running.

File "/Users/tjarkmiener/calibration/2529/ctapipe/src/ctapipe/io/hdf5eventsource.py", line 632, in _generator
  self._fill_array_pointing(data)
 File "/Users/tjarkmiener/calibration/2529/ctapipe/src/ctapipe/io/hdf5eventsource.py", line 716, in _fill_array_pointing
  raise ValueError(f"Unsupported pointing frame: {frame}")
ValueError: Unsupported pointing frame: CoordinateFrameType.UNKNOWN

maxnoe commented 6 months ago

At the moment (lst-cluster down), I can only test it with the LST-1 HDF5 files of R1 calibration data, which comes with two gains. This is working fine for me with the HDF5EventSource.

That case will work fine, the issue will be reading single-gain data written with older versions which will result in the 2d shape that is written into the file not the 3d shape expected now

maxnoe commented 6 months ago

I will have a look at the DL0 LST-1 ACADA data once the lst-cluster is running.

I don't understand the relevance of DL0 ACADA (zfits) data in the discussion about the HDF5 event source...

TjarkMiener commented 6 months ago

That case will work fine, the issue will be reading single-gain data written with older versions which will result in the 2d shape that is written into the file not the 3d shape expected now

Yes, sorry! This is what I meant. I accidentally thought that this applies to DL0 ACADA data. However, as you said it's stored (of course) in zfits and not HDF5!

mexanick commented 6 months ago

That case will work fine, the issue will be reading single-gain data written with older versions which will result in the 2d shape that is written into the file not the 3d shape expected now

Is there a good reason to care about backward compatibility here? I mean, do you want a workaround to be implemented to remain compatible with files written by previous versions or just update the is_compatible method of HDF5EventSource to provide a correct compatibility info?

Edit: if we don't care about backwards compatibility, one line change here shall be enough, right?

TjarkMiener commented 6 months ago

I have tested it with an old LST-1 sim file in hdf5, where the R1 waveform data is stored with 2 dims, i.e. shape of waveform: (1855,40). In case you would like to have backward compatibility, I did the following: I expanded the dimension here of the waveforms:

                    data.r1.tel[tel_id] = next(waveform_readers[key])
                    if data.r1.tel[tel_id]["waveform"].ndim < 3:
                        data.r1.tel[tel_id]["waveform"] = np.expand_dims(data.r1.tel[tel_id]["waveform"], axis=0)

Since the r1.pixel_status is None, I introduced a check here

         dl0_pixel_status = None
         if r1.pixel_status is not None:
             dl0_pixel_status = r1.pixel_status.copy()
             # set dvr pixel bit in pixel_status for pixels kept by DVR
             dl0_pixel_status[signal_pixels] |= PixelStatus.DVR_STORED_AS_SIGNAL
             # unset dvr bits for removed pixels
             dl0_pixel_status[~signal_pixels] &= ~np.uint8(PixelStatus.DVR_STATUS)

I'm having some trouble with self.reference_location (this is "None" with my file) in SubarrayDescription which I don't know how to solve. Any ideas? I hope it can be useful!

maxnoe commented 6 months ago

Is there a good reason to care about backward compatibility here? I mean, do you want a workaround to be implemented to remain compatible with files written by previous versions or just update the is_compatible method of HDF5EventSource to provide a correct compatibility info?

I don't have a strong opinion here, but we need to do one of two things:

Raise a nice error in case we encounter old data we do not support anymore
Add a little bit of code to reshape the 2d data into the now expected 3d shape

TjarkMiener commented 6 months ago

I only have an (unrelated) issue with the coordinate frame when filling the array pointing in the HDF5EventSource.

Due to some version incompatibilities, the array pointing information was not written to the HDF5 file. When reading the incomplete/corrupted file the HDF5EventSource complains, which is the correct behavior. Therefore, the quoted sentence can be ignored.

TjarkMiener commented 6 months ago

Raise a nice error in case we encounter old data we do not support anymore

I'm in favor of raising a nice error. I don't think there are many (and crucial) R1 or DL0 datasets produced in HDF5 following the 2d waveform shape. Most users are only storing DL1b and/or DL1a. (I actually have a R1 production of LST-1 sims for DL studies. However, it can be easily reprocessed with this PR!)

ctao-dpps-sonarqube[bot] commented 6 months ago

Failed

5.39% Duplicated Lines (%) on New Code (is greater than 3.00%)

Analysis Details

3 Issues

0 Bugs
0 Vulnerabilities
3 Code Smells

Coverage and Duplications

96.30% Coverage (92.60% Estimated after merge)
5.39% Duplicated Code (0.80% Estimated after merge)

Project ID: cta-observatory_ctapipe_AY52EYhuvuGcMFidNyUs

View in SonarQube

ctao-dpps-sonarqube[bot] commented 6 months ago

Failed

4.95% Duplicated Lines (%) on New Code (is greater than 3.00%)

Analysis Details

3 Issues

0 Bugs
0 Vulnerabilities
3 Code Smells

Coverage and Duplications

97.50% Coverage (92.60% Estimated after merge)
4.95% Duplicated Code (0.80% Estimated after merge)

Project ID: cta-observatory_ctapipe_AY52EYhuvuGcMFidNyUs

View in SonarQube

ctao-dpps-sonarqube[bot] commented 6 months ago

Failed

4.89% Duplicated Lines (%) on New Code (is greater than 3.00%)

Analysis Details

3 Issues

0 Bugs
0 Vulnerabilities
3 Code Smells

Coverage and Duplications

97.50% Coverage (92.00% Estimated after merge)
4.89% Duplicated Code (0.80% Estimated after merge)

Project ID: cta-observatory_ctapipe_AY52EYhuvuGcMFidNyUs

View in SonarQube

maxnoe commented 6 months ago

The docs failure is due to an unclosed tables file, I think the one that is opened on line 222 of examples/core/table_writer_reader.py.

Either you close the file, or you add an exception for that warning into the docs conf

ctao-dpps-sonarqube[bot] commented 6 months ago

Failed

4.89% Duplicated Lines (%) on New Code (is greater than 3.00%)

Analysis Details

3 Issues

0 Bugs
0 Vulnerabilities
3 Code Smells

Coverage and Duplications

97.50% Coverage (92.00% Estimated after merge)
4.89% Duplicated Code (0.80% Estimated after merge)

Project ID: cta-observatory_ctapipe_AY52EYhuvuGcMFidNyUs

View in SonarQube

ctao-dpps-sonarqube[bot] commented 6 months ago

Failed

4.99% Duplicated Lines (%) on New Code (is greater than 3.00%)

Analysis Details

3 Issues

0 Bugs
0 Vulnerabilities
3 Code Smells

Coverage and Duplications

97.50% Coverage (92.00% Estimated after merge)
4.99% Duplicated Code (0.80% Estimated after merge)

Project ID: cta-observatory_ctapipe_AY52EYhuvuGcMFidNyUs

View in SonarQube

ctao-dpps-sonarqube[bot] commented 5 months ago

Passed

Analysis Details

5 Issues

0 Bugs
0 Vulnerabilities
5 Code Smells

Coverage and Duplications

97.10% Coverage (92.70% Estimated after merge)
1.00% Duplicated Code (0.70% Estimated after merge)

Project ID: cta-observatory_ctapipe_AY52EYhuvuGcMFidNyUs

View in SonarQube

ctao-dpps-sonarqube[bot] commented 5 months ago

Failed

7.09% Duplicated Lines (%) on New Code (is greater than 3.00%)

Analysis Details

5 Issues

0 Bugs
0 Vulnerabilities
5 Code Smells

Coverage and Duplications

96.50% Coverage (92.70% Estimated after merge)
7.09% Duplicated Code (0.80% Estimated after merge)

Project ID: cta-observatory_ctapipe_AY52EYhuvuGcMFidNyUs

View in SonarQube

ctao-dpps-sonarqube[bot] commented 5 months ago

Passed

Analysis Details

5 Issues

0 Bugs
0 Vulnerabilities
5 Code Smells

Coverage and Duplications

96.50% Coverage (92.70% Estimated after merge)
2.13% Duplicated Code (0.70% Estimated after merge)

Project ID: cta-observatory_ctapipe_AY52EYhuvuGcMFidNyUs

View in SonarQube

ctao-dpps-sonarqube[bot] commented 5 months ago

Passed

Analysis Details

5 Issues

0 Bugs
0 Vulnerabilities
5 Code Smells

Coverage and Duplications

96.50% Coverage (92.70% Estimated after merge)
2.13% Duplicated Code (0.70% Estimated after merge)

Project ID: cta-observatory_ctapipe_AY52EYhuvuGcMFidNyUs

View in SonarQube

ctao-dpps-sonarqube[bot] commented 5 months ago

Failed

8.62% Duplicated Lines (%) on New Code (is greater than 3.00%)

Analysis Details

5 Issues

0 Bugs
0 Vulnerabilities
5 Code Smells

Coverage and Duplications

97.20% Coverage (92.70% Estimated after merge)
8.62% Duplicated Code (0.80% Estimated after merge)

Project ID: cta-observatory_ctapipe_AY52EYhuvuGcMFidNyUs

View in SonarQube

ctao-dpps-sonarqube[bot] commented 5 months ago

Failed

9.76% Duplicated Lines (%) on New Code (is greater than 3.00%)

Analysis Details

7 Issues

0 Bugs
0 Vulnerabilities
7 Code Smells

Coverage and Duplications

97.80% Coverage (92.80% Estimated after merge)
9.76% Duplicated Code (0.90% Estimated after merge)

Project ID: cta-observatory_ctapipe_AY52EYhuvuGcMFidNyUs

View in SonarQube

maxnoe commented 5 months ago

The duplicated lines are in code we just "touched" not actually new duplicated code.

Since it is test code, I'd rather not do a large refactoring here that might change the tests inadvertently

mexanick commented 5 months ago

@kosack @maxnoe @Tobychev is there anything apart from the issue with the documentation that holds this PR?

maxnoe commented 5 months ago

Not from my side, docs are fixed now on main, please rebase

ctao-dpps-sonarqube[bot] commented 5 months ago

Failed

3.58% Duplicated Lines (%) on New Code (is greater than 3.00%)

Analysis Details

7 Issues

0 Bugs
0 Vulnerabilities
7 Code Smells

Coverage and Duplications

97.30% Coverage (92.80% Estimated after merge)
3.58% Duplicated Code (0.70% Estimated after merge)

Project ID: cta-observatory_ctapipe_AY52EYhuvuGcMFidNyUs

View in SonarQube

ctao-dpps-sonarqube[bot] commented 5 months ago

Failed

3.58% Duplicated Lines (%) on New Code (is greater than 3.00%)

Analysis Details

7 Issues

0 Bugs
0 Vulnerabilities
7 Code Smells

Coverage and Duplications

97.30% Coverage (92.80% Estimated after merge)
3.58% Duplicated Code (0.70% Estimated after merge)

Project ID: cta-observatory_ctapipe_AY52EYhuvuGcMFidNyUs

View in SonarQube