cta-observatory / ctapipe

Low-level data processing pipeline software for CTAO or similar arrays of Imaging Atmospheric Cherenkov Telescopes
https://ctapipe.readthedocs.org
BSD 3-Clause "New" or "Revised" License
65 stars 269 forks source link

PedestalIntegrator does not work due to expected image shapes after #2529 #2631

Open maxnoe opened 3 weeks ago

maxnoe commented 3 weeks ago

Discussed in https://github.com/cta-observatory/ctapipe/discussions/2630

Originally posted by **cbadams2** October 29, 2024 I may be missing something, but I've been struggling to find clear directions for how best to populate the `PedestalContainer` for an event - specifically the `charge_std` field. Of course, this is a population-wise number, so it needs to be generated from a number of samples. My naive attempt was to do: ``` ped_calculator = PedestalIntegrator( subarray=source.subarray, charge_product="FixedWindowSum", sample_size=10000, tel_id=2, ) source = SimTelEventSource(filename, focal_length_choice='EQUIVALENT', max_events=10001) for event in source: calib = CameraCalibrator(subarray=source.subarray) calib(event) ped_calculator.calculate_pedestals(event) ``` but I immediately encounter an error with the following (truncated) traceback. ``` --------------------------------------------------------------------------- AxisError Traceback (most recent call last) Cell In[154], line 5 3 calib = CameraCalibrator(subarray=source.subarray) 4 calib(event) ----> 5 ped_calculator.calculate_pedestals(event) 6 print(event) File ~/anaconda3/envs/ctapipe/lib/python3.12/site-packages/ctapipe/calib/camera/pedestals.py:264, in PedestalIntegrator.calculate_pedestals(self, event) 261 if not dl1.is_valid: 262 return False --> 264 self.collect_sample(dl1.image, pixel_mask) 266 sample_age = (trigger_time - self.time_start).to_value(u.s) 268 # check if to create a calibration event File ~/anaconda3/envs/ctapipe/lib/python3.12/site-packages/ctapipe/calib/camera/pedestals.py:303, in PedestalIntegrator.collect_sample(self, charge, pixel_mask) 300 """Collect the sample data""" 302 good_charge = np.ma.array(charge, mask=pixel_mask) --> 303 charge_median = np.ma.median(good_charge, axis=1) 305 self.charges[self.n_events_seen] = charge 306 self.sample_masked_pixels[self.n_events_seen] = pixel_mask ``` This is unsurprising to me, because the shape of the dl1 image I'm working with will be `(11328,)`, so won't have the dimension corresponding to axis=1. Am I doing this at all right? Is this a bug? Any help would be appreciated.
maxnoe commented 3 weeks ago

Hi @cbadams2 ,

Thanks for your interest in using the calibration code in ctapipe.

Before I answer what you need to do to fix the exception you mentioned, let me say that this part of the code is relatively old, mostly untested and has significantly changed downstream in the only place we know it has been used, the calibration code for LST-1.

It is currently being rewritten, and we are making progress in doing so, see recent PRs:

2554, #2604, #2609 and the discussions there-in.

The final step of actually computing the calibration coefficients will probably not be in ctapipe itself, at least not for some time, but in the codebase of the CTAO calibpipe, which is under development here: https://gitlab.cta-observatory.org/cta-computing/dpps/calibrationpipeline/calibpipe

That being said, you encountered a bug that I think stems from recent changes of the expected waveform shape we made in https://github.com/cta-observatory/ctapipe/pull/2529

So I'll transfer this from discussions to issues.

mexanick commented 3 weeks ago

can it be because the SimTelEventSource returns a wrong waveform shape (not augmented by a new dimension) in case the number of channels was 1? We have this workaround for the HDF5 event source, but I can't find a similar one for SimTelEventSource...

https://github.com/cta-observatory/ctapipe/blob/fa4dac63bd1ed88b833f16f385fe4e94d397564a/src/ctapipe/io/hdf5eventsource.py#L655-L658

@cbadams2, can you provide a short version of your simulation file (a few events) for testing?

maxnoe commented 3 weeks ago

can it be because the SimTelEventSource returns a wrong waveform shape (not augmented by a new dimension) in case the number of channels was 1?

No, that's not the case:

In [1]: from ctapipe.io import EventSource

In [2]: s = EventSource("dataset://gamma_prod5.simtel.zst")

In [3]: e = next(iter(s))

In [4]: e.trigger.tels_with_trigger
Out[4]: array([  9,  14, 104], dtype=int16)

In [5]: e.r1.tel[14].waveform.shape
Out[5]: (1, 1764, 25)
maxnoe commented 3 weeks ago

The issue is that the SimTelEventSource sets selected_gain_channel, which for the ImageExtractor is the sign to remove the outer dimension, since gain selection has already happened.

The solution then as two aspects, which we should probably implement both:

a) Perform no gain selection on the SimTelEventSource b) Make the calibration calculators work on gain-selected calibration events

a) Is already implemented. The default is to not gain-select calibration events, so if your are only passing calibration events, it should already work. If you really want to pass physics events for some reason to the pedestal calculator, you need to set select_gain=False for the SimTelEventSource, and indeed this works:

def test_integrator_simtel():
    from ctapipe.io import EventSource
    from ctapipe.calib.camera.pedestals import PedestalIntegrator

    tel_id = 14
    with EventSource("dataset://gamma_prod5.simtel.zst", allowed_tels={tel_id}, select_gain=False) as source:
        ped_calculator = PedestalIntegrator(
            subarray=source.subarray,
            charge_product="FixedWindowSum",
            sample_size=10000,
            tel_id=tel_id,
        )

        n_events = 0
        for event in source:
            ped_calculator.calculate_pedestals(event)
            n_events += 1

        assert n_events > 0

Could you provide your input file where this fails?

maxnoe commented 3 weeks ago

Here is the example for LST data that contains pedestal events, with a filter for the event type:

def test_integrator_simtel():
    """Test the integrator works on actual simulation events"""
    from ctapipe.io import EventSource
    from ctapipe.calib.camera.pedestals import PedestalIntegrator

    # test on actual calibration events from LST
    tel_id = 1
    input_url = "dataset://lst_prod3_calibration_and_mcphotons.simtel.zst"
    with EventSource(input_url, focal_length_choice="EQUIVALENT", skip_calibration_events=False) as source:
        ped_calculator = PedestalIntegrator(
            subarray=source.subarray,
            charge_product="FixedWindowSum",
            sample_size=10000,
            tel_id=tel_id,
        )

        n_events = 0
        for event in source:
            if event.trigger.event_type != EventType.SKY_PEDESTAL:
                continue

            ped_calculator.calculate_pedestals(event)
            n_events += 1

        assert n_events > 0
cbadams2 commented 3 weeks ago

Thanks for the additional comments. My test dataset is dataset://proton_40deg_0deg_run2083___cta-prod3-sct_desert-2150m-Paranal-SCT.simtel.gz. I see it does seem to work when I run:

source = SimTelEventSource(filename, focal_length_choice='EQUIVALENT', max_events=10, select_gain=False,  skip_calibration_events=False, allowed_tels={2})
for event in source:
    calib = CameraCalibrator(subarray=source.subarray)
    calib(event)

    ped_calculator.calculate_pedestals(event)

I will admit to being quite naive here and not knowing what exactly this this is doing. It's difficult to tell how the pedestal is being extracted from the waveforms, and which pixels it is pulling them from. Will it be contaminated by the signal portion of the waveforms/images?

In each event iteration, I see that event.mon.tel[2].pedestal.charge_std.shape maintains the shape (1, 11328). I suppose this must mean that the charge_std is an iteratively calculated variable that calculates the standard deviation of each pixel's charge distribution as events are iterated over.. Is the per pixel charge distribution retained at all in this process?

maxnoe commented 3 weeks ago

dataset://proton_40deg_0deg_run2083___cta-prod3-sct_desert-2150m-Paranal-SCT.simtel.gz

This dataset does not seem to exist on the ctapipe test server. Are you using your own resources repository?

I will admit to being quite naive here and not knowing what exactly this this is doing. It's difficult to tell how the pedestal is being extracted from the waveforms, and which pixels it is pulling them from.

The PedestalIntegrator will extract an image for each event using its own ImageExtractor and collect these images.

Then, every sample_size events (or time based) it will compute statistics on the collected events for each pixel.

Without knowing what kind of file you use, I'd expect it to only contain physics events, which will mean that you compute the charge std including the pixels that contain actual Cherenkov light.