Closed ctoennis closed 2 months ago
If I understand correctly, what you are basically doing here is computing an event-wise pedestal variance. In the data model (see the original top-level data model document), there are a few places you can put pedestals:
So I think what you are computing is the first one, and you should store it as pedestal information: It is in fact a second measurement of the pedestal and its variance, and could be compared to the other methods (even if the use case you are describing here is for doing star tracking, star tracking is computed from the pedestal variances, which can be over many time scales)
In addition to what has been said I'd suggest to clarify the purpose of this extractor: it is calculating the waveform sample variance, which should be reflected in the name/docstring to communicate that it has a complicated relationship to charge.
The thing we want to calculate is the variance of the waveforms specifically of interleaved pedestal events and not the variance of the pedestal in units of charge.
@kosack : What we get is a sort of image based on the variance of the waveform in each pixel. What we get is not a pedestal value, so putting it in a pedestal container is not the best option i think.
I think it is best to write a container appropriate for these variance images similar to the DL1CameraContainer
I just added a custom continer for the variance images. I also compared the variance calculation using just a plain variance and a variance of integrated chunks of the waveforms using the FixedWindowSum extractor in the plot below.
The x axis shows the number of images used in calculating an average variance value that is shown on the y axis.
Though the methods are generally statistically equivalent the average variance becomes stable with fewer images when using a simple variance over the variance from integrated chunks.
Though the methods are generally statistically equivalent the average variance becomes stable with fewer images when using a simple variance over the variance from integrated chunks.
That's not surprising, since you have just more samples. But you change of what you compute the variance
Hi all, can we restart the discussion here? I made a small update to the extractor to contain timing information that will later be needed in the corresponding subclass of StatisticsExtractor that i am working on with Tjark.
There was a decision needed as to where the variance images will be attached to. Should this find a place somewhere in dl1.event.telescope.pedestal or something similar?
Hi @ctoennis the lint step in the CI is failing due to some typos and wrong formatting.
To check locallly, make sure to install the pre-commit hooks: https://ctapipe.readthedocs.io/en/latest/developer-guide/getting-started.html#installing-ctapipe-in-development-mode
Project ID: cta-observatory_ctapipe_AY52EYhuvuGcMFidNyUs
Project ID: cta-observatory_ctapipe_AY52EYhuvuGcMFidNyUs
i fixed all the isses with the spelling, code checks etc.
i can remove trigger time. But i am not sure how to proceed on the difference to the ImageExtractor API. I could make it that the variance image is stored in a DL1ImageContainer and that it complies with the API or make it a separate class that doesn't inherit from the ImageExtractor.
I think what i generate here is a type of image as i try to look for stars in the camera FOV with them. But if it makes more sense to you it can be separate from the ImageExtractors.
The question is if it is possible to adhere to the API that is expected from the image extractors: if yes, no problem, let it be an image extractor. But that means for example the call above should work and correctly write out the data to the HDF5 file.
If you need the different API, it should be its own thing and we need to think about how to integrate it into the process tool for example.
The question is if it is possible to adhere to the API that is expected from the image extractors: if yes, no problem, let it be an image extractor. But that means for example the call above should work and correctly write out the data to the HDF5 file.
I think what this class does is not the same as an ImageExtractor
, or am I missing something? ImageExtractors
take a "movie" (a set of waveforms per pixel for one event) for a single event and convert them to set of time-independent images that must include the peak time and integral over a window, but can also optionally inlcude any other parameters of the waveform like the pulse width or variabce. Generally, there is only one ImageExtractor that is run at a during processing. That doesn't seem to be what you have here.
The idea of the ImageExtractors was that low-level routines like those used to compute peak time or integral are implemeted as basic functions taking numpy inpus, and the ImageExtractor composes several of those methods into one set of algorithms. So what I would expect here is more like a function to compute the variance, and then that function is added to one or more of the existing ImageExtractors, perhaps as an option. So that then the chosen ImageExtractor generates the required peak time and integral, but also the variance
I think what this class does is not the same as an ImageExtractor, or am I missing something?
You could think of this as an ImageExtractor
that uses the variance of each waveform to produce the image and doesn't produce a peak_time
.
I believe, it is an ImageExtractor, the __call__()
signature is of course wrong here. It should copy the one of the base extractor:
@abstractmethod
def __call__(
self, waveforms, tel_id, selected_gain_channel, broken_pixels
) -> DL1CameraContainer:
Only waveforms
are then used, the other parameters are simply ignored.
The only issue, that has to be resolved, is the return type. DL1CameraContainer
in its current shape contains a number of fields that are not needed in this case (actually, everything except image
).
I'd propose to add an intermediate container:
class CameraImageContainer(Container):
"""
Storage of output of generic camera images (any reduction of waveforms).
"""
image = Field(
None,
"Numpy array of camera image, after waveform extraction."
"Shape: (n_pixel) if n_channels is 1 or data is gain selected"
"else: (n_channels, n_pixel)",
)
class DL1CameraContainer(CameraImageContainer):
"""
Storage of output of camera calibration e.g the final calibrated
image in intensity units and the pulse time.
"""
peak_time = Field(
None,
"Numpy array containing position of the peak of the pulse as determined by "
"the extractor."
"Shape: (n_pixel) if n_channels is 1 or data is gain selected"
"else: (n_channels, n_pixel)",
)
# rest of the DL1CameraContainer fields
class VarianceContainer(CameraImageContainer):
"""
Storage of output of camera variance image.
Stores the variance of waveform in each pixel, composed as an image.
"""
variance_method = Field(
VarianceType.SIMPLE,
"Method by which the variance was calculated"
"This can either be a plain variance"
"or a variance of integrated samples",
type=VarianceType,
)
# I don't think you need anything else here. The trigger time you extract from elsewhere in the event tree.
and change the return type of base ImageExtractor
to CameraImageContainer
If it's just that the VarianceExtractor
doesn't provide a peak_time
, I think that's fine, just leave it None
(the default value).
Adding an additional field to the DL1CameraContainer
that's an "image_type" sounds like a good idea.
I would avoid having extractors that have different return types.
Following a private discussion with @maxnoe , the new Container class is not needed.
@ctoennis just use the DL1CameraContainer
as a return type, setting the extractor type as a container meta keyword. You only need to fill Image
field and leave all others as defaults (None
). In this case, the serialization will be done correctly and only the image that you need will be stored. I.e. something like this (assuming you change VarianceType enum to WAVEFORM):
def __call__(
self, waveforms, tel_id, selected_gain_channel, broken_pixels
) -> DL1CameraContainer:
container = DL1CameraContainer(image=np.nanvar(waveforms, dtype="float32", axis=2))
container.meta["ExtractionMethod"] = str(VarianceType.WAVEFORM)
return container
Since this is also somewhat connected to #1243
I.e. we could introduce a new class e.g. WaveformProperties
that runs after the CameraCalibrator
and computes stuff like variance, FWHM, etc.
Since this is also somewhat connected to #1243
I.e. we could introduce a new class e.g.
WaveformProperties
that runs after theCameraCalibrator
and computes stuff like variance, FWHM, etc.
While this might be a good idea, I'd save it for a separate refactoring PR and merge the current (small) addition as is.
Project ID: cta-observatory_ctapipe_AY52EYhuvuGcMFidNyUs
So, is this request then good to merge?
@ctoennis From my side yes, but we require two approvals and @kosack is on vacation right now. Since he had comments / requests for changes before, I'd like to wait for his approval here.
I'll take a look tomorrow or today hopefully! Might take a bit of time since there are a lot of changes.
i added a test to check that an actual variance is being calculated.
Project ID: cta-observatory_ctapipe_AY52EYhuvuGcMFidNyUs
This pull request is to add a new ImageExtractor subclass to generate images from the variance of waveforms that will be needed for the pointing calibration using stars. This request is related to issue #2542 .