cta-observatory / ctapipe

Low-level data processing pipeline software for CTAO or similar arrays of Imaging Atmospheric Cherenkov Telescopes
https://ctapipe.readthedocs.org
BSD 3-Clause "New" or "Revised" License
62 stars 266 forks source link

Better names for Hillas parameters #1078

Open watsonjj opened 5 years ago

watsonjj commented 5 years ago

In a recent PR it was suggested that the Hillas parameters are renamed to be more self-descriptive (#771). I think this was a good suggestion, so I am creating this issue so that it is not forgotten.

Additionally, in the current docstring for the hillas implementation, the only reference given is:

    Implementation uses a PCA analogous to the implementation in
    src/main/java/fact/features/HillasParameters.java
    from
    https://github.com/fact-project/fact-tools

Which isn't very informative for the meaning of the parameters. Earlier Hillas implementations linked to papers from which the implementation and definitions were described. I suggest that this is also rectified.

watsonjj commented 5 years ago

"Recent PR".... actually almost a year ago!

watsonjj commented 5 years ago

Connected to #737, #150 and #358

watsonjj commented 5 years ago

Is this the correct definition of phi and psi currently in ctapipe?

image

vuillaut commented 5 years ago

Is this the correct definition of phi and psi currently in ctapipe?

image

Yes. To be precise, psi is defined between X-axis and Primary Image Axis, the shower source position could be off the Primary Image Axis and does not enter in the calculation of psi.

maxnoe commented 5 years ago

In Magic/FACT nomenclature, the angle between shower axis and connection of cog and source position would be alpha.

kosack commented 5 years ago

@watsonjj No, the hillas parameters don't know anything about the source position (which is not known - that's something you assume later when doing an analysis, and we'd call the angle between the image axis and the proposed source position alpha as @MaxNoe said).

kosack commented 5 years ago

the current code is:

class HillasParametersContainer(Container):
    intensity = Field(nan, 'total intensity (size)')

    x = Field(nan, 'centroid x coordinate')
    y = Field(nan, 'centroid x coordinate')
    r = Field(nan, 'radial coordinate of centroid')
    phi = Field(nan, 'polar coordinate of centroid', unit=u.deg)

    length = Field(nan, 'RMS spread along the major-axis')
    width = Field(nan, 'RMS spread along the minor-axis')
    psi = Field(nan, 'rotation angle of ellipse', unit=u.deg)

    skewness = Field(nan, 'measure of the asymmetry')
    kurtosis = Field(nan, 'measure of the tailedness')

and I think they are well described, no? We could rename some, but I don't see a major benefit, other than maybe changing psi to orientation_angle, and perhaps going back to centroid_x, centroid_y, centroid_r, centroid_phi, which I think we actually used in an older version of ctapipe...)

kosack commented 5 years ago

The documentation does need a diagram, however!

Also, I think we could remove intensity and create an IntensityParametersContainer that has some more details, like for example intensity, intensity_error_std, intensity_brightest_pixel_1, intensity_brightest_pixel_2, intensity_median, and any other parameters that might be interesting. Some of those are interesting features for machine learning, and some are very useful for e.g. identifying direct Cherenkov light for high-Z nucleii.

watsonjj commented 5 years ago

Okay, my previous plot was misleading by having the source position labelled. Here is a better plot:

image

and I think they are well described, no?

Ah, I didn't think to check the container itself, I just had the parameters names as reference. Looking at the code itself, it is fairly straight forward to figure out their definitions, but I did like a previous suggestion of instead of using generic names like phi and psi, they are named instead as something like cog_polar_angle and rot_angle_ellipse as suggested in #771.

Also, I think we could remove intensity and create an IntensityParametersContainer

Adding things like intensity_error_std, intensity_brightest_pixel_1, intensity_brightest_pixel_2, intensity_median would certainly be great for reconstruction. But I'm unsure about adding an extra level of container inside the hillas container.

There are other parameters that are used for reconstruction, such as time gradient and leakage. But at what point do they belong in a more general ImageParameterizationContainer instead of inside the HillasParametersContainer?

kosack commented 5 years ago

I wouldn't add the intensity container inside Hillas, just as a separate container, and separate calculation independent of the hillas_parameters() func. Of course, including intensity in HillasParameters is still ok, since you have to calculate it anyhow.

From your diagram, it's not so easy to understand that psi=0 when the ellipse is aligned to the x-axis since it looks like it has to intersect with the y=0 line (even though it's equivalent due to trigonometry). It might be clearer just add an x-axis coming from the centroid position, and label it there.

image

kosack commented 5 years ago

Distance can also be ambiguous, since in previous experiments, it was the distance to the center of the camera (though that was also the source position for most whipple observations), e.g. in HESS distance is the core to camera-center. Which is why just giving the vector of the centroid is better (relative always to 0,0 at the center of the camera, and define the source position as something else)

kosack commented 5 years ago

By the way, these were originally named "MomentParameters" since in the original Hillas papers, they are actually quite different (he defined also "miss" and a few others that aren't so useful if the source is not at the center of the camera)

kosack commented 5 years ago

Anyhow, I fully support renaming the phi and psi parameters to one of the suggestions here or in the other issue.

watsonjj commented 5 years ago

image

kosack commented 5 years ago

I like it! Could you put it in the documentation perhaps? At least under the image module API documentation (could add it as a script rather than a hard-coded graphic).

If we want to put everything on this diagram, you could also add the two possible reconstructed points of origin along the shower axis (e.g. the "disp" method", and label "disp" as the centroid-origin distance), and also show an arrow for the skewness (asymmetry)

kosack commented 5 years ago

or we make separate diagrams for line-intersection, disp-method, etc, to avoid it being too complex.

maxnoe commented 5 years ago

Also see https://gist.github.com/MaxNoe/fb1e2042e0d8e1e38b3d2ab4c976cba2

maxnoe commented 5 years ago

hillas_2 hillas_6

kosack commented 5 years ago

I like that view - and if you add the asymmetry parameters it's even more clear that the centroid is not necessarily at the peak

kosack commented 5 years ago

I think using capital letters for "on-the-sphere" angular values, and small for "in a plane" angles could also help distinguish. So for example $\alpha$ and $\Theta$

maxnoe commented 5 years ago

Let's remove conventions as much as possible here. Just give it long proper names the code.

For graphics, that's another matter. But for code and table columns, let's just use long descriptive names

kosack commented 5 years ago

also "reconstructed" and "true" to distinguish estimated from actual values (probably that should also be used for the MC container, e.g. mc_energy is better as true_energy

Yes, agreed to use long descriptive names for the code variables. Standard Math symbols can be mentioned in the Field description if needed.