Closed HealthyPear closed 3 years ago
Ah.
It occurs to me that perhaps the problem is indeed how ctapipe.core.Container
is built, or better how its metadata class works.
I can't add an attribute to a child container, can't I? Only the fields defined in the dictionary are written at instantiation and that's it?
In protopipe I take care of this case by initializing a dummy
ctapipe.containers.HillasParametersContainer
and record it. Then I attach to each image agood_for_reco
integer variable (0 or 1) to record if this image has been used for direction reconstruction (is this also in the ctapipe DL1 format? I don't remember...if not, I propose to add something similar) and to each shower anis_valid
boolean variable to record if it has been correctly reconstructed (this is analog to ctapipe).
The stage-1 tool initializes an empty ImageParametersContainer
(which, amongst others, includes a HillasParametersContainer
) if the event can not be reconstructed based on some criteria.
https://github.com/cta-observatory/ctapipe/blob/052fe3bde981432be455c00b5dfa70a0ebb8efec/ctapipe/tools/stage1.py#L457-L459
There are no additional validity variables, but you can just discard the nan events, which would match your is_valid
. If you want to keep track of the events used for later stereoscopic analysis tasks, you can save them directly in the relevant container. For example:
https://github.com/cta-observatory/ctapipe/blob/052fe3bde981432be455c00b5dfa70a0ebb8efec/ctapipe/containers.py#L528-L561
Regarding the unit <-> container part:
I personally do not know if we even want the units in the container fields to be anything but constant, but there might be reason to avoid all sorts of conversions slowing us down. You could access the container fields after initializing the container instance to change the units that are written to the file, define new container classes or work on the main class (e.g. adding a units
dict?). Maybe @kosack has an opinion on how to do this.
@HealthyPear In you example you are mixing and confusing class definition (what is in the body after class
) with instance initialization (the __init__
method). This simply is not how containers work.
Creating local variables in __init__
does nothing to that instance. You have to assign them to the instance: self.x = 5
instead of x = 5
.
Fields are class members, so have to be declared in the class definition.
I see two possibilities here.
Change Container / Field to allow what you propose. This is not straight forward and I don't see an easy way to do this and still have most of the nice properties of containers we currently have.
Create a new container HillasParametersDegree
or similar, that has units of degree.
Then, the hillas_parameter
function returns either HillasParameters
or HillasParametersDegrees
depending on the unit of pixel_x / pixel_y
The stage-1 tool initializes an empty
ImageParametersContainer
(which, amongst others, includes aHillasParametersContainer
) if the event can not be reconstructed based on some criteria.
OK that is good
There are no additional validity variables, but you can just discard the nan events, which would match your
is_valid
.
I don't think this does the same think though, because the is_valid
variable is related to the shower, not to the image (at least in protopipe
).
You could have a correctly reconstructed shower which triggered e.g. 5 telescopes, of which only 3 returned useful images.
So what I am proposing is to add something that provides an immediate (meaning "by eye") information on whether that single image was used.
If you want to keep track of the events used for later stereoscopic analysis tasks, you can save them directly in the relevant container.
Yes, I am doing this
Regarding the unit <-> container part: I personally do not know if we even want the units in the container fields to be anything but constant, but there might be reason to avoid all sorts of conversions slowing us down. You could access the container fields after initializing the container instance to change the units that are written to the file, define new container classes or work on the main class (e.g. adding a
units
dict?). Maybe @kosack has an opinion on how to do this.
Well I do not convert units in this case (I can't in fact - meters to degrees it's not covertible). It's more of an initialization thing
@HealthyPear In you example you are mixing and confusing class definition (what is in the body after
class
) with instance initialization (the__init__
method). This simply is not how containers work. Creating local variables in__init__
does nothing to that instance. You have to assign them to the instance:self.x = 5
instead ofx = 5
.Fields are class members, so have to be declared in the class definition.
I am a bit lost, what would be "x" in my case?
I see two possibilities here.
1. Change Container / Field to allow what you propose. This is not straight forward and I don't see an easy way to do this and still have most of the nice properties of containers we currently have.
Yes, I wanted to avoid this
2. Create a new container `HillasParametersDegree` or similar, that has units of degree. Then, the `hillas_parameter` function returns either `HillasParameters` or `HillasParametersDegrees` depending on the unit of pixel_x / pixel_y
This seems a fast and at least temporary solution.
I guess we could rename the current container as HillasParametersContainerCameraFrame
and the new one HillasParametersContainerTelescopeFrame
to avoid confusion
I am a bit lost, what would be "x" in my case?
Going back to your example, I add comments:
class HillasParametersContainer(Container):
# Containers expect field declarations here, as class members
# Thus, this container does not have any fields
def __init__(self, unit=u.m, **fields):
super().__init__(**fields)
# this fails because containers allow only pre-declared fields to be set (via `__slots__`)
# and this container has no field `unit`, it does not have any field.
self.unit = unit
# the following lines only declare local variables in this method and do not affect anything
container_prefix = "hillas"
intensity = Field(nan, "total intensity (size)")
x = Field(nan * self.unit, "centroid x coordinate", unit=self.unit)
y = Field(nan * self.unit, "centroid x coordinate", unit=self.unit)
Please look into the differences between class and instance variables and how each of them are set in python.
Well I do not convert units in this case (I can't in fact - meters to degrees it's not covertible).
Astropy has something called equivalencies to handle such cases, e.g. you can convert frequencies to wavelength with the u.spectral
equivalence:
wavelength = 632 * u.nm
frequency = wavelength.to(u.Hz, equivalencies=u.spectral())
Ah ok, now I understand better.
Then yes, unit
should be a Field
in order to be recognized by the container, but it doesn't make sense.
I confirm that the most viable solution is to create a dedicated container for this.
If we agree then I would add this change directly to PR #1408 , since it is directly related to it.
It is not urgent for me (I needed to know also how to patch protopipe before the next release), so we can wait for other inputs if necessary.
Astropy has something called equivalencies to handle such cases, e.g. you can convert frequencies to wavelength with the
u.spectral
equivalence:wavelength = 632 * u.nm frequency = wavelength.to(u.Hz, equivalencies=u.spectral())
Sorry I didn't see this part!
I didn't know about this feature, indeed there seems to be something very similar (if not the same thing).
Though I thought that containers are read-only. However, even if one managed to open the container and modify the units using this feature, I think that calling the equivalence method and modifying the container would take more computing resources than just define a dedicated container and instantiating it.
Why don't we just change the HillasParameterContainer's defaults to all deg and leave it at that? I don't see any reason to support meters anymore - in the next version we will use the new HillasReconstructor that does it in deg, and we should at the same time modify ctapipe-stage1 to do the same.
All of this conversation is to find a hacky way to support a deprecated way of doing things, I think!
I agree. We should just fully make the switch to calculating it in telescope frame.
However, than we need to discuss about cog names.
These should probably then be: cog_fov_lon
, cog_fov_lat
, instead of x / y.
Also, note that the current units in the Container definition are only ever checked if the user calls hillas_parameters.validate()
, otherwise they are just metadata that are ignored, and are used as a suggestion. No automatic conversion is done or anything. The only case where they are important is that the first time the HDF5TableWriter gets an event, it will assign and check the units of the instance, so if the first event is "NaN" and you didn't explicitly set it to e.g. hillas.length=NaN*u.deg
it would set up the output in meters.
Won't the hdf5 writer fail if subsequent events have the wrong units?
Though one would have to really mess up hard to create this situation (other than a wrong default in the container, which should be fixed now everywhere).
Won't the hdf5 writer fail if subsequent events have the wrong units?
I don't think so - it only checks on the first event, and assumes all subsequent events are the same. In principle it doesn't even look at the units in the FIeld, only at what is assigned in the first event (except for in .validate())
I don't think so - it only checks on the first event, and assumes all subsequent events are the same.
No, it calls to_value(UNIT_OF_FIRST_ROW)
, which will fail.
No, it calls to_value(UNIT_OF_FIRST_ROW), which will fail.
That must have been a new addition: I didn't do it, since that is very slow to do unit conversion every time. We can remove that. The idea was that units and array shapes should never change.
I didn't do it, since that is very slow to do unit conversion every time. We can remove that. The idea was that units and array shapes should never change.
to_value
is only slow if the requested unit is not the unit of the quantity.
I think this is the correct behaviour, as otherwise we might write invalid data.
Ah yes, it adds a col_transform of convert_and_strip_unit
... I forgot. But then it's not really using the Container unit, but the unit of the first event, so I think it is still ok. But of course container.validate()
will fail, I think.
In [5]: a = u.Quantity(np.random.normal(size=10000), u.m)
In [6]: %timeit a.to_value(u.m)
633 ns ± 4.22 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
In [7]: %timeit a.to_value(u.cm)
7.62 µs ± 86.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
Actually, we could just remove the units from the HillasParametersContainer (leave them as None), and then it will all still work. Validate will then not check that the unit matches the first event, and both deg and meters will work properly, as long as you return even "blank" containers in those units (i.e. hillas.length = NaN*u.deg
or hillas.length=NaN*u.m
.)
I think that is the easiest solution: if units can be more than one thing, just don't specify them
We could also use a tuple of units if we want to check that is any of a set of allowed units.
As discussed here #1408, just making the units changeable is not enough, we also need other fields so they correspond correctly to the cordinate system axis names. closing.
Context
In the framework of the pipeline, we should clean and parametrize all calibrated images and always save all information into DL1. This means that sometimes certain cleaned images will result in a failing parametrization which should be recorded. In protopipe I take care of this case by initializing a dummy
ctapipe.containers.HillasParametersContainer
and record it. Then I attach to each image agood_for_reco
integer variable (0 or 1) to record if this image has been used for direction reconstruction (is this also in the ctapipe DL1 format? I don't remember...if not, I propose to add something similar) and to each shower anis_valid
boolean variable to record if it has been correctly reconstructed (this is analog to ctapipe).Now with the changes in PR #1408 , it is possible to clean and parametrize in the
ctapipe.coordinates.TelescopeFrame
, from which all distances in the parametrized image (pixel coordinates, width, and length) are in units of degrees.Proposal and attempted solution
What I propose is to add a default attribute to the initialization of
ctapipe.containers.HillasParametersContainer
which defaults to meters, but that can be set to degrees, akaI attempted to do this in the following way,
but it doesn't work and it returns the following error,
which I thought was the entire point of my modification :)
Maybe I am missing something deep about the parent class
ctapipe.core.Container
? I say this because a simple code snippet works, e.g.returns 1
returns "meters" and 1
returns "degrees" and 1