SiLab-Bonn / pixel_clusterizer

A fast, generic, and easy to use clusterizer to cluster hits of a pixel matrix in Python.
MIT License

why is cluster center always shifted by 0.5? #2

Closed laborleben closed 8 years ago

laborleben commented 8 years ago

I don't understand the convention of shifting the cluster center (column mean and row mean) by +0.5. Example: a single hit at col 434 / row 159 comes out of the clusterizer with the seed at 434 / 195 and the cluster mean at 434.5 / 195.5.

The discrepancy between cluster seed and cluster mean is even more confusing. It should be one or the other convention.

Example of why this is a problem: the code has to know whether the cluster mean location (+0.5 everywhere) or the seed / hit location (pixel center) is used. Otherwise binning and plotting are wrong in all places. The +0.5 pushes the hit into the next bin. This makes it hard to exchange the cluster mean location with the seed or hit location in the code.
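To illustrate the binning problem, here is a minimal sketch. The two helper functions are hypothetical and not part of pixel_clusterizer; they only assume unit-wide bins, either centered on integers or starting at integers.

```python
import math

def bin_index_centered(value):
    """Index of the unit-wide bin centered on an integer: [i - 0.5, i + 0.5)."""
    return math.floor(value + 0.5)

def bin_index_edge_aligned(value):
    """Index of the unit-wide bin starting at an integer: [i, i + 1)."""
    return math.floor(value)

seed_col = 434    # seed / hit location (integer pixel index)
mean_col = 434.5  # cluster mean of the same single hit, shifted by +0.5

# Integer-centered bins: the shifted mean lands one bin further than the seed.
print(bin_index_centered(seed_col), bin_index_centered(mean_col))  # 434 435

# Edge-aligned bins: both land in the same bin.
print(bin_index_edge_aligned(seed_col), bin_index_edge_aligned(mean_col))  # 434 434
```

So the same histogram code gives different results depending on which of the two quantities is passed in, unless it knows the convention.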

DavidLP commented 8 years ago

It is just a binning with centered bins: if e.g. pixel 0,0 is hit, the mean cluster position is pixel_pitch_x / 2 ; pixel_pitch_y / 2. I see that there is an inconsistency for the seed pixel position (the seed was never used so far ...). What do you suggest?
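As a rough sketch of that argument: with the +0.5 shift, multiplying the mean by the pitch directly gives the position measured from the chip edge. The function name and the pitch values below are assumptions for illustration only.

```python
# Assumed pitch values, for illustration only.
PITCH_X_UM = 250.0
PITCH_Y_UM = 50.0

def shifted_mean_to_position(mean_index, pitch_um):
    """Position from the chip edge for a mean given in the x.5 convention."""
    return mean_index * pitch_um

# A single hit in pixel 0,0 has the shifted mean 0.5, 0.5.
x_um = shifted_mean_to_position(0 + 0.5, PITCH_X_UM)
y_um = shifted_mean_to_position(0 + 0.5, PITCH_Y_UM)

print(x_um, y_um)  # 125.0 25.0, i.e. pitch / 2 in each direction
```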

laborleben commented 8 years ago

Just collecting some thoughts...this is again a multi layer problem.

Some chips address pixels from 0 to something, other chips from 1 to xyz. At some point we agreed on the pixel range 1 ... xyz for all chips, to keep the 0 free and assign it to a "no hit"/"out of range". The reason for this was to keep the unsigned int. Another view is to assign -1 to a "no hit". This makes it more obvious that it is a no hit.

The pixel range can follow one of two conventions: from x.5 ... x+1.5 or from x.0 ... x+1.0. The x.0 ... x+1.0 convention is possibly easier to use with metric units (e.g. chip from 0.0 um to xy um). But since all chips deliver integers for pixel addresses, I would stick to the first choice: integers for the pixel/cluster center and a range from x.5 to x+1.5.

DavidLP commented 8 years ago

Sorry, but I cannot deduce what you think is best :-).

Shall the mean values stay centered at x.5? The advantage is that you do not have to provide the pixel pitch as a parameter to convert to a position.

Shall I add 0.5 to the cluster seed positions?

By the way, the clusterizer cannot work for all devices out of the box; the data format may need slight changes first (e.g. if hit indices start at 1). One always has to consider this. E.g. in testbeam_analysis the converter is meant to do this.

laborleben commented 8 years ago

I'll create a branch and see what the best solution is.

I'd start with pixels ranging from 0 to xyz and keeping pixel address, seed and cluster centered around integers, thus a pixel goes from x.5 ... 1+x.5. A no hit can be assigned to a negative integer (e.g. -1) or "not a number" / "NAN" if floating point is used. When doing the conversion to metric units (e.g. in the telescope software) and converting integers to floating point, a "no hit" can be a "NAN".
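A minimal sketch of that conversion, assuming -1 as the "no hit" sentinel and assuming that a pixel centered on integer i covers i - 0.5 to i + 0.5. The names `NO_HIT` and `index_to_position_um` are hypothetical and not taken from the clusterizer or the telescope software.

```python
import math

NO_HIT = -1  # assumed sentinel for "no hit" in the integer domain

def index_to_position_um(index, pitch_um):
    """Center of pixel `index` in um, measured from the center of pixel 0.

    Negative indices are treated as "no hit" and become NaN.
    """
    if index < 0:
        return math.nan
    return index * pitch_um

print(index_to_position_um(2, 50.0))       # 100.0
print(index_to_position_um(NO_HIT, 50.0))  # nan
```

With integer-centered pixels, the offset of half a pitch only shows up when the position is needed relative to the chip edge instead of the center of pixel 0.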

DavidLP commented 8 years ago

The clusterizer column/row means are no longer shifted by 0.5. A cluster with 1 hit at column/row = 1, 2 will now have mean column / mean row = 1, 2.
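The new behavior can be sketched as follows. This assumes a plain unweighted arithmetic mean over the hit indices; `cluster_mean` is a hypothetical helper, not the clusterizer's actual implementation, which may weight hits differently.

```python
def cluster_mean(hits):
    """Unweighted mean column / mean row of a list of (column, row) hits."""
    n = len(hits)
    mean_col = sum(col for col, _ in hits) / n
    mean_row = sum(row for _, row in hits) / n
    return mean_col, mean_row

# A single hit: the mean coincides with the hit (and seed) position.
print(cluster_mean([(1, 2)]))          # (1.0, 2.0)

# Two hits in neighboring columns: the mean lies halfway between them.
print(cluster_mean([(1, 2), (2, 2)]))  # (1.5, 2.0)
```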