asalzburger / sms2021-tra-tra

Repository for SummerStudent 2021 project to learn a (conformal) TRAnsform for TRAcks

First (straight line) hough transform #5

Open asalzburger opened 3 years ago

asalzburger commented 3 years ago

Implement a first straight-line Hough transform, e.g. based on the scikit-image example:

https://scikit-image.org/docs/0.3/auto_examples/plot_hough_transform.html
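As a starting point, here is a minimal NumPy sketch of a straight-line Hough transform applied directly to hit coordinates. It is not the scikit-image implementation and not the code in this repository; the function name `hough_line_accumulator` and the binning parameters `n_theta` and `n_rho` are assumptions made for illustration. Each hit votes for every line `rho = x*cos(theta) + y*sin(theta)` that passes through it.

```python
import numpy as np

def hough_line_accumulator(xs, ys, n_theta=180, n_rho=200):
    """Vote in (theta, rho) space using rho = x*cos(theta) + y*sin(theta)."""
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    thetas = np.linspace(-np.pi / 2, np.pi / 2, n_theta, endpoint=False)
    # |rho| can never exceed the distance of the farthest hit from the origin.
    rho_max = np.hypot(xs, ys).max()
    rho_edges = np.linspace(-rho_max, rho_max, n_rho + 1)
    # rho for every (hit, theta) pair, shape (n_hits, n_theta).
    rhos = np.outer(xs, np.cos(thetas)) + np.outer(ys, np.sin(thetas))
    rho_idx = np.clip(np.digitize(rhos, rho_edges) - 1, 0, n_rho - 1)
    theta_idx = np.broadcast_to(np.arange(n_theta), rho_idx.shape)
    accumulator = np.zeros((n_theta, n_rho), dtype=int)
    # np.add.at accumulates correctly even when index pairs repeat.
    np.add.at(accumulator, (theta_idx, rho_idx), 1)
    return accumulator, thetas, rho_edges
```

Bins with many votes correspond to line candidates; the hits that voted for a bin can be book-kept separately if the track's hit list is needed afterwards.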

Further refinements:

(1) Truth based efficiency function

Given a set of found hits [f], look up how many of them actually come from the same particle using the truth association. Every hit in the production file carries an identifier of the particle that produced it.
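One possible way to express this lookup, as a sketch only: take the truth particle identifiers of the hits in a found track and report the most frequent particle together with the fraction of hits it accounts for. The function name `majority_particle_fraction` and the plain-list input are assumptions; the actual layout of the production file is not shown in this thread.

```python
from collections import Counter

def majority_particle_fraction(track_particle_ids):
    """Return (particle_id, fraction) for the particle that contributes the
    most hits to a found track; (None, 0.0) for an empty track."""
    if not track_particle_ids:
        return None, 0.0
    particle_id, n_matched = Counter(track_particle_ids).most_common(1)[0]
    return particle_id, n_matched / len(track_particle_ids)

# Example: three of the four hits belong to particle 7.
example = majority_particle_fraction([7, 7, 7, 3])
```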

Updates:

(2) So far, you are taking the 25 best tracks per event, because you know (falsely) that there are 25 particles produced.

To overcome this, we use selection criteria without looking at truth information:

AndrewSpano commented 3 years ago

Tasks 4 & 5

Tasks 4 and 5 are linked (actually 4 can't be done without doing 5 first). I uploaded the code in src/utils/metrics.py (in the new pull request). I don't think it makes much sense to post code here, so I will post the results:



ToDo: As mentioned yesterday, the number of different (truth) particles in an event is unknown. Therefore, it is wrong to sample the "top 25" tracks, since in real-life scenarios we will not have this information. To fix this, I thought of picking all tracks whose corresponding bins contain at least a minimum number of book-kept hits. This will probably give us more tracks than needed, but at least we won't undershoot. Maybe discuss this in the next meeting.
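The threshold idea could look roughly like the sketch below. The name `select_bins` and the parameter `min_hits` are hypothetical, and the accumulator is assumed to be a 2D NumPy array of per-bin hit counts.

```python
import numpy as np

def select_bins(accumulator, min_hits):
    """Return the (row, col) indices of all bins with at least min_hits hits."""
    return np.argwhere(accumulator >= min_hits)

# Toy accumulator: two bins pass a threshold of 10 hits.
toy_accumulator = np.array([[0, 12],
                            [9, 10]])
candidates = select_bins(toy_accumulator, min_hits=10)
```

Neighbouring bins that pass the threshold may describe the same track, which is one reason this selection can return more candidates than there are particles.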

asalzburger commented 3 years ago

Let's call the per-track efficiency rate the matching probability, and use efficiency only for the ensemble of tracks.
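To make the distinction concrete, here is a sketch of an ensemble-level efficiency built from per-track matching probabilities. The definition used is an assumption, not something fixed in this thread: a truth particle counts as found if at least one track is matched to it with a matching probability of at least `min_probability`. The function name and the `(particle_id, matching_probability)` input format are hypothetical as well.

```python
def ensemble_efficiency(track_matches, truth_particle_ids, min_probability=0.5):
    """Fraction of truth particles matched by at least one found track.

    track_matches: iterable of (particle_id, matching_probability), one per track.
    truth_particle_ids: iterable of all truth particle identifiers in the event.
    """
    truth = set(truth_particle_ids)
    if not truth:
        return 0.0
    found = {
        pid for pid, prob in track_matches
        if prob >= min_probability and pid in truth
    }
    return len(found) / len(truth)

# Particles 1 and 3 are found; 2 is matched too weakly; 4 has no track.
eff = ensemble_efficiency([(1, 0.9), (2, 0.4), (1, 0.8), (3, 0.7)], [1, 2, 3, 4])
```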

AndrewSpano commented 3 years ago

Updates tasks:

noemina commented 3 years ago

That's good!! Can you try to put the matching probability on the x axis? Something like this for example ;) image

asalzburger commented 3 years ago

Exactly, there we can see how we can make a cut.

asalzburger commented 3 years ago

Screenshot 2021-07-12 at 10 18 45

asalzburger commented 3 years ago

This is purely in the (x/y)-plane; you ignore that the hit and the track have a longitudinal (z) component.

That's justified for this example, because the magnetic field is constant along the z-axis, so the helix is a circle in the transverse plane.

AndrewSpano commented 3 years ago

Regarding the helical (circular when projected in 2d space) Hough transform:

I started by tackling the one-particle files. Let's take for example this one:

one_p_xy

Since in real-life scenarios we will not know the momentum of the particle behind any given hit, I tried to solve this fitting problem without using any extra information. The main idea I came up with (after consulting with Noemi) is to use the fact that the circular track must always pass through the origin (0, 0). Finding a circle requires finding 3 variables (a radius and the two coordinates of the center), and this constraint reduces the problem to finding 2 variables that are linearly related. This means that we can apply the methods from the previous task. For a more detailed explanation of the math:

explanation
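For readers without access to the image, one way to see the linear relation (not necessarily the parametrization used in the figure) is the following. A circle with center (a, b) that passes through the origin has r^2 = a^2 + b^2, so a hit (x, y) on it satisfies x^2 + y^2 = 2*a*x + 2*b*y. For a fixed hit this is a straight line in (a, b) space. The sketch below uses the hypothetical helper `center_line` and a toy circle to check that all hits agree on the true center.

```python
import numpy as np

def center_line(x, y, a_values):
    """For a hit (x, y), return b(a) such that the circle centred at (a, b)
    passes through both the origin and the hit:
        x^2 + y^2 = 2*a*x + 2*b*y
    Assumes y != 0; hits with y close to 0 need separate handling."""
    return (x**2 + y**2 - 2.0 * a_values * x) / (2.0 * y)

# Toy circle through the origin with centre (3, 4), hence radius 5.
a_true, b_true = 3.0, 4.0
radius = np.hypot(a_true, b_true)
angles = np.array([0.3, 0.8, 1.4, 2.0])
hits_x = a_true + radius * np.cos(angles)
hits_y = b_true + radius * np.sin(angles)

# Every hit's line passes through the true centre: b(a_true) equals b_true.
b_at_true_a = center_line(hits_x, hits_y, a_true)
```

Because each hit contributes a line in (a, b) space, the same accumulator-and-vote approach as for straight lines can be reused, with the crossing point giving the circle center.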

The lines in the Hough Space look like this:

one_p_tracks

For the selection of the candidate bins, I selected those that had at least 10-11 hits, since previous analysis showed that there are on average 14 hits per particle. The result I got is:

one_p_fit

I tried to estimate the emission angle from the circle as the angle between the tangent line at the origin and the x-axis:

one_p_phi

The ground truth value I got (which is computed using: phi = arctan2(py, px)) is slightly different from the estimated one:

fail_phi

I might have to look for bugs in the code.
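For cross-checking, here is a sketch of how the tangent angle at the origin can be derived from a fitted circle center. The function name `phi_from_circle` and the use of a reference hit to fix the direction are assumptions, and the sign choice is only valid if the reference hit lies within the first half turn of the track. The radius from the center (a, b) to the origin is perpendicular to the tangent, so the tangent direction is (-b, a) or (b, -a).

```python
import numpy as np

def phi_from_circle(a, b, ref_x, ref_y):
    """Angle of the tangent at the origin for a circle centred at (a, b) that
    passes through the origin, oriented towards the reference hit."""
    tx, ty = -b, a                     # perpendicular to the radius (a, b)
    if tx * ref_x + ty * ref_y < 0:    # flip if pointing away from the hit
        tx, ty = -tx, -ty
    return np.arctan2(ty, tx)

# Circle centred at (0, 5): the tangent at the origin lies along the x-axis.
# The hit (3, 1) is on that circle, so the track leaves towards +x.
phi_est = phi_from_circle(0.0, 5.0, 3.0, 1.0)
```

The result can then be compared with the truth value `arctan2(py, px)`, keeping in mind that both angles live in (-pi, pi] and may need wrapping before taking a difference.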

Now regarding the many-particle files, I picked this one at random:

many_p_xy

The lines in the Hough Space look like this:

many_p_tracks

Running the algorithm yields the following results:

many_p_fit

By assessing the estimated helical tracks we get the following results:

results

which look kind of promising.

There are some notes I have to make:

AndrewSpano commented 3 years ago
noemina commented 3 years ago

Effect of binning on the evaluation of the crossing point. Screenshot_20210720_093902

asalzburger commented 3 years ago
AndrewSpano commented 3 years ago

Indeed, I had to check the initial*.csv file for the tangent at the origin:

finally_works

asalzburger commented 3 years ago

That's reassuring.