asalzburger / sms2021-tra-tra

Repository for SummerStudent 2021 project to learn a (conformal) TRAnsform for TRAcks

Duplicate Removal #26

Open AndrewSpano opened 3 years ago

AndrewSpano commented 3 years ago

After doing some testing, I found the following patterns:

asalzburger commented 3 years ago

Do you have some plots for the purity?

AndrewSpano commented 3 years ago

Purity vs count (how many tracks had a purity falling in each of the ranges 0 - 0.1, 0.1 - 0.2, etc.)

purity-vs-count-plot
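A minimal sketch of how such a purity-vs-count histogram could be produced. It assumes that purity means the fraction of a track candidate's hits that come from its most common truth particle, and that each candidate is given as a list of truth particle IDs, one per hit. Both the definition and the function names `track_purity` and `purity_counts` are illustrative assumptions, not taken from the repository.

```python
from collections import Counter

import numpy as np


def track_purity(hit_particle_ids):
    """Fraction of hits that belong to the candidate's most common truth particle.

    Assumed definition of purity; the project may define it differently.
    """
    if not hit_particle_ids:
        return 0.0
    majority_count = Counter(hit_particle_ids).most_common(1)[0][1]
    return majority_count / len(hit_particle_ids)


def purity_counts(tracks, n_bins=10):
    """Count how many candidates fall in each purity range (0-0.1, 0.1-0.2, ...)."""
    purities = [track_purity(track) for track in tracks]
    counts, edges = np.histogram(purities, bins=n_bins, range=(0.0, 1.0))
    return counts, edges


# Hypothetical example: three candidates, each a list of truth particle IDs.
example_tracks = [[1, 1, 1, 1], [1, 1, 1, 2], [1, 2, 3]]
example_counts, example_edges = purity_counts(example_tracks)
```

The returned `counts` array can be passed straight to a bar plot, with `edges` giving the range boundaries.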

AndrewSpano commented 3 years ago

Regarding the "deterministic" approach: while I was on the plane to Greece, I had a very stupid idea. For every selected x-y bin, run the r-z Hough Transform on only the hits inside that bin. This helps purify the hits. The idea was inspired by this plot:

rz-hits
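The per-bin purification idea can be sketched roughly as follows. This is only an illustration under stated assumptions: it models a track in the r-z plane as a straight line `z = z0 + slope * r`, votes in a `(slope, z0)` accumulator, and keeps the hits that voted for the most populated cell. The parameterization, the function name `purify_bin_hits`, and the default ranges are all hypothetical and may differ from what the project actually uses.

```python
import numpy as np


def purify_bin_hits(r, z, slopes, z0_range=(-50.0, 50.0), n_z0_bins=100):
    """Return a boolean mask of the hits voting for the best (slope, z0) cell.

    r, z   : coordinates of the hits inside ONE selected x-y bin
    slopes : candidate slopes dz/dr to scan (assumed straight-line model)
    """
    r = np.asarray(r, dtype=float)
    z = np.asarray(z, dtype=float)
    slopes = np.asarray(slopes, dtype=float)

    # z0[i, j] is the intercept implied by hit i for candidate slope j.
    z0 = z[:, None] - r[:, None] * slopes[None, :]

    z0_min, z0_max = z0_range
    bin_width = (z0_max - z0_min) / n_z0_bins
    z0_index = np.floor((z0 - z0_min) / bin_width).astype(int)
    inside = (z0_index >= 0) & (z0_index < n_z0_bins)

    # Fill the accumulator: one vote per (hit, slope) pair inside the z0 range.
    accumulator = np.zeros((len(slopes), n_z0_bins), dtype=int)
    hit_i, slope_j = np.nonzero(inside)
    np.add.at(accumulator, (slope_j, z0_index[hit_i, slope_j]), 1)

    # Keep only the hits that voted for the most populated cell.
    best_slope, best_z0 = np.unravel_index(np.argmax(accumulator), accumulator.shape)
    return inside[:, best_slope] & (z0_index[:, best_slope] == best_z0)


# Hypothetical x-y bin: four hits on z = 2 + 1.5 * r plus two unrelated hits.
r_hits = [30.0, 70.0, 110.0, 170.0, 50.0, 90.0]
z_hits = [47.0, 107.0, 167.0, 257.0, -200.0, 300.0]
mask = purify_bin_hits(r_hits, z_hits, slopes=np.linspace(-3.0, 3.0, 13))
```

In this toy example the mask selects the four collinear hits and drops the other two, which is the purification effect described above.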

The result was good:

purified-rz-hits

The Purity vs count plot now looks like this:

purified-purity

The performance (for this one event) can be assessed by the following metrics:

  1. Just purification

    purified-metrics

  2. Purification + duplicate-removal-1 algorithm

    purified-dup-removal1-metrics

  3. Purification + duplicate-removal-2 algorithm

    purified-dup-removal2-metrics

I then ran it on all the events in the with-material and non-homogenous-magnetic-field dataset. The results I got were pretty surprising:

average-metrics

On later analysis, I saw that for every event at most 1 or 2 particles are not identified. This could be due to the approximation error. Either way, for more than half the events the efficiency is 1.0, which should be good enough. I will postpone the Neural Network development, as this approach is already yielding very good results.
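For readers unfamiliar with the efficiency figure, here is a minimal sketch of how such per-event metrics are commonly computed. It assumes that every reconstructed track has already been matched to a truth particle ID (or `None` when no match exists), and it uses common textbook-style definitions of efficiency, duplicate rate and fake rate. The exact definitions behind the metrics in this thread are not stated, so the function `tracking_metrics` and its formulas are assumptions.

```python
from collections import Counter


def tracking_metrics(matched_particle_ids, truth_particle_ids):
    """Per-event efficiency, duplicate rate and fake rate (assumed definitions).

    matched_particle_ids : one entry per reconstructed track, either the truth
                           particle it was matched to or None for no match
    truth_particle_ids   : all truth particles that should have been found
    """
    truth = set(truth_particle_ids)
    n_tracks = len(matched_particle_ids)

    # Count how many reconstructed tracks point at each truth particle.
    matches = Counter(pid for pid in matched_particle_ids if pid in truth)
    n_found = len(matches)                  # distinct particles reconstructed
    n_matched_tracks = sum(matches.values())

    efficiency = n_found / len(truth) if truth else 0.0
    duplicate_rate = (n_matched_tracks - n_found) / n_tracks if n_tracks else 0.0
    fake_rate = (n_tracks - n_matched_tracks) / n_tracks if n_tracks else 0.0
    return {
        "efficiency": efficiency,
        "duplicate_rate": duplicate_rate,
        "fake_rate": fake_rate,
    }


# Hypothetical event: particle 1 is found twice, particle 2 once, one fake track.
metrics = tracking_metrics([1, 1, 2, None], truth_particle_ids=[1, 2, 3, 4])
```

Under these definitions, an efficiency of 1.0 means every truth particle in the event was matched by at least one reconstructed track.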