ewengillies / track-finding-yandex

COMET Tracking : Machine Learning Approaches

Getting the Stereometry Correct #5

Open ewengillies opened 9 years ago

ewengillies commented 9 years ago

The tracks are often offset between even and odd layers due to the stereometry of the wire array. This poses two problems:

  1. The reweighted hough transform normally fits to one layer only. The output of the GBDT can be seen here, where the blue and red circles are the signal and background respectively, sized by their GBDT output value.

algorithm_ex_1

The effect of transforming, reweighting and inverting can be seen here, where the orange dots are the reweighted transform, the green circle is a visualization of the inverse hough transform to wire space, and the blue and red circles are the signal and background respectively, sized by their inverse hough transform output. The full set of signal hits is also shown as empty blue circles.

algorithm_ex_3

What is clear here is that the inverse hough transform excels at cutting out noise that is far away from the track, which is half the battle. The other half is picking up missed signal, which implies that the suppression seen in even layers may be harmful. Noting that the hough transform is optimized against mislabelled events, we should decide which is more important, filtering out false positives or filtering out false negatives, and weight the cost accordingly.

  2. We cannot recover any information about neighbours in adjacent layers. While such information is most likely highly correlated with the LR neighbour features, we tend to pick up false positives from background hits that are to the left or right of signal hits. Using adjacent layers could cut this down to a much smaller set than "all hits to the left or right of a signal." The obvious answer here is to skip a layer and compare each even layer to the next even layer, but by that point the particle has moved significantly in theta. If we then compare points offset significantly in theta between every other layer, we are moving into the hough transform territory of long-range structure. It could be really useful to find a way to use adjacent layer information more effectively.
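
For reference, the transform, reweight and invert steps described under the first problem can be sketched roughly as below. This is a minimal sketch under stated assumptions, not the implementation in this repository: it assumes a single fixed track radius, a regular grid of candidate track centres, a hard distance tolerance `tol`, and a power-law reweighting. The names `correspondence`, `hough_reweight` and `demo`, and all of the numbers in `demo`, are hypothetical.

```python
import numpy as np


def correspondence(hits, centres, radius, tol):
    """A[i, j] = 1 if hit i lies within tol of the circle of the given
    radius around candidate centre j (an assumed, very simple kernel)."""
    d = np.linalg.norm(hits[:, None, :] - centres[None, :, :], axis=2)
    return (np.abs(d - radius) < tol).astype(float)


def hough_reweight(hits, gbdt_out, centres, radius, tol=0.5, power=4.0):
    A = correspondence(hits, centres, radius, tol)
    # Forward transform: each hit votes for compatible centres,
    # weighted by its GBDT output.
    votes = A.T @ gbdt_out
    if votes.max() > 0:
        votes = votes / votes.max()
    # Reweight: sharpen the vote map so the strongest centres dominate
    # (the power law is an assumption made for this sketch).
    sharpened = votes ** power
    # Inverse transform: each hit collects the weight of every centre
    # whose circle passes through it.
    scores = A @ sharpened
    if scores.max() > 0:
        scores = scores / scores.max()
    return votes, scores


def demo():
    # Eight toy signal hits on a circle of radius 5 centred at (3, 0),
    # plus one toy background hit far from any candidate circle.
    angles = np.linspace(0.0, 2.0 * np.pi, 8, endpoint=False)
    signal = np.column_stack([3.0 + 5.0 * np.cos(angles), 5.0 * np.sin(angles)])
    noise = np.array([[40.0, 40.0]])
    hits = np.vstack([signal, noise])
    gbdt_out = np.array([0.9] * 8 + [0.6])  # made-up GBDT outputs
    gx, gy = np.meshgrid(np.arange(-6.0, 7.0), np.arange(-6.0, 7.0))
    centres = np.column_stack([gx.ravel(), gy.ravel()])
    return hough_reweight(hits, gbdt_out, centres, radius=5.0)
```

In this toy setup the far-away background hit ends up with zero score even though its GBDT output is fairly high, which is the noise-cutting behaviour noted above. By construction, a hit displaced by more than `tol` from the best circle, as a hit in the other layer parity might be, receives no contribution from that circle, which is the suppression discussed under the first problem.
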
ewengillies commented 9 years ago

Perhaps if we complete #6 first, we can complete the final algorithm without the hough transform element, and reweight the points according to this output, instead of the output of the GBDT alone.