AdiOmari closed this issue 7 years ago
@zvili Zvi, can you please resend Roded (@rodedzats) an invitation to join the project and assign him to our team?
Done
Just to mention the obvious "editing distance"
Here's a list of implementations of the Levenshtein distance (editing distance)
If you search, you will find one in the spartanizer, called LCS
Added a weighted line distance metric and a Jaccard distance metric. For now, we decided not to use the edit distance, as we realized that in its basic form it is not a good fit for ST comparison.
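For reference, a minimal sketch of the Jaccard distance applied to stack traces. It assumes each trace is a list of frame strings and treats it as a set, so frame order and repetition are ignored. The function name and the sample frames are hypothetical and are not taken from the project code.

```python
def jaccard_distance(trace_a, trace_b):
    """Jaccard distance between two stack traces given as lists of frame strings."""
    frames_a, frames_b = set(trace_a), set(trace_b)
    union = frames_a | frames_b
    if not union:
        return 0.0  # assumption: two empty traces count as identical
    return 1.0 - len(frames_a & frames_b) / len(union)


# Hypothetical traces: 2 shared frames out of 4 distinct frames overall.
a = ["Foo.bar", "Foo.baz", "Main.main"]
b = ["Foo.bar", "Qux.run", "Main.main"]
print(jaccard_distance(a, b))
```

Since the set view discards positions, this metric on its own cannot tell the head of the stack from its bottom.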
Makes sense. Note that the head of the stack is more important than its bottom, at least for some applications.
@yossigil You are right. The implementation in the WeightLinesSTDistancer class gives more weight to lines at the head of the stack.
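A rough illustration of the head-weighting idea, not the actual WeightLinesSTDistancer code. It assumes traces are lists of lines with index 0 at the head of the stack, compares them position by position, and uses a weight of 1/(i+1) for position i. Both the weighting scheme and the function name are assumptions made for this sketch.

```python
def weighted_line_distance(trace_a, trace_b):
    """Sum of position weights over the lines where the two traces differ.

    Index 0 is the head of the stack. The weight 1/(i+1) is an assumed
    choice that makes mismatches near the head cost more.
    """
    length = max(len(trace_a), len(trace_b))
    distance = 0.0
    for i in range(length):
        # A missing line (shorter trace) counts as a mismatch.
        line_a = trace_a[i] if i < len(trace_a) else None
        line_b = trace_b[i] if i < len(trace_b) else None
        if line_a != line_b:
            distance += 1.0 / (i + 1)
    return distance


base = ["A", "B", "C"]
head_differs = weighted_line_distance(["X", "B", "C"], base)
bottom_differs = weighted_line_distance(["A", "B", "X"], base)
print(head_differs > bottom_differs)
```

With these weights, a single mismatch at the head costs three times as much as a single mismatch at depth three.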
During our meeting, we decided to continue working on the stack-trace (ST) comparison algorithm. We need to further improve the algorithm, starting by normalizing the distance (as suggested by Yossi). We also need to implement a sorting algorithm that sorts according to the ST distance.
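One possible way to do both steps, as a sketch under assumptions rather than the agreed design: divide a head-weighted mismatch score by the largest value it can take, so the result lies in [0, 1], and then sort candidate traces by their distance to a reference trace. The 1/(i+1) weights, the function names, and the sample traces are all hypothetical.

```python
def normalized_distance(trace_a, trace_b):
    """Head-weighted mismatch score scaled to [0, 1] (0 = identical)."""
    length = max(len(trace_a), len(trace_b))
    if length == 0:
        return 0.0  # assumption: two empty traces count as identical
    weights = [1.0 / (i + 1) for i in range(length)]
    mismatch = sum(
        w for i, w in enumerate(weights)
        if i >= len(trace_a) or i >= len(trace_b) or trace_a[i] != trace_b[i]
    )
    # The sum of all weights is the score of two traces that differ everywhere.
    return mismatch / sum(weights)


def sort_by_distance(reference, traces):
    """Return the traces ordered from closest to farthest from the reference."""
    return sorted(traces, key=lambda t: normalized_distance(reference, t))


reference = ["A", "B", "C"]
candidates = [["X", "Y", "Z"], ["A", "B", "Z"], ["A", "B", "C"]]
for trace in sort_by_distance(reference, candidates):
    print(trace, normalized_distance(reference, trace))
```

Normalizing this way makes distances comparable across traces of different depths, which matters once they are used as a sort key.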
@yonzarecki, please note that the distance can be approximated in sublinear time.
I am not convinced at all that it is a good idea to start with the metric. I would implement a crude algorithm and then see what's best.
We need to check the effectiveness on actual data, so the milestone is delayed to the next weekly meeting (the 5th).
:+1:
Closing the issue for now. Will re-open it when needed.
Better to start new issues than close/open
Hi,
In this issue we will document the algorithm we use to compare two stack traces and calculate their similarity.
Please make sure you list here all the main points we agree on during our meeting.