Performance improvement

purpleKarrot commented 11 years ago

The current algorithm to find a match rule is quite slow. It loops linearly through all match rules and checks a) the minimum revision, b) the maximum revision, and c) the prefix. The first rule that matches will be chosen, hence the order is relevant.

Searching for the longest prefix instead of the first prefix can improve both performance (eg by using a radix tree) and usability (repositories.txt may list repositories in any order (currently graph_parallel must come before graph)).

dabrahams commented 11 years ago

I got a start on this in d4fcdb2, but it needs to be debugged and then hooked up to the rest of the program. I think it's pretty close. It would be great if you could carry it forward, Daniel.

dabrahams commented 11 years ago

Bugs are fixed now, I think

dabrahams commented 11 years ago

OK, there's a patricia trie implementation there now, but now I realize that's not enough because it doesn't deal with minimum and maximum revisions. I think augmenting the patricia trie with revision ranges might be possible, but I'm out of time to think about it right now.

ryppl / Boost2Git

Performance improvement #13