cessen / ropey

A utf8 text rope for manipulating and editing large texts.
MIT License
1.04k stars 46 forks source link

Performance issue with the redundancy operations #59

Closed yyzdtccjdtc closed 2 years ago

yyzdtccjdtc commented 2 years ago

https://github.com/cessen/ropey/blob/d9c841b2ef4ae7ea97c13a99fc26b36ae14ebe12/examples/search_and_replace.rs#L156

For the next function here, by change the linear search to the Boyer-Moore string search algorithm and then it can skip some unnecessary char comparison, then reduce the memory operations and improve the performance.

According to my test it will get a 96.5% speedup (12.2s to 6.22s).

Here's the modified code:

fn next(&mut self) -> Option<(usize, usize)> {
        let search_char = self.search_pattern.chars();
        while let Some(next_char) = self.char_iter.next() {//char_iter redundancy
            self.cur_index += 1;
            if let Some(last_char) = self.char_iter.nth(self.search_pattern_char_len - 2){
                if last_char != search_char.clone().last().unwrap(){
                    let n_place = self.search_pattern.rfind(last_char);
                    if n_place == None{// if the last chat is not matched and it is not in the search_pattern, the we can skip the whole length
                        // self.possible_matches.clear();
                        for n in 0..(self.search_pattern_char_len - 1){
                            self.char_iter.next();
                            self.cur_index += 1;
                        }
                        continue;
                    }else{// if the last char is in the search_pattern, we can skip some of the letters.
                        for n in 0..(self.search_pattern_char_len - n_place.unwrap() - 1){
                            self.char_iter.next();
                            self.cur_index += 1;
                        }
                        continue;
                    }
                }
                // It is matched with the last char, so we start to compare them
                // first match the first char, if not match, go to next loop

                if next_char != search_char.clone().next().unwrap(){
                    continue;
                }
                let mut if_match = true;
                for i in 0..(self.search_pattern_char_len - 3){
                    let next_last_char = self.char_iter.nth(self.search_pattern_char_len - 3 - i);
                    if next_last_char != search_char.clone().nth(self.search_pattern_char_len - 2 - i){
                            // self.char_iter.next();
                            // self.cur_index += 1;
                            if_match = false;
                            break;
                    }
                }
                if if_match{
                    let char_match_range = (
                        self.cur_index - 1,
                        self.cur_index + self.search_pattern_char_len - 1,
                    );
                    return Some(char_match_range);
                }else{
                    continue;
                }
            } else {
                break;
            }

        }

        None
    }

Hope this information helps!

cessen commented 2 years ago

Thanks for the info!

I'm already aware of Boyer-Moore, but intentionally used naive string search instead to keep things simple. The example isn't meant to demonstrate search algorithms, but rather is meant to demonstrate how you would integrate any search algorithm (be it Boyer-Moore, a regex engine, or whatever) into an efficient search-and-replace function for Ropey.