oli-obk / prettydiff

Side-by-side diff for two files
MIT License
33 stars 15 forks source link

slow diff_chars #21

Open chenyan2002 opened 1 month ago

chenyan2002 commented 1 month ago

Diffing a large text, e.g., 27k of text, diff_chars seems to take a very long time, whereas diff_lines finishes instantly. Is there a way to speed up the diff_chars?

oli-obk commented 1 month ago

Just to be sure: were you running in release mode?

Did you compare processing the resulting iterator, or did you just time the initial function call?

romankoblov commented 1 month ago

LCS has O(N^2) complexity, so for chars it would be O(27k^2), but there is significantly less lines.

chenyan2002 commented 1 month ago

Running in release mode, and time the initial function call with the same content: let diff = prettydiff::diff_chars(&content, &content); println!("{diff}");. diff_chars takes 5s, while diff_lines takes 0.2s.

Yeah, if diff_chars is using LCS, that's expected. If we assume the diff is small, not sure if there is a more efficient algorithm.

oli-obk commented 1 month ago

So... is there anything actionable here? Would you have benefited from some documentation somewhere about the algorithms being used?

chenyan2002 commented 1 month ago

The only action item I can think of is to document diff_chars that it's a O(n^2) algorithm. People may blindly use that function to process large text without realizing its time complexity.