Open briochemc opened 6 years ago
I have thought about this for a long time, but it is quite difficult to do in a useful way: unlike in Word, there is no access to the editing process (which can sometimes be a good thing). If, for example, a whole paragraph was moved and then one or two words were changed, it should still appear as a moved paragraph with some edits.
So for now probably not feasible, unfortunately.
It could work on a per-paragraph basis, trying to find 1:1 mappings of the closest corresponding paragraphs and calculating the differences between them.
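To make the per-paragraph idea concrete, here is a rough sketch in Python rather than Perl, so it is not something that drops into latexdiff. It uses the standard library's `difflib` as the similarity measure. The function names, the greedy matching strategy, and the 0.6 threshold are all my own assumptions for illustration.

```python
import difflib


def split_paragraphs(text):
    """Split plain text into paragraphs on blank lines."""
    return [p.strip() for p in text.split("\n\n") if p.strip()]


def match_paragraphs(old_paras, new_paras, threshold=0.6):
    """Greedily build a 1:1 mapping {old_index: new_index}.

    Pairs are scored by word-level similarity; the best-scoring pairs
    are claimed first, and pairs below `threshold` (an arbitrary,
    assumed cut-off) are never matched.
    """
    scored = []
    for i, old in enumerate(old_paras):
        for j, new in enumerate(new_paras):
            ratio = difflib.SequenceMatcher(None, old.split(), new.split()).ratio()
            if ratio >= threshold:
                scored.append((ratio, i, j))
    # Highest similarity first; ties broken by position.
    scored.sort(key=lambda t: (-t[0], t[1], t[2]))

    pairs, used_old, used_new = {}, set(), set()
    for _ratio, i, j in scored:
        if i not in used_old and j not in used_new:
            pairs[i] = j
            used_old.add(i)
            used_new.add(j)
    return pairs


def word_changes(old, new):
    """Return the non-equal word-level edits between two matched paragraphs."""
    a, b = old.split(), new.split()
    matcher = difflib.SequenceMatcher(None, a, b)
    return [
        (tag, a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]
```

A paragraph whose mapped index differs from its original position can then be reported as moved, with `word_changes` supplying the edits inside it. Greedy matching is only a heuristic; an optimal assignment would need something like the Hungarian algorithm.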
Thanks for the suggestion. It is still not quick to do in practice (or do you know of an algorithm implemented in perl that does fuzzy differencing of tokenized text?). I have another idea for how one could 'fake' such functionality by looking for exact matches between added and deleted blocks of a certain length, which would probably work in many instances, but even implementing this requires changing several parts of the very core of latexdiff. So it is not something I will undertake any time soon.
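For illustration only, the exact-match idea could look roughly like the sketch below. It is written in Python with the standard `difflib` module, not in Perl, and it does not reflect how latexdiff works internally. The function name `find_moved_blocks` and the `min_len` parameter are hypothetical.

```python
import difflib


def find_moved_blocks(old_tokens, new_tokens, min_len=5):
    """Report deleted blocks that reappear verbatim as inserted blocks.

    Returns a list of (old_start, new_start, length) tuples. Only blocks
    of at least `min_len` tokens are considered, to avoid treating short
    coincidental matches as moves.
    """
    matcher = difflib.SequenceMatcher(None, old_tokens, new_tokens, autojunk=False)
    deleted, inserted = [], []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            deleted.append((i1, old_tokens[i1:i2]))
        if tag in ("insert", "replace"):
            inserted.append((j1, new_tokens[j1:j2]))

    moves, used = [], set()
    for old_start, del_block in deleted:
        if len(del_block) < min_len:
            continue
        for k, (new_start, ins_block) in enumerate(inserted):
            # Each inserted block can be claimed by one deleted block only.
            if k not in used and ins_block == del_block:
                moves.append((old_start, new_start, len(del_block)))
                used.add(k)
                break
    return moves
```

This only catches blocks that were moved without any change, and only when the diff happens to report the block as a single deletion and a single insertion, so it would miss a moved paragraph that was also edited.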
do you know of an algorithm implemented in perl that does fuzzy differencing of tokenized text?
Sorry, I looked into Perl once in 2009 and decided to learn Python instead.
Doing what you said won’t be any more or less fake than what any diff tool does; they are all heuristics by necessity.
Thanks for leaving these hints. It looks like a promising approach but would replace the current diffing algorithm (at least optionally) and thus require quite a lot of coding to implement within the latexdiff context.
Of course! I did not mean to imply it makes it any easier. 🙇
Is there a way to tell latexdiff to figure out when whole sections are moved around?