oli-obk / prettydiff

Side-by-side diff for two files
MIT License

Formatting problems. #1

Closed Pfeil closed 5 years ago

Pfeil commented 5 years ago

Hi :) Nice crate, pretty much what I was searching for. But I have a problem: while the algorithm seems to work, I get formatting issues on more complex examples, especially if I add or remove more than two words. There has to be at least one more factor, though. I'm analyzing some web texts, so the snippets are shortened and may seem weird ;) In this case, I "transcribed" the text myself, just to make sure there is no encoding problem or something like that.

    use prettydiff::diff_words;

    // Two words inserted in the middle of the sentence.
    println!(
        "diff_words: {}",
        diff_words(
            "und meine Unschuld beweisen!",
            "und ich werde meine Unschuld beweisen!"
        )
    );
    println!(
        "diff_words: {}",
        diff_words(
            "Campaignings aus dem Ausland gegen meine Person ausfindig",
            "Campaignings ausfindig"
        )
    );

Is this a bug or am I doing something wrong? There is also a case with a two-word insertion where everything is good (well, I think a space insertion there is wrong, so I'm adding it as a third example).

    println!(
        "diff_words: {}",
        diff_words(
            "des kriminellen Videos",
            "des kriminell erstellten Videos"
        )
    );

I'm using Ubuntu Linux and tried with fish, bash, and two different terminal emulators. I use the latest stable Rust version. diff_chars has the same or at least similar problems. It looks like it knows what's wrong, but the formatting breaks.

romankoblov commented 5 years ago

Hello! I fixed the first two in 0.2.5 (it was a formatting problem: the color was stripped after colored whitespace). The third one is more complex. It detects the diff as:

    Equal(["des", " "])
    Replace(["kriminellen"], ["kriminell"])
    Equal([" "])
    Insert(["erstellten", " "])
    Equal(["Videos"])

The whitespace after "kriminellen" ends up between "kriminell" and "erstellten", so it's kinda tricky. But I will think about it.
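To make the space placement easier to see, here is a minimal sketch that replays the operations listed above. The `Op` enum and the helper functions are hypothetical stand-ins written for illustration, not prettydiff's actual types, and the `[-…-]` / `{+…+}` markers replace the terminal colors.

```rust
/// Hypothetical stand-in for the diff operations quoted above;
/// this is NOT prettydiff's real type, just an illustration.
#[derive(Debug, Clone, PartialEq)]
enum Op {
    Equal(Vec<&'static str>),
    Insert(Vec<&'static str>),
    Replace(Vec<&'static str>, Vec<&'static str>),
}

/// The operations reported for the third example.
fn third_example() -> Vec<Op> {
    vec![
        Op::Equal(vec!["des", " "]),
        Op::Replace(vec!["kriminellen"], vec!["kriminell"]),
        Op::Equal(vec![" "]),
        Op::Insert(vec!["erstellten", " "]),
        Op::Equal(vec!["Videos"]),
    ]
}

/// Rebuild the old text: Equal tokens plus the left side of Replace.
fn old_text(ops: &[Op]) -> String {
    let mut out = String::new();
    for op in ops {
        match op {
            Op::Equal(t) => out.push_str(&t.concat()),
            Op::Replace(old, _) => out.push_str(&old.concat()),
            Op::Insert(_) => {}
        }
    }
    out
}

/// Rebuild the new text: Equal and Insert tokens plus the right side of Replace.
fn new_text(ops: &[Op]) -> String {
    let mut out = String::new();
    for op in ops {
        match op {
            Op::Equal(t) | Op::Insert(t) => out.push_str(&t.concat()),
            Op::Replace(_, new) => out.push_str(&new.concat()),
        }
    }
    out
}

/// Render with plain-text markers: [-removed-] and {+added+}.
fn render(ops: &[Op]) -> String {
    let mut out = String::new();
    for op in ops {
        match op {
            Op::Equal(t) => out.push_str(&t.concat()),
            Op::Insert(t) => out.push_str(&format!("{{+{}+}}", t.concat())),
            Op::Replace(old, new) => {
                out.push_str(&format!("[-{}-]{{+{}+}}", old.concat(), new.concat()))
            }
        }
    }
    out
}
```

Calling `render(&third_example())` shows the inserted run as `{+erstellten +}`: the highlighted insertion carries the trailing space, while the unhighlighted space sits before it. Both `old_text` and `new_text` still reproduce the two input strings exactly, which is why the output is correct even though it looks odd.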

Pfeil commented 5 years ago

Thank you very much, this already helps a lot :)

romankoblov commented 5 years ago

So, about the space issue: I thought about it a lot. It is possible to do some hack for this specific case, but I don't think it's worth it, since the output is reasonable (even if I don't like it). Maybe I will find a nice solution for this later, but I'm closing this for now. Thanks for reporting the issue!
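For readers wondering what such a special-case hack might look like, here is one possible sketch. It is an assumption for illustration only, not something the crate does: the `Op` enum and `shift_trailing_space` are hypothetical names. The idea is that when a single whitespace `Equal` is followed by an `Insert` ending in that same whitespace token, the token can be moved to the front of the `Insert` without changing either reconstructed text.

```rust
/// Hypothetical stand-in for the diff operations; NOT prettydiff's real type.
#[derive(Debug, Clone, PartialEq)]
enum Op {
    Equal(Vec<&'static str>),
    Insert(Vec<&'static str>),
    Replace(Vec<&'static str>, Vec<&'static str>),
}

/// Hypothetical post-processing pass: rewrite
///     Equal([ws]), Insert([a, ..., ws])
/// into
///     Insert([ws, a, ...]), Equal([ws])
/// where `ws` is a whitespace-only token. Both forms produce the same
/// old and new text; only the highlighted region moves.
fn shift_trailing_space(ops: Vec<Op>) -> Vec<Op> {
    let mut out: Vec<Op> = Vec::with_capacity(ops.len());
    for op in ops {
        // Check whether the previously emitted op and the current one
        // match the pattern; if so, compute the rotated Insert tokens.
        let shifted = match (out.last(), &op) {
            (Some(Op::Equal(eq)), Op::Insert(ins))
                if eq.len() == 1
                    && eq[0].trim().is_empty()
                    && ins.len() > 1
                    && ins.last() == eq.first() =>
            {
                let sep = eq[0];
                let mut moved = vec![sep];
                moved.extend_from_slice(&ins[..ins.len() - 1]);
                Some((moved, sep))
            }
            _ => None,
        };
        match shifted {
            Some((moved, sep)) => {
                out.pop(); // drop the Equal([ws]) emitted earlier
                out.push(Op::Insert(moved));
                out.push(Op::Equal(vec![sep]));
            }
            None => out.push(op),
        }
    }
    out
}
```

Applied to the operations from the third example, the insertion becomes `Insert([" ", "erstellten"])` followed by `Equal([" "])`, so the highlight would cover " erstellten" right after "kriminell". The sketch does not merge the resulting adjacent `Equal` ops and handles only this one pattern, which is the kind of narrowness that makes it a hack rather than a general solution.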