pascalkuthe / imara-diff

Reliably performant diffing
Apache License 2.0
106 stars 9 forks

Just had to say 'thanks' 🙏 #2

Closed Byron closed 1 year ago

Byron commented 1 year ago

Just now I have integrated imara-diff into gitoxide, replacing similar for a stunning 2x speedup when running ein t h -l. Thanks to your work gitoxide can excel even with diff performance, a previously undercooked feature that I cobbled together quickly. With imara-diff I have the feeling that I can build on top of a crate that acknowledges git and sees it as a baseline, along with the desire to improve on it. It's probably what I would have wanted to have in git-diff verbatim if there had been enough time (but… I am also glad I didn't have to implement it myself 😅).

Thank you sooo much 🙏

pascalkuthe commented 1 year ago

Awesome :rocket: I was planning to send a PR at some point but you were faster :D I initially tried to use ein -l as a benchmark while developing imara-diff and actually had local changes similar to your PR. However multithreading + io + tree diff caused too much interference so I wrote my own benchmark where the treediff is performed upfront. I also wasn't sure how you would want to expose the API so I just sort of let it sit.

I just noticed crate-status.md still refers to similar. Might want to update that (and mark that task as done?):

  • lines
    • [ ] Simple line-by-line diffs powered by the similar crate.

What are your thoughts on an alternative to git diff in gix (or ein)? I think most of what's needed from gitoxide is already there and the only thing imara-diff is potentially missing is a special purpose algorithm for performing the word diff inside the hunks (git doesn't use its normal diff algorithm for word diffs and I still need to dig into what they are using instead).
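To make the "word diff inside the hunks" idea a bit more concrete, here is a minimal sketch of what such a pass could look like: split a changed line into word and whitespace tokens, then diff the token sequences with a plain longest-common-subsequence table. This is an assumption-laden illustration only. It is not the algorithm git uses for word diffs (which, as said above, still needs digging into), and it does not use the imara-diff API. The names `Op`, `tokenize` and `word_diff` are hypothetical.

```rust
/// One step of a word-level diff (hypothetical type, for illustration only).
#[derive(Debug, PartialEq)]
enum Op<'a> {
    Keep(&'a str),
    Delete(&'a str),
    Insert(&'a str),
}

/// Splits a line into alternating runs of whitespace and non-whitespace,
/// so that joining the tokens reproduces the line exactly.
fn tokenize(line: &str) -> Vec<&str> {
    let mut tokens = Vec::new();
    let mut start = 0;
    let mut prev_ws: Option<bool> = None;
    for (i, c) in line.char_indices() {
        let ws = c.is_whitespace();
        if let Some(p) = prev_ws {
            if p != ws {
                // The character class changed, so the current run ends here.
                tokens.push(&line[start..i]);
                start = i;
            }
        }
        prev_ws = Some(ws);
    }
    if start < line.len() {
        tokens.push(&line[start..]);
    }
    tokens
}

/// Diffs two lines token by token using a quadratic LCS table.
/// This is a simple stand-in, not git's word diff algorithm.
fn word_diff<'a>(before: &'a str, after: &'a str) -> Vec<Op<'a>> {
    let a = tokenize(before);
    let b = tokenize(after);
    // lcs[i][j] = length of the longest common subsequence of a[i..] and b[j..]
    let mut lcs = vec![vec![0usize; b.len() + 1]; a.len() + 1];
    for i in (0..a.len()).rev() {
        for j in (0..b.len()).rev() {
            lcs[i][j] = if a[i] == b[j] {
                lcs[i + 1][j + 1] + 1
            } else {
                lcs[i + 1][j].max(lcs[i][j + 1])
            };
        }
    }
    // Walk the table from the front, emitting one operation per token.
    let (mut i, mut j) = (0, 0);
    let mut ops = Vec::new();
    while i < a.len() && j < b.len() {
        if a[i] == b[j] {
            ops.push(Op::Keep(a[i]));
            i += 1;
            j += 1;
        } else if lcs[i + 1][j] >= lcs[i][j + 1] {
            ops.push(Op::Delete(a[i]));
            i += 1;
        } else {
            ops.push(Op::Insert(b[j]));
            j += 1;
        }
    }
    // Whatever is left on either side is a pure deletion or insertion.
    ops.extend(a[i..].iter().map(|&t| Op::Delete(t)));
    ops.extend(b[j..].iter().map(|&t| Op::Insert(t)));
    ops
}
```

Calling `word_diff("let x = 1;", "let y = 1;")` keeps every token except `x`, which is deleted, and `y`, which is inserted. The quadratic table is fine for single lines or small hunks; anything larger would want a real diff algorithm over interned tokens.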

Byron commented 1 year ago

However multithreading + io + tree diff caused too much interference so I wrote my own benchmark where the treediff is performed upfront. I also wasn't sure how you would want to expose the API so I just sort of let it sit.

That makes perfect sense! If it comes up again and makes a difference, gix should respect the global -t flag to configure the number of threads used.

I just noticed crate-status.md still refers to similar. Might want to update that (and mark that task as done?):

Thanks for the pointer, a clear oversight which is now fixed via shoehorning it into this PR.

What are your thoughts on an alternative to git diff in gix (or ein)?

gix diff would be fantastic to have! If you wanted to give it a shot, it can be whatever you think is right, so you can expect fast turnarounds. gix is like a developer platform to be able to try new features outside of test sandboxes, and despite the aspiration for the sub-commands to be correct, they don't have to be fancy or groundbreaking. Usually this makes for a fun time when implementing those.

One day, with all the lessons learned in gix and everywhere else, I hope to boil it all down to the 'optimal' workflow that is then poured into ein, probably not even as ein diff but as part of the natural development workflow that it guides people through.

…and the only thing imara-diff is potentially missing is a special purpose algorithm for performing the word diff inside the hunks (git doesn't use its normal diff algorithm for word diffs and I still need to dig into what they are using instead).

I am absolutely looking forward to it and think that word diffing would indeed be a good candidate to try out via gix diff, along which a decent API could also be developed.

pascalkuthe commented 1 year ago

gix diff would be fantastic to have! If you wanted to give it a shot, it can be whatever you think is right, so you can expect fast turnarounds. gix is like a developer platform to be able to try new features outside of test sandboxes, and despite the aspiration for the sub-commands to be correct, they don't have to be fancy or groundbreaking. Usually this makes for a fun time when implementing those.

One day, with all the lessons learned in gix and everywhere else, I hope to boil it all down to the 'optimal' workflow that is then poured into ein, probably not even as ein diff but as part of the natural development workflow that it guides people through.

…and the only thing imara-diff is potentially missing is a special purpose algorithm for performing the word diff inside the hunks (git doesn't use its normal diff algorithm for word diffs and I still need to dig into what they are using instead).

I am absolutely looking forward to it and think that word diffing would indeed be a good candidate to try out via gix diff, along which a decent API could also be developed.

That sounds great, I would love to give an implementation of gix diff a shot! I am still working on ropey this week to make line-diffs faster for helix and might be a little busy next week, but once I get a chance I will definitely try my hand at this. A gix diff CLI would definitely make working on imara-diff much, much easier.

Is there some avenue where I might ask a few questions when I start working on that? Issues doesn't seem like the right place for development discussion.

Byron commented 1 year ago

That sounds great, I would love to give an implementation of gix diff a shot!

❤️🎉

A gix diff CLI would definitely make working on imara-diff much much easier.

Most definitely, if only to play around by hand a bit more. I'd also believe that, thanks to the availability of unified diff generation, it will be straightforward to write neat tests, with insta for example. And depending on how close the implementation wants to be to git, one can also write baseline tests that validate one's own algorithm against the output that git produces.

Is there some avenue where I might ask a few questions when I start working on that? Issues doesn't seem like the right place for development discussion.

You could use discussions on GitHub or ping me on keybase. I am also on zulip, which is probably an even better place. Looking forward to it, happy to help!

Byron commented 1 year ago

On another note: have you considered applying for a Rust Foundation grant? To me it seems obvious why you should receive additional support to do what you do. For reference, when I applied for the grant I am working on right now, it took about an hour to put it together. Since then, I have spent no more than 2h in total on grant administration, so it's safe to say that it is very low maintenance, and it's very pleasant to interact with the folks from the Rust Foundation as well.

If you have any questions about this or there is anything I can do to help, please let me know, happy to help.