Open ggrossetie opened 4 months ago
I did another git bisect (while making sure that bin was deleted) and I can reproduce it with https://github.com/libgit2/libgit2sharp/commit/21d4f13ac7c739a5526cf088fbd8765d4ad12f57, where libgit2 was updated to 1.2.0.
That was two and a half years ago. A lot has changed since then. It's going to be tricky to isolate the problem.
One question to start - what's the perf if you have one repository per thread instead of sharing the repository across threads?
Thanks for your reply!
> One question to start - what's the perf if you have one repository per thread instead of sharing the repository across threads?
You mean one Repository object (using the same path) per thread? I can try that 👍🏻
Also, I did try to implement a similar benchmark test in libgit2, but my skills in C are insufficient.
> One question to start - what's the perf if you have one repository per thread instead of sharing the repository across threads?
Spot on! Using one Repository instance per thread does the trick 👍🏻
Should we document how to properly use libgit2(sharp) in a multi-threaded environment? Or maybe it's possible to (re)enable concurrent reads on a single Repository instance?
Right, really good question. I was mostly hoping to stem the bleeding and get you to a performant situation. Digging into why it's slow would be really interesting.
https://github.com/libgit2/libgit2/blob/main/docs/threading.md#sharing-objects has a bit of a discussion about libgit2's threading policy (which LibGit2Sharp should also document, as it should be identical; I don't think that there's anything in LibGit2Sharp that makes threading any different).
Without actually doing a serious investigation: probably you're hitting a lock on an object cache. Why that got worse, I don't know, and we should look into that. But the most performant way to do what you're doing is multiple Repository instances (although there will be a bit of a memory usage increase, since you won't have a cache shared across a single repository).
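For reference, a minimal sketch of the one-Repository-per-thread approach, not the benchmark from this issue. The class name ParallelBlobReader and the method ReadRootBlobs are hypothetical, and the sketch assumes the repository has at least one commit on HEAD. It only reads the blobs at the root of HEAD's tree.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using LibGit2Sharp;

// Hypothetical helper, for illustration only.
static class ParallelBlobReader
{
    // Reads every blob at the root of HEAD's tree in parallel and returns
    // the total number of characters read. Each worker thread opens its own
    // Repository on the same path instead of sharing a single instance.
    public static long ReadRootBlobs(string repoPath)
    {
        // Collect the blob paths once, using a short-lived Repository.
        string[] paths;
        using (var repo = new Repository(repoPath))
        {
            paths = repo.Head.Tip.Tree
                .Where(entry => entry.TargetType == TreeEntryTargetType.Blob)
                .Select(entry => entry.Path)
                .ToArray();
        }

        long totalChars = 0;

        // trackAllValues lets us dispose every per-thread Repository afterwards.
        using var perThread = new ThreadLocal<Repository>(
            () => new Repository(repoPath), trackAllValues: true);
        try
        {
            Parallel.ForEach(paths, path =>
            {
                Repository repo = perThread.Value; // this thread's own instance
                var blob = (Blob)repo.Head.Tip[path].Target;
                Interlocked.Add(ref totalChars, blob.GetContentText().Length);
            });
        }
        finally
        {
            foreach (var repo in perThread.Values)
            {
                repo.Dispose();
            }
        }

        return totalChars;
    }
}
```

The trade-off is the one mentioned above: each instance keeps its own cache, so memory usage grows with the number of worker threads.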
Thanks again for your insight.
> libgit2/libgit2@main/docs/threading.md#sharing-objects has a bit of a discussion about libgit2's threading policy (which LibGit2Sharp should also document as it should be identical
👍🏻
> I don't think that there's anything in LibGit2Sharp that makes threading any different
> Without actually doing a serious investigation: probably you're hitting a lock on an object cache
Should we move this issue to https://github.com/libgit2/libgit2?
> But the most performant way to do what you're doing is multiple Repository instances (although there will be a bit of a memory usage increase, since you won't have a cache shared across a single repository).
Alright! I wasn't sure that it was indeed the most performant solution. I will update my code base accordingly. Thank you!
Reproduction steps
CommitFixture.cs:
CanReadCommitParallel takes 1417ms and CanReadCommit takes 2971ms (which is fine)
CanReadCommitParallel takes 3731ms and CanReadCommit takes 3250ms!
Expected behavior
Reading files in parallel should be faster than reading files sequentially from the git tree.
Actual behavior
It seems that reading files from the tree in parallel (multi-thread) is not faster. I did a git bisect and it seems that this regression was introduced in https://github.com/libgit2/libgit2sharp/commit/21d4f13ac7c739a5526cf088fbd8765d4ad12f57
Version of LibGit2Sharp (release number or SHA1)
Versions after https://github.com/libgit2/libgit2sharp/commit/21d4f13ac7c739a5526cf088fbd8765d4ad12f57
Operating system(s) tested; .NET runtime tested
.NET 6 and .NET 7.