Closed midepeter closed 5 months ago
I put those two results in files so we can compare them with benchstat.
$ tail -v -n5 with*txt
==> with-pointer.txt <==
merge_bench_test.go:161: 71922 files merged into 43097 files
BenchmarkMergeFiles/MergeFiles_100Groups-16 71922 156825 ns/op 7780 B/op 216 allocs/op
PASS
ok github.com/moov-io/ach 83.807s
==> without-pointer.txt <==
merge_bench_test.go:161: 96330 files merged into 32010 files
BenchmarkMergeFiles/MergeFiles_100Groups-16 96330 130157 ns/op 6454 B/op 150 allocs/op
PASS
ok github.com/moov-io/ach 85.414s
Running benchstat to compare them shows a pretty major decrease in without-pointer.txt: the geomean drops by 20.66% for sec/op, 15.60% for B/op, and 30.52% for allocs/op.
$ benchstat with-pointer.txt without-pointer.txt
goos: linux
goarch: amd64
pkg: github.com/moov-io/ach
cpu: 11th Gen Intel(R) Core(TM) i7-11850H @ 2.50GHz
                                  │ with-pointer.txt │       without-pointer.txt        │
                                  │      sec/op      │    sec/op       vs base          │
MergeFiles/MergeFiles-16            68.73µ ± ∞ ¹       63.27µ ± ∞ ¹    ~ (p=1.000 n=1) ²
MergeFiles/MergeFiles_3Groups-16    173.8µ ± ∞ ¹       121.4µ ± ∞ ¹    ~ (p=1.000 n=1) ²
MergeFiles/MergeFiles_5Groups-16    159.3µ ± ∞ ¹       125.3µ ± ∞ ¹    ~ (p=1.000 n=1) ²
MergeFiles/MergeFiles_10Groups-16   165.7µ ± ∞ ¹       124.1µ ± ∞ ¹    ~ (p=1.000 n=1) ²
MergeFiles/MergeFiles_100Groups-16  156.8µ ± ∞ ¹       130.2µ ± ∞ ¹    ~ (p=1.000 n=1) ²
geomean                             137.7µ             109.2µ          -20.66%
¹ need >= 6 samples for confidence interval at level 0.95
² need >= 4 samples to detect a difference at alpha level 0.05
                                  │ with-pointer.txt │       without-pointer.txt        │
                                  │       B/op       │     B/op        vs base          │
MergeFiles/MergeFiles-16            3.109Ki ± ∞ ¹      3.334Ki ± ∞ ¹   ~ (p=1.000 n=1) ²
MergeFiles/MergeFiles_3Groups-16    8.598Ki ± ∞ ¹      6.420Ki ± ∞ ¹   ~ (p=1.000 n=1) ²
MergeFiles/MergeFiles_5Groups-16    7.742Ki ± ∞ ¹      6.414Ki ± ∞ ¹   ~ (p=1.000 n=1) ²
MergeFiles/MergeFiles_10Groups-16   8.160Ki ± ∞ ¹      6.353Ki ± ∞ ¹   ~ (p=1.000 n=1) ²
MergeFiles/MergeFiles_100Groups-16  7.598Ki ± ∞ ¹      6.303Ki ± ∞ ¹   ~ (p=1.000 n=1) ²
geomean                             6.632Ki            5.598Ki         -15.60%
¹ need >= 6 samples for confidence interval at level 0.95
² need >= 4 samples to detect a difference at alpha level 0.05
                                  │ with-pointer.txt │       without-pointer.txt        │
                                  │    allocs/op     │   allocs/op     vs base          │
MergeFiles/MergeFiles-16            88.00 ± ∞ ¹        76.00 ± ∞ ¹     ~ (p=1.000 n=1) ²
MergeFiles/MergeFiles_3Groups-16    245.0 ± ∞ ¹        150.0 ± ∞ ¹     ~ (p=1.000 n=1) ²
MergeFiles/MergeFiles_5Groups-16    220.0 ± ∞ ¹        150.0 ± ∞ ¹     ~ (p=1.000 n=1) ²
MergeFiles/MergeFiles_10Groups-16   232.0 ± ∞ ¹        150.0 ± ∞ ¹     ~ (p=1.000 n=1) ²
MergeFiles/MergeFiles_100Groups-16  216.0 ± ∞ ¹        150.0 ± ∞ ¹     ~ (p=1.000 n=1) ²
geomean                             188.5              130.9           -30.52%
¹ need >= 6 samples for confidence interval at level 0.95
² need >= 4 samples to detect a difference at alpha level 0.05
Removing pointers from public API methods is a breaking change, so it's harder to merge and release as it will break everyone using the library.
Hm yes, you are so right.
What do you intend should be done?
We're talking it over internally to figure out how far we want to break things (if at all). I think we want to try removing pointers entirely from the project if the gains are this large, but that's an even larger change.
@ckbaum do you have any thoughts?
Oh Alright
Beautiful😊
Hi @adamdecaf
Trust you are doing great. I wanted to know if there is an opportunity to work with the moov team or at moov. I really like what is being done here. I am pretty good with Go, Rust, and Python. I currently work part time for a fintech startup, but I am looking for a more challenging role.
I looked up the careers page on the moov website for a backend software engineering role, but there seems to be a US-based restriction🥲. I just want to know if there is an opportunity to work for moov. I could start out maintaining the open source repos with some compensation for a while before integrating with the company fully, if possible.
Or should I reach out to @wadearnold directly?
Looking forward to hearing from you.
Thank you Peter 😊
Yes I'll follow up with you.
Sorry late to this! Not the end of the world, but yeah we'd be a little grumbly if all the method signatures in the project changed with a version update. Partly because at least for my bank, performance of file-based operations is not a huge priority; ACH is naturally such a slow async process that I doubt we'd notice even a 20% speed improvement of any given operation.
But hey, faster is better 👍 we'd get over it. Especially if there are other breaking changes on the docket.
Thanks for the reply @ckbaum and I agree it may not be very noticeable in a lot of systems. I don't think this PR is enough to force a breaking change. Perhaps we can remove pointers internally and increase the speed without breaking calling code.
Oh wow!
Thank you for this. I will be expecting to hear from you😊
So you mean we should remove the internal pointers and keep the pointers in the public API that calling code uses, for backward compatibility?
If we can keep the same public API this PR can be merged without forcing change on people who use it.
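As a rough illustration of "same public API, no pointers internally", here is a minimal sketch. Everything in it is an assumption for the sake of the example: File is a tiny stand-in for ach.File, MergeFilesSketch has a simplified signature that is not the library's, and mergeValues is placeholder logic that just groups by ID.

```go
package main

// File is a hypothetical stand-in for ach.File; the real struct is much larger.
type File struct {
	ID      string
	Entries []string
}

// MergeFilesSketch keeps a pointer-based signature so existing callers
// would compile unchanged, but does its work on a slice of values.
func MergeFilesSketch(incoming []*File) []*File {
	values := make([]File, 0, len(incoming))
	for _, f := range incoming {
		if f != nil {
			values = append(values, *f) // shallow copy of the struct
		}
	}
	merged := mergeValues(values)

	// Convert back to pointers only at the API boundary. Each pointer
	// refers to an element of merged, so no per-element allocation happens here.
	out := make([]*File, len(merged))
	for i := range merged {
		out[i] = &merged[i]
	}
	return out
}

// mergeValues is placeholder merge logic: files with the same ID are
// combined by concatenating their entries, in first-seen order.
func mergeValues(files []File) []File {
	index := make(map[string]int, len(files))
	var out []File
	for _, f := range files {
		if i, ok := index[f.ID]; ok {
			out[i].Entries = append(out[i].Entries, f.Entries...)
			continue
		}
		index[f.ID] = len(out)
		// Copy the entries so the result does not alias the caller's slice.
		out = append(out, File{ID: f.ID, Entries: append([]string(nil), f.Entries...)})
	}
	return out
}
```

Note that the conversion in and out of the value slice is itself work, so an internal-only change like this may recover only part of the gain measured when the public signature changes too.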
Okay perfect.
Let me work on this.
@adamdecaf
I have tried to implement it the way you said. I realized the code is tightly coupled across the structure, and once I got it working the performance changes were nothing to write home about. I think we should probably look for a different way of improving this, which I have started to research and work on. I will give you a heads up if I find anything.
Okay thanks. I'm going to close this PR for now since it's a breaking change and we're not ready to adopt that. I appreciate the help and we can look into using fewer pointers internally to reduce memory operations.
This commit gives a benchmark report on the use of []*ach.File and []ach.File. It intends to show the improvement from passing by value instead of passing by reference.
My benchmark results when using []*ach.File:
Also, here are the results when using []ach.File (without the pointer):
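One reason the two slice types differ in allocs/op can be illustrated in isolation. The sketch below is not the PR's benchmark: record is a hypothetical stand-in for ach.File, and the helper names are invented for this example. It uses testing.AllocsPerRun from the standard library to count heap allocations when building each kind of slice.

```go
package main

import "testing"

// record is a hypothetical stand-in for ach.File, which is far larger in practice.
type record struct {
	ID     int
	Amount int64
	Name   string
}

// Package-level sinks make the slices escape to the heap, so the compiler
// cannot optimize the allocations away.
var (
	ptrSink []*record
	valSink []record
)

// buildPointers fills a []*record: the slice is one allocation and each
// element it points to is allocated separately.
func buildPointers(n int) {
	s := make([]*record, n)
	for i := range s {
		s[i] = &record{ID: i}
	}
	ptrSink = s
}

// buildValues fills a []record: the elements live inline in the slice's
// backing array, so no per-element allocation is needed.
func buildValues(n int) {
	s := make([]record, n)
	for i := range s {
		s[i] = record{ID: i}
	}
	valSink = s
}

// allocCounts reports the average heap allocations per call for both layouts.
func allocCounts(n int) (pointers, values float64) {
	pointers = testing.AllocsPerRun(100, func() { buildPointers(n) })
	values = testing.AllocsPerRun(100, func() { buildValues(n) })
	return pointers, values
}
```

Calling allocCounts(100) should report far more allocations for the pointer layout than for the value layout, though exact counts depend on the compiler's escape analysis. This only measures construction: copying large structs by value has its own cost, so a benchmark on the real ach.File is still the deciding measurement.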