moov-io / ach

ACH implements a reader, writer, and validator for Automated Clearing House (ACH) files. The HTTP server is available in a Docker image and the Go package is available.
https://moov-io.github.io/ach/
Apache License 2.0
447 stars 150 forks source link

perf: 'mergefiles' improvements #1370

Closed midepeter closed 5 months ago

midepeter commented 5 months ago

This commit gives a benchmark report on the use of []*ach.File and []ach.File. It intend to show the improvement in passing by values instead of passing by reference

My benchmark results when using []*ach.File

go test . -bench BenchmarkMergeFiles -v -run BenchmarkMergeFiles --benchtime 10s goos:darwin
goos: linux
goarch: amd64
pkg: github.com/moov-io/ach
cpu: 11th Gen Intel(R) Core(TM) i7-11850H @ 2.50GHz
BenchmarkMergeFiles
BenchmarkMergeFiles/MergeFiles
    merge_bench_test.go:148: 1 files merged into 1 files
    merge_bench_test.go:148: 100 files merged into 40 files
    merge_bench_test.go:148: 10000 files merged into 4000 files
    merge_bench_test.go:148: 180822 files merged into 72330 files
BenchmarkMergeFiles/MergeFiles-16                 180822             68727 ns/op            3184 B/op         88 allocs/op
BenchmarkMergeFiles/MergeFiles_3Groups
    merge_bench_test.go:152: 1 files merged into 1 files
    merge_bench_test.go:152: 100 files merged into 55 files
    merge_bench_test.go:152: 10000 files merged into 6005 files
    merge_bench_test.go:152: 76762 files merged into 57046 files
BenchmarkMergeFiles/MergeFiles_3Groups-16          76762            173807 ns/op            8804 B/op        245 allocs/op
BenchmarkMergeFiles/MergeFiles_5Groups
    merge_bench_test.go:155: 1 files merged into 1 files
    merge_bench_test.go:155: 100 files merged into 61 files
    merge_bench_test.go:155: 10000 files merged into 6096 files
    merge_bench_test.go:155: 73902 files merged into 45783 files
BenchmarkMergeFiles/MergeFiles_5Groups-16          73902            159337 ns/op            7928 B/op        220 allocs/op
BenchmarkMergeFiles/MergeFiles_10Groups
    merge_bench_test.go:158: 1 files merged into 1 files
    merge_bench_test.go:158: 100 files merged into 61 files
    merge_bench_test.go:158: 10000 files merged into 6039 files
    merge_bench_test.go:158: 75115 files merged into 51119 files
BenchmarkMergeFiles/MergeFiles_10Groups-16                 75115            165668 ns/op            8356 B/op        232 allocs/op
BenchmarkMergeFiles/MergeFiles_100Groups
    merge_bench_test.go:161: 1 files merged into 1 files
    merge_bench_test.go:161: 100 files merged into 40 files
    merge_bench_test.go:161: 10000 files merged into 6972 files
    merge_bench_test.go:161: 71922 files merged into 43097 files
BenchmarkMergeFiles/MergeFiles_100Groups-16                71922            156825 ns/op            7780 B/op        216 allocs/op
PASS
ok      github.com/moov-io/ach  83.807s

Also here are the results when using the []ach.File (without the pointer)

go test . -bench BenchmarkMergeFiles -v -run BenchmarkMergeFiles --benchtime 10s goos:darwin
goos: linux
goarch: amd64
pkg: github.com/moov-io/ach
cpu: 11th Gen Intel(R) Core(TM) i7-11850H @ 2.50GHz
BenchmarkMergeFiles
BenchmarkMergeFiles/MergeFiles
    merge_bench_test.go:148: 1 files merged into 1 files
    merge_bench_test.go:148: 100 files merged into 34 files
    merge_bench_test.go:148: 10000 files merged into 3334 files
    merge_bench_test.go:148: 200274 files merged into 66759 files
BenchmarkMergeFiles/MergeFiles-16                 200274             63268 ns/op            3414 B/op         76 allocs/op
BenchmarkMergeFiles/MergeFiles_3Groups
    merge_bench_test.go:152: 1 files merged into 1 files
    merge_bench_test.go:152: 100 files merged into 31 files
    merge_bench_test.go:152: 10000 files merged into 3331 files
    merge_bench_test.go:152: 101288 files merged into 33760 files
BenchmarkMergeFiles/MergeFiles_3Groups-16         101288            121395 ns/op            6574 B/op        150 allocs/op
BenchmarkMergeFiles/MergeFiles_5Groups
    merge_bench_test.go:155: 1 files merged into 1 files
    merge_bench_test.go:155: 100 files merged into 31 files
    merge_bench_test.go:155: 10000 files merged into 3331 files
    merge_bench_test.go:155: 99624 files merged into 33206 files
BenchmarkMergeFiles/MergeFiles_5Groups-16          99624            125300 ns/op            6568 B/op        150 allocs/op
BenchmarkMergeFiles/MergeFiles_10Groups
    merge_bench_test.go:158: 1 files merged into 1 files
    merge_bench_test.go:158: 100 files merged into 31 files
    merge_bench_test.go:158: 10000 files merged into 3329 files
    merge_bench_test.go:158: 93589 files merged into 31191 files
BenchmarkMergeFiles/MergeFiles_10Groups-16                 93589            124098 ns/op            6505 B/op        150 allocs/op
BenchmarkMergeFiles/MergeFiles_100Groups
    merge_bench_test.go:161: 1 files merged into 1 files
    merge_bench_test.go:161: 100 files merged into 34 files
    merge_bench_test.go:161: 10000 files merged into 3301 files
    merge_bench_test.go:161: 96330 files merged into 32010 files
BenchmarkMergeFiles/MergeFiles_100Groups-16                96330            130157 ns/op            6454 B/op        150 allocs/op
PASS
ok      github.com/moov-io/ach  85.414s
adamdecaf commented 5 months ago

I put those two results in files so we can compare them with benchstat.

$ tail -v -n5 with*txt

==> with-pointer.txt <==
    merge_bench_test.go:161: 71922 files merged into 43097 files
BenchmarkMergeFiles/MergeFiles_100Groups-16                71922            156825 ns/op            7780 B/op        216 allocs/op
PASS
ok      github.com/moov-io/ach  83.807s

==> without-pointer.txt <==
    merge_bench_test.go:161: 96330 files merged into 32010 files
BenchmarkMergeFiles/MergeFiles_100Groups-16                96330            130157 ns/op            6454 B/op        150 allocs/op
PASS
ok      github.com/moov-io/ach  85.414s

Running benchstat to compare them shows a pretty major decrease in without-pointer.txt

$ benchstat with-pointer.txt without-pointer.txt 
goos: linux
goarch: amd64
pkg: github.com/moov-io/ach
cpu: 11th Gen Intel(R) Core(TM) i7-11850H @ 2.50GHz
                                   │ with-pointer.txt │          without-pointer.txt          │
                                   │      sec/op      │    sec/op     vs base                 │
MergeFiles/MergeFiles-16                 68.73µ ± ∞ ¹   63.27µ ± ∞ ¹        ~ (p=1.000 n=1) ²
MergeFiles/MergeFiles_3Groups-16         173.8µ ± ∞ ¹   121.4µ ± ∞ ¹        ~ (p=1.000 n=1) ²
MergeFiles/MergeFiles_5Groups-16         159.3µ ± ∞ ¹   125.3µ ± ∞ ¹        ~ (p=1.000 n=1) ²
MergeFiles/MergeFiles_10Groups-16        165.7µ ± ∞ ¹   124.1µ ± ∞ ¹        ~ (p=1.000 n=1) ²
MergeFiles/MergeFiles_100Groups-16       156.8µ ± ∞ ¹   130.2µ ± ∞ ¹        ~ (p=1.000 n=1) ²
geomean                                  137.7µ         109.2µ        -20.66%
¹ need >= 6 samples for confidence interval at level 0.95
² need >= 4 samples to detect a difference at alpha level 0.05

                                   │ with-pointer.txt │          without-pointer.txt           │
                                   │       B/op       │     B/op       vs base                 │
MergeFiles/MergeFiles-16                3.109Ki ± ∞ ¹   3.334Ki ± ∞ ¹        ~ (p=1.000 n=1) ²
MergeFiles/MergeFiles_3Groups-16        8.598Ki ± ∞ ¹   6.420Ki ± ∞ ¹        ~ (p=1.000 n=1) ²
MergeFiles/MergeFiles_5Groups-16        7.742Ki ± ∞ ¹   6.414Ki ± ∞ ¹        ~ (p=1.000 n=1) ²
MergeFiles/MergeFiles_10Groups-16       8.160Ki ± ∞ ¹   6.353Ki ± ∞ ¹        ~ (p=1.000 n=1) ²
MergeFiles/MergeFiles_100Groups-16      7.598Ki ± ∞ ¹   6.303Ki ± ∞ ¹        ~ (p=1.000 n=1) ²
geomean                                 6.632Ki         5.598Ki        -15.60%
¹ need >= 6 samples for confidence interval at level 0.95
² need >= 4 samples to detect a difference at alpha level 0.05

                                   │ with-pointer.txt │         without-pointer.txt          │
                                   │    allocs/op     │  allocs/op   vs base                 │
MergeFiles/MergeFiles-16                  88.00 ± ∞ ¹   76.00 ± ∞ ¹        ~ (p=1.000 n=1) ²
MergeFiles/MergeFiles_3Groups-16          245.0 ± ∞ ¹   150.0 ± ∞ ¹        ~ (p=1.000 n=1) ²
MergeFiles/MergeFiles_5Groups-16          220.0 ± ∞ ¹   150.0 ± ∞ ¹        ~ (p=1.000 n=1) ²
MergeFiles/MergeFiles_10Groups-16         232.0 ± ∞ ¹   150.0 ± ∞ ¹        ~ (p=1.000 n=1) ²
MergeFiles/MergeFiles_100Groups-16        216.0 ± ∞ ¹   150.0 ± ∞ ¹        ~ (p=1.000 n=1) ²
geomean                                   188.5         130.9        -30.52%
¹ need >= 6 samples for confidence interval at level 0.95
² need >= 4 samples to detect a difference at alpha level 0.05
adamdecaf commented 5 months ago

Removing pointers from public API methods is a breaking change, so it's harder to merge and release as it will break everyone using the library.

midepeter commented 5 months ago

Hm yes, you are so right.

What do you intend should be done?

adamdecaf commented 5 months ago

We're talking it over internally to figure out how far we want to break things (if at all). I think we want to try removing pointers entirely from the project if the gains are this large, but that's an even larger change.

@ckbaum do you have any thoughts?

midepeter commented 5 months ago

Oh Alright

Beautiful😊

midepeter commented 5 months ago

Hi @adamdecaf

Trust you are doing great. Wanted to know if there is an opportunity to work with the moov team or at moov. I really like what is being done here.. I am pretty good with golang, rust and python. I current work part time for a fintech startup also but looking for a more challenging role and all.

I looked up careers page on moov website for software engineering role(backend) but there seem to a US based restriction🥲. Just want to know if there is an opportunity to work for moov. I could start out with maintaining the open source repos with some compensation for a while before integrating with the company fully if possible.

Or I reach out to @wadearnold directly?

Looking forward to hearing from you.

Thank you Peter 😊

adamdecaf commented 5 months ago

Yes I'll follow up with you.

ckbaum commented 5 months ago

We're talking it over internally to figure out how far we want to break things (if at all). I think we want to try removing pointers entirely from the project if the gains are this large, but that's an even larger change.

@ckbaum do you have any thoughts?

Sorry late to this! Not the end of the world, but yeah we'd be a little grumbly if all the method signatures in the project changed with a version update. Partly because at least for my bank, performance of file-based operations is not a huge priority; ACH is naturally such a slow async process that I doubt we'd notice even a 20% speed improvement of any given operation.

But hey, faster is better 👍 we'd get over it. Especially if there are other breaking changes on the docket.

adamdecaf commented 5 months ago

Thanks for the reply @ckbaum and I agree it may not be very noticeable in a lot of systems. I don't think this PR is enough to force a breaking change. Perhaps we can remove pointers internally and increase the speed without breaking calling code.

midepeter commented 5 months ago

Yes I'll follow up with you.

Oh wow!

Thank you for this. Will be expecting from you😊

midepeter commented 5 months ago

Thanks for the reply @ckbaum and I agree it may not be very noticeable in a lot of systems. I don't think this PR is enough to force a breaking change. Perhaps we can remove pointers internally and increase the speed without breaking calling code.

So you mean we should remove the internal pointers and leave the pointers around the calling code for the backward compatibility.

adamdecaf commented 5 months ago

Thanks for the reply @ckbaum and I agree it may not be very noticeable in a lot of systems. I don't think this PR is enough to force a breaking change. Perhaps we can remove pointers internally and increase the speed without breaking calling code.

So you mean we should remove the internal pointers and leave the pointers around the calling code for the backward compatibility.

If we can keep the same public API this PR can be merged without forcing change on people who use it.

midepeter commented 5 months ago

Okay perfect.

Let me work on this.

midepeter commented 5 months ago

@adamdecaf

I have tried to implement it the way you said. I realized the code is tightly coupled across the structure and after i was able to get it. The perfomance changes were nothing to write home about. I think we should probably seek a different manner of improving this which i have started to work on and research about. I will give you heads up if i get any.

adamdecaf commented 5 months ago

Okay thanks. I'm going to close this PR for now since it's a breaking change and we're not ready to adopt that. I appreciae the help and we can look into using less pointers internally to reduce memory operations.