moov-io / ach

ACH implements a reader, writer, and validator for Automated Clearing House (ACH) files. The HTTP server is available in a Docker image and the Go package is available.
https://moov-io.github.io/ach/
Apache License 2.0
446 stars 149 forks source link

refactor: improve merge performance #1392

Closed adamdecaf closed 3 months ago

adamdecaf commented 3 months ago

This PR reimplements the "merge files" logic with a much faster implementation. We've also reinspected the Nacha rules and determined a File can contain the same trace number multiple times, which may break some users of the library.

goos: darwin
goarch: amd64
pkg: github.com/moov-io/ach
cpu: Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
                                   │  master.txt   │               new2.txt                │
                                   │    sec/op     │    sec/op     vs base                 │
MergeFiles/MergeFiles-16              84.91µ ± ∞ ¹   53.82µ ± ∞ ¹        ~ (p=1.000 n=1) ²
geomean                               171.2µ         81.09µ        -52.63%

                                   │  master.txt   │                new2.txt                │
                                   │     B/op      │     B/op       vs base                 │
MergeFiles/MergeFiles-16             2.086Ki ± ∞ ¹   2.002Ki ± ∞ ¹        ~ (p=1.000 n=1) ²
geomean                              4.308Ki         3.486Ki        -19.08%

                                   │ master.txt  │               new2.txt               │
                                   │  allocs/op  │  allocs/op   vs base                 │
MergeFiles/MergeFiles-16             14.00 ± ∞ ¹   24.00 ± ∞ ¹        ~ (p=1.000 n=1) ²
geomean                              29.63         42.48        +43.37%
Full benchmarks ``` goos: darwin goarch: amd64 pkg: github.com/moov-io/ach cpu: Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz │ master.txt │ new2.txt │ │ sec/op │ sec/op vs base │ MergeFiles/MergeFiles-16 84.91µ ± ∞ ¹ 53.82µ ± ∞ ¹ ~ (p=1.000 n=1) ² MergeFiles/MergeFiles_3Groups-16 204.29µ ± ∞ ¹ 96.87µ ± ∞ ¹ ~ (p=1.000 n=1) ² MergeFiles/MergeFiles_5Groups-16 199.67µ ± ∞ ¹ 97.26µ ± ∞ ¹ ~ (p=1.000 n=1) ² MergeFiles/MergeFiles_10Groups-16 210.27µ ± ∞ ¹ 83.68µ ± ∞ ¹ ~ (p=1.000 n=1) ² MergeFiles/MergeFiles_100Groups-16 201.79µ ± ∞ ¹ 82.62µ ± ∞ ¹ ~ (p=1.000 n=1) ² geomean 171.2µ 81.09µ -52.63% ¹ need >= 6 samples for confidence interval at level 0.95 ² need >= 4 samples to detect a difference at alpha level 0.05 │ master.txt │ new2.txt │ │ B/op │ B/op vs base │ MergeFiles/MergeFiles-16 2.086Ki ± ∞ ¹ 2.002Ki ± ∞ ¹ ~ (p=1.000 n=1) ² MergeFiles/MergeFiles_3Groups-16 5.065Ki ± ∞ ¹ 4.004Ki ± ∞ ¹ ~ (p=1.000 n=1) ² MergeFiles/MergeFiles_5Groups-16 5.057Ki ± ∞ ¹ 4.004Ki ± ∞ ¹ ~ (p=1.000 n=1) ² MergeFiles/MergeFiles_10Groups-16 5.338Ki ± ∞ ¹ 4.000Ki ± ∞ ¹ ~ (p=1.000 n=1) ² MergeFiles/MergeFiles_100Groups-16 5.205Ki ± ∞ ¹ 4.012Ki ± ∞ ¹ ~ (p=1.000 n=1) ² geomean 4.308Ki 3.486Ki -19.08% ¹ need >= 6 samples for confidence interval at level 0.95 ² need >= 4 samples to detect a difference at alpha level 0.05 │ master.txt │ new2.txt │ │ allocs/op │ allocs/op vs base │ MergeFiles/MergeFiles-16 14.00 ± ∞ ¹ 24.00 ± ∞ ¹ ~ (p=1.000 n=1) ² MergeFiles/MergeFiles_3Groups-16 35.00 ± ∞ ¹ 49.00 ± ∞ ¹ ~ (p=1.000 n=1) ² MergeFiles/MergeFiles_5Groups-16 35.00 ± ∞ ¹ 49.00 ± ∞ ¹ ~ (p=1.000 n=1) ² MergeFiles/MergeFiles_10Groups-16 37.00 ± ∞ ¹ 49.00 ± ∞ ¹ ~ (p=1.000 n=1) ² MergeFiles/MergeFiles_100Groups-16 36.00 ± ∞ ¹ 49.00 ± ∞ ¹ ~ (p=1.000 n=1) ² geomean 29.63 42.48 +43.37% ¹ need >= 6 samples for confidence interval at level 0.95 ² need >= 4 samples to detect a difference at alpha level 0.05 ```
adamdecaf commented 3 months ago

@ckbaum do you have a way to test this PR a bit? I'd like to make sure there are no bugs and will test internally with some of our larger uploads.

codecov-commenter commented 3 months ago

Codecov Report

Merging #1392 (0b0948f) into master (8cb8cbb) will increase coverage by 0.38%. Report is 1 commits behind head on master. The diff coverage is 84.90%.

:exclamation: Current head 0b0948f differs from pull request most recent head cb91ccf. Consider uploading reports for the commit cb91ccf to get more accurate results

:exclamation: Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #1392 +/- ## ========================================== + Coverage 84.25% 84.63% +0.38% ========================================== Files 82 82 Lines 8218 8174 -44 ========================================== - Hits 6924 6918 -6 + Misses 921 891 -30 + Partials 373 365 -8 ```