romstad / Chess.jl

Julia chess programming library.
https://romstad.github.io/Chess.jl/dev/
Other
103 stars 17 forks source link

Reduce allocations in `pgn.jl` by preallocating `IOBuffer` in `PGNReader` #31

Closed mcognetta closed 2 years ago

mcognetta commented 2 years ago

This reduces allocations in pgn.jl by preallocating an IOBuffer for PGNReader so that one isn't created each time we read a symbol.

For a test, I used:

julia> pgn = "1. e4 c6 2. Nf3 d5 3. exd5 cxd5 4. d4 Nc6 5. Nc3 e6 6. Be2 Bd6 7. O-O Nf6 8. Bg5 O-O 9. Qd2 Be7 10. Rfe1 Qc7 11. Bf4 Qd7 12. Bb5 a6 13. Ba4 b5 14. Bb3 Qa7 15. Nh4 Qd7 16. Bh6 gxh6 17. Qxh6 Ne4 18. Nxe4 dxe4 19. Rxe4 f5 20. Rxe6 Kh8 21. Nxf5 Rxf5 22. Rae1 Bf8 23. Qh3 Rg5 24. Re1e4 Qg7 25. g3 Bxe6 26. Bxe6 Re8 27. d5 Ne5 28. f4 Nf3+ 29. Kh1 Rg6 30. f5 Rg5 31. Kg2 Nd4 32. c3 Nxe6 33. dxe6 Bd6 34. g4 Qf6 35. Qd3 Bb8 36. Qd7 Qh6 37. e7 Qxh2+ 38. Kf3 Qg3+ 39. Ke2 Qg2+ 40. Kd1 Qxe4 41. Qxe8+ Rg8 42. Qf7 Qxg4+ 43. Kc2 Qg2+ 44. Kb3 a5 45. e8=Q a4+ 46. Ka3 Bd6+ 47. b4 axb3+ 48. Kxb3 Rxe8 49. Qxe8+ Kg7 50. Qd7+ Kh6 51. Qxd6+ Kg5 52. f6 Qf3 53. Qe5+ Kg6 54. Qxb5 Qxf6 55. a4 Qe6+ 56. Ka3 Qd6+ 57. Qb4 Qd3 58. a5 h5 59. Qb6+ Kg5 60. Qc5+ Kg4 61. Kb4 Qb1+ 62. Kc4 Qa2+ 63. Kd3 Qb1+ 64. Kc4 Qa2+ 65. Kd4 Qd2+ 66. Kc4 Qa2+ 67. Kb5 Qb3+ 68. Qb4+ Qxb4+ 69. Kxb4 h4 70. a6 h3 71. a7 h2 72. a8=Q h1=B 73. Qxh1 Kf4 74. c4 Kg3 75. Qe1+ Kf3 76. Qd2 Ke4 77. Qc3 Kf4 78. Qd3 Ke5 79. c5 Ke6 80. c6 Ke5 81. c7 Ke6 82. Qd4 Ke7 83. c8=Q Kf7 84. Qd6 Kg7 85. Qcc7+ Kg8 86. Qdd8# 1-0";

julia> s = Chess.PGN.gamefromstring(pgn);

The top allocations are as follows, with the item marked > being reduced by 80% in this PR:

 CoverageTools.MallocInfo(12320, "./san.jl.7492.mem", 54)
 CoverageTools.MallocInfo(16512, "./pgn.jl.7492.mem", 161)
 CoverageTools.MallocInfo(17968, "./game.jl.7492.mem", 792)
 CoverageTools.MallocInfo(19776, "./san.jl.7492.mem", 11)
 > CoverageTools.MallocInfo(49536, "./pgn.jl.7492.mem", 147)
 CoverageTools.MallocInfo(314640, "./san.jl.7492.mem", 34)

Note that most of the other heavy allocations here are addressed by #27 and #30.


Benchmarks show a modest speed up (-16% median time) and allocation reduction (-9.2% memory, -16.6% # allocations).

Main:

julia> @benchmark Chess.PGN.gamefromstring(pgn)
BenchmarkTools.Trial: 8865 samples with 1 evaluation.
 Range (min … max):  522.171 μs …   3.842 ms  ┊ GC (min … max): 0.00% … 85.66%
 Time  (median):     530.203 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   560.344 μs ± 255.895 μs  ┊ GC (mean ± σ):  4.48% ±  8.11%

  █▆▃▁                                                          ▁
  █████▇▆▅▅▅▆▆█▇▅▅▅▅▅▃▅▅▃▃▁▄▄▃▄▄▁▄▃▄▃▁▃▁▁▃▁▄▁▁▁▁▃▁▁▁▁▁▃▁▁▁▃▁▁▁▄ █
  522 μs        Histogram: log(frequency) by time       1.19 ms <

 Memory estimate: 441.47 KiB, allocs estimate: 3036.

This PR:

julia> @benchmark Chess.PGN.gamefromstring(pgn)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  436.290 μs …   3.350 ms  ┊ GC (min … max): 0.00% … 85.93%
 Time  (median):     445.344 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   473.730 μs ± 250.313 μs  ┊ GC (mean ± σ):  4.96% ±  7.98%

   █▂                                                            
  ▄██▄▆█▄▃▃▂▂▂▂▂▂▂▂▁▁▂▂▂▁▁▂▂▂▁▂▁▂▂▂▂▂▂▂▂▂▁▁▂▂▁▂▂▁▁▂▂▂▂▁▂▂▁▁▂▂▁▂ ▂
  436 μs           Histogram: frequency by time          658 μs <

 Memory estimate: 401.62 KiB, allocs estimate: 2532.

Personal Note: I apologize for the influx of PRs. I have been working on a project using this library, and wanted to upstream the optimizations I found during it.

romstad commented 2 years ago

No need to apologize for the influx of PRs, they are very welcome! These are all great improvements.

It is I who should apologize for reacting so slowly. I'm too busy these days.