art-w closed this 1 year ago
Merging #2232 (85d706e) into main (63e6e74) will increase coverage by 0.04%. The diff coverage is 76.62%.

:exclamation: Current head 85d706e differs from pull request most recent head ccaa77e. Consider uploading reports for the commit ccaa77e to get more accurate results.
@@ Coverage Diff @@
## main #2232 +/- ##
==========================================
+ Coverage 67.97% 68.02% +0.04%
==========================================
Files 136 136
Lines 16564 16587 +23
==========================================
+ Hits 11260 11283 +23
Misses 5304 5304
| Impacted Files | Coverage Δ | |
|---|---|---|
| src/irmin-pack/unix/gc_worker.ml | 3.55% <0.00%> (-0.05%) | :arrow_down: |
| src/irmin-pack/unix/store.ml | 65.43% <ø> (ø) | |
| src/irmin-pack/unix/control_file.ml | 85.61% <40.00%> (-1.71%) | :arrow_down: |
| src/irmin-pack/unix/sparse_file.ml | 84.49% <93.54%> (+4.16%) | :arrow_up: |
| src/irmin-pack/unix/control_file_intf.ml | 82.14% <100.00%> (ø) | |
| src/irmin-pack/unix/file_manager.ml | 84.98% <100.00%> (+0.29%) | :arrow_up: |
| src/irmin-pack/unix/gc.ml | 73.40% <100.00%> (ø) | |
| src/irmin-pack/unix/lower.ml | 61.07% <100.00%> (+0.23%) | :arrow_up: |

... and 1 file with indirect coverage changes
I still need to take a close look at the code, but as discussed earlier, I ran some preliminary benchmarks locally based on the benchmark I did last week. Early results are very promising! :heart_eyes_cat: Below is a graph of average medians for two runs, with an LRU size of 0 (to focus on mapping-file performance), comparing no mmap (this PR) vs. mmap (essentially main). The title says "commit load times", but this is really "sample commits and load their entire trees"; it doesn't take this long to just "load a commit" :smile:.
Based on these results, I definitely think it is worth doing a full replay trace benchmark (and potentially looking at the other optimization ideas you have!).
Results of a benchmark run using a Tezos Context replay: https://github.com/tarides/irmin-tezos-benchmarking/blob/main/benchmarks/2023-04-no-mmap/stats.txt
Thanks a lot for your benchmarks and feedback! <3
@adatario > If you have some time, I think there's a small glitch with the memory usage graph, as it's a bit flat during GC. I'm guessing it's because the GC is running in a child process and doesn't get measured by smem (?)
Here's a fixed graph:
@adatario @art-w what do you think explains the higher memory during GC (but not otherwise)?
I originally thought it might be because we are observing the memory that used to be mmap'd (and thus maybe hidden from smem?), but then I would expect the entire graph to be shifted up. :thinking:
@art-w here is an updated bench with the "no bigarray" version. For now, let's stick with bigarray (unless you want to try a 1-D array as well). :sunglasses:
Thanks a lot @adatario !!
what do you think explains the higher memory during GC (but not otherwise)?
Yeah, sorry, I should have mentioned that I expected the memory usage to grow slightly during GC following PR #2215 (the current PR is unrelated). Previously an array was used to represent the live ranges of the mapping file during GC, and this old PR switched to a list (same number of elements, but about 2/3 larger in memory used). I'll push a tighter representation of the live ranges soon :)
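For illustration only (this is a hypothetical sketch, not the actual irmin-pack code), a tighter representation could pack the `(offset, length)` pairs into a flat int array: a list of boxed pairs costs several words per range (list cell plus tuple header), while a flat array costs two words per range plus a constant header.

```ocaml
(* Hypothetical sketch: pack live ranges [| off0; len0; off1; len1; ... |]
   into one flat int array instead of a list of (off, len) tuples. *)

type ranges = int array

let of_list (l : (int * int) list) : ranges =
  let n = List.length l in
  let a = Array.make (2 * n) 0 in
  List.iteri
    (fun i (off, len) ->
      a.(2 * i) <- off;
      a.(2 * i + 1) <- len)
    l;
  a

(* Iterate over the packed ranges without allocating. *)
let iter (f : off:int -> len:int -> unit) (r : ranges) =
  let n = Array.length r / 2 in
  for i = 0 to n - 1 do
    f ~off:r.(2 * i) ~len:r.(2 * i + 1)
  done
```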
Nice! I've started another benchmark run with the memory optimizing commit you pushed.
No need, I managed to run it! (but thanks!)
Hmm, it's not yet as good as before:
Since the mapping file is read-only, we can simulate the mmap in userland by lazily reading chunks on demand (and avoid the file descriptor leak from mmaped bigarrays..) There are a bunch of optimizations that we could try here, but I'm curious how the benchmarks are impacted by this change before going too deep into this rabbit hole :)
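To sketch the idea (the names, chunk size, and caching policy below are made up for illustration; this is not the actual sparse_file.ml implementation), a userland "mmap" for a read-only file can be a cache of fixed-size chunks, each read with plain `Unix.read` the first time it is touched:

```ocaml
(* Hypothetical sketch: emulate a read-only mmap in userland by
   reading fixed-size chunks on demand and caching them. *)

let chunk_size = 4096

type t = {
  fd : Unix.file_descr;
  cache : (int, bytes) Hashtbl.t; (* chunk index -> chunk contents *)
}

let create path =
  { fd = Unix.openfile path [ Unix.O_RDONLY ] 0; cache = Hashtbl.create 16 }

(* Fault in chunk [i] on first access; later accesses hit the cache. *)
let load_chunk t i =
  match Hashtbl.find_opt t.cache i with
  | Some b -> b
  | None ->
      let b = Bytes.create chunk_size in
      let (_ : int) = Unix.lseek t.fd (i * chunk_size) Unix.SEEK_SET in
      let rec fill pos =
        if pos < chunk_size then
          let n = Unix.read t.fd b pos (chunk_size - pos) in
          if n > 0 then fill (pos + n)
      in
      fill 0;
      Hashtbl.add t.cache i b;
      b

(* Read one byte at an absolute offset, as [mmap] would. *)
let get t off =
  let b = load_chunk t (off / chunk_size) in
  Bytes.get b (off mod chunk_size)
```

Unlike mmap'd bigarrays, the file descriptor here can be closed deterministically instead of waiting for the bigarray to be collected.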
(Also added the expected `mapping_size` in the control file as a consistency check, to detect a partially written mapping after a GC or a snapshot.)
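In spirit, that check amounts to something like the following (field and error names are illustrative, not the actual control_file.ml API): compare the size recorded in the control file against the mapping file's size on disk, and report corruption on mismatch.

```ocaml
(* Hypothetical sketch of the consistency check: a mapping file whose
   on-disk size differs from the size recorded in the control file was
   only partially written (e.g. a crash mid-GC or mid-snapshot). *)

let check_mapping ~expected_size ~mapping_path =
  let actual = (Unix.stat mapping_path).Unix.st_size in
  if actual = expected_size then Ok ()
  else Error (`Corrupted_mapping_file (expected_size, actual))
```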