clecat commented 1 year ago

This PR changes the writing of control files: They were currently updated while making the assumption that the file itself being smaller than a page, the writing would be atomic. However it was deemed dubious that we could use such an assumption. This is why this PR:

Adds the usage of temporary control file when writing a new payload (path of the cf + .tmp).
Moves the temporary file in stead of the old control file.
Then updates the Io file stored in the Control_file.t with the new
On the RO instances side, it simply re-opens a new Io file, as the old file descriptor might point to the inode of a file replaced by the RW instance.
Updates the errors of the related functions.

This leaves however one question in the case of RO instances: Because we open a new Io each time we want to reload the payload, storing said Io has currently purpose. Maybe we could do something about that (separating RO & RW control files ? Using an option ? Remove the reload function ?)

codecov-commenter commented 1 year ago

Codecov Report

Merging #2206 (00b7f61) into main (eddf517) will decrease coverage by 0.01%. The diff coverage is 85.71%.

:exclamation: Current head 00b7f61 differs from pull request most recent head 9a33ae0. Consider uploading reports for the commit 9a33ae0 to get more accurate results

:mega: This organization is not using Codecov’s GitHub App Integration. We recommend you install it so Codecov can continue to function properly for your repositories. Learn more

@@            Coverage Diff             @@
##             main    #2206      +/-   ##
==========================================
- Coverage   68.04%   68.04%   -0.01%     
==========================================
  Files         135      135              
  Lines       16378    16390      +12     
==========================================
+ Hits        11144    11152       +8     
- Misses       5234     5238       +4

Impacted Files	Coverage Δ
src/irmin-pack/unix/control_file_intf.ml	`82.14% <ø> (ø)`
src/irmin-pack/unix/file_manager.ml	`84.43% <50.00%> (ø)`
src/irmin-pack/layout.ml	`81.35% <60.00%> (-2.29%)`	:arrow_down:
src/irmin-pack/unix/pack_store.ml	`79.12% <71.42%> (+0.10%)`	:arrow_up:
src/irmin-pack/unix/dispatcher.ml	`57.69% <75.00%> (-2.31%)`	:arrow_down:
src/irmin-pack/unix/utils.ml	`89.65% <92.85%> (-3.21%)`	:arrow_down:
src/irmin-pack/unix/control_file.ml	`88.46% <100.00%> (+0.65%)`	:arrow_up:
src/irmin-pack/unix/lower.ml	`70.00% <100.00%> (+0.30%)`	:arrow_up:
src/irmin-pack/unix/sparse_file.ml	`76.85% <100.00%> (-1.86%)`	:arrow_down:
src/irmin-pack/unix/traverse_pack_file.ml	`68.82% <100.00%> (ø)`
... and 4 more

:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more

art-w commented 1 year ago

Thanks for looking into this! current-bench doesn't report anything alarming but it would be nice to also run the replay trace, I'm curious if the posix way has any impact on performances in the long run :)

clecat commented 1 year ago

Finally ran the replay trace thx to the help of @art-w & @adatario, here are the results:


</details>

 |                          | baseline  |       pr
 | --                       | --        | --
 | -- main metrics --       |           | 
 | CPU time elapsed         |   103m42s |   103m54s 100%
 | Wall time elapsed        |   103m43s |   103m55s 100%
 | TZ-transactions per sec  |   739.362 |   737.883 100%
 | TZ-operations per sec    |  4829.344 |  4819.686 100%
 | Context.add per sec      | 14179.475 | 14151.116 100%
 | tail latency (1)         |   1.416 s |   1.227 s  87%
 |                          |           | 
 | -- resource usage --     |           | 
 | disk IO (total)          |           | 
 |   IOPS (op/sec)          |   181_302 |   180_946 100%
 |   throughput (bytes/sec) |  18.099 M |  18.063 M 100%
 |   total (bytes)          | 112.608 G | 112.610 G 100%
 | disk IO (read)           |           | 
 |   IOPS (op/sec)          |   181_204 |   180_849 100%
 |   throughput (bytes/sec) |  10.648 M |  10.627 M 100%
 |   total (bytes)          |  66.249 G |  66.251 G 100%
 | disk IO (write)          |           | 
 |   IOPS (op/sec)          |        97 |        97 100%
 |   throughput (bytes/sec) |   7.451 M |   7.436 M 100%
 |   total (bytes)          |  46.359 G |  46.359 G 100%
 |                          |           | 
 | max memory usage (bytes) |   0.374 G |   0.391 G 104%
 | mean CPU usage           |      100% |      100%     ```

adatario commented 1 year ago

I think there was a bug in the replay that caused a wrong version of Irmin to be used. This was a mistake on my side.

The stats above should not be considered accurate. I'm re-running the benchmarks.

adatario commented 1 year ago

Here's the stats.txt for another benchmark run where the correct version (this PR commit) was used:

	baseline	pr
hostname	posada
os type	Unix
big endian	false
runtime params (no gc)	b=1,H=0,p=0,t=0,v=0,w=1,W=0
backend type	native
ocaml version	4.14.0
word size	64 bits
Start Time	2023/03/06 10:49:35 (GMT)	2023/03/15 09:30:27 (GMT)
Store Type	pack
irmin-pack GC count	0
gc.minor_heap_size	262144
gc.major_heap_increment	15
gc.space_overhead	120
gc.verbose	0
gc.max_overhead	500
gc.stack_limit	0
gc.allocation_policy	2
gc.window_size	1
gc.custom_major_ratio	44
gc.custom_minor_ratio	100
gc.custom_minor_max_size	8192

	baseline	pr
-- main metrics --
CPU time elapsed	106m05s	108m17s 102%
Wall time elapsed	106m05s	108m21s 102%
TZ-transactions per sec	722.768	708.047 98%
TZ-operations per sec	4720.957	4624.802 98%
Context.add per sec	13861.239	13578.917 98%
tail latency (1)	1.368 s	1.153 s 84%

-- resource usage --
disk IO (total)
IOPS (op/sec)	177_241	173_625 98%
throughput (bytes/sec)	17.693 M	17.332 M 98%
total (bytes)	112.611 G	112.609 G 100%
disk IO (read)
IOPS (op/sec)	177_146	173_531 98%
throughput (bytes/sec)	10.409 M	10.197 M 98%
total (bytes)	66.252 G	66.250 G 100%
disk IO (write)
IOPS (op/sec)	95	93 98%
throughput (bytes/sec)	7.284 M	7.135 M 98%
total (bytes)	46.359 G	46.359 G 100%

max memory usage (bytes)	0.386 G	0.388 G 100%
mean CPU usage	100%	100%

clecat commented 1 year ago

Nice, thx a lot, it should be good to merge now

metanivek commented 1 year ago

@adatario thanks for running the benchmark! Do you have the full stats.txt? I'd like to see some of the more detailed stats around commit times.

adatario commented 1 year ago

@adatario thanks for running the benchmark! Do you have the full stats.txt? I'd like to see some of the more detailed stats around commit times.

Yes, they are in the irmin-tezos-benchmarking repo: https://github.com/tarides/irmin-tezos-benchmarking/tree/main/benchmarks/2023-03-temp-control-file (currently not public)

metanivek commented 1 year ago

Thanks @adatario!

These are the lines I wanted to look at:

 |                                                   |          |        cumu         | share |      min       |       max       |      avg

...

 | commit duration (s)                               | baseline |       36min25s      |   34% |  8.123 ms      |   1.368  s      | 21.855 ms     
 |                                                   | pr       |       36min57s 101% |   34% |  8.721 ms 107% |   1.153  s  84% | 22.176 ms 101%

There is a slight overhead on average for commit (we write the control file on every commit), but it is very small. I think this change is good to go! :tada:

mirage / irmin

irmin-pack: Use temp control files upon writing #2206

Codecov Report