protocol / prodeng

Issues, discussions and documentation from the production engineering team

Epic: Increase data locality in Gateway #6

Open iand opened 2 years ago

iand commented 2 years ago

What Is It?

Improve gateway TTFB by ensuring that more requests are served using a local blockstore instead of waiting on network block discovery.

Notes

This epic covers TTFB improvements that can be achieved by serving more requests from the local blockstore instead of waiting on network block discovery. Popular and expensive-to-fetch data should be held locally for longer by gateways, while still respecting limits on disk usage.

Description

When operating as an HTTP gateway, go-ipfs should adopt a garbage collection strategy that retains blocks whose presence improves gateway request times.

The hypothesis is that blocks are being deleted from the blockstore by garbage collection only to be re-requested later, which is much slower. Garbage collection is required to keep within disk space limits, but the current implementation is unaware of the usage patterns of unpinned blocks.

A crude experiment demonstrates the effect. The same URL (/ipfs/QmVoe22tWS8JqfrMHj5WTHJ6zWrWqTL3YeG5uFQmAH6zhs/182.png) was requested every 30 seconds, with a manual GC run every few minutes (using ipfs repo gc).
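
For anyone wanting to reproduce this, here is a minimal Go sketch of the request loop using net/http/httptrace; the local gateway address 127.0.0.1:8080 is an assumption, and the sketch only approximates the timings below.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptrace"
	"time"
)

func main() {
	// Assumed local gateway address; adjust to match the node under test.
	url := "http://127.0.0.1:8080/ipfs/QmVoe22tWS8JqfrMHj5WTHJ6zWrWqTL3YeG5uFQmAH6zhs/182.png"

	// Disable keep-alives so every request opens a fresh connection,
	// making the connection time meaningful on each iteration.
	client := &http.Client{Transport: &http.Transport{DisableKeepAlives: true}}

	start := time.Now()
	for {
		var connStart, connDone, firstByte time.Time
		trace := &httptrace.ClientTrace{
			ConnectStart:         func(_, _ string) { connStart = time.Now() },
			ConnectDone:          func(_, _ string, _ error) { connDone = time.Now() },
			GotFirstResponseByte: func() { firstByte = time.Now() },
		}

		reqStart := time.Now()
		req, err := http.NewRequest("GET", url, nil)
		if err != nil {
			panic(err)
		}
		req = req.WithContext(httptrace.WithClientTrace(req.Context(), trace))

		resp, err := client.Do(req)
		if err != nil {
			fmt.Println("request failed:", err)
		} else {
			io.Copy(io.Discard, resp.Body) // drain so total time covers the full transfer
			resp.Body.Close()
			fmt.Printf("%d,conn:%f,ttfb:%f,total:%f\n",
				int(reqStart.Sub(start).Seconds()),
				connDone.Sub(connStart).Seconds(),
				firstByte.Sub(reqStart).Seconds(),
				time.Since(reqStart).Seconds())
		}
		time.Sleep(30 * time.Second)
	}
}
```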

Timings in seconds (elapsed time since start, connection time, time to first byte, total time):
0,conn:0.092357,ttfb:9.629604,total:10.343198   
31,conn:0.092979,ttfb:0.190162,total:0.466788
62,conn:0.103032,ttfb:0.210427,total:0.517554
92,conn:0.103929,ttfb:0.212191,total:0.525963
123,conn:0.102814,ttfb:0.210029,total:0.519017
154,conn:0.093053,ttfb:2.338450,total:2.620145  <-- after GC
186,conn:0.093665,ttfb:0.191273,total:0.473441
217,conn:0.093083,ttfb:0.191259,total:0.468096
247,conn:0.103448,ttfb:0.211765,total:0.520322
278,conn:0.093548,ttfb:0.191549,total:0.469947
308,conn:0.103633,ttfb:0.212855,total:0.522967
339,conn:0.092336,ttfb:0.190009,total:0.466366
369,conn:0.093359,ttfb:0.191401,total:0.469183
400,conn:0.093163,ttfb:0.191280,total:0.473141
430,conn:0.093641,ttfb:0.200777,total:0.478957
461,conn:0.103439,ttfb:0.212652,total:0.520436
491,conn:0.102797,ttfb:0.210385,total:0.518253
522,conn:0.103191,ttfb:0.212262,total:0.522594
552,conn:0.092788,ttfb:0.190656,total:0.468532
583,conn:0.092756,ttfb:1.081157,total:1.358370   <-- after GC 
614,conn:0.104042,ttfb:0.217775,total:0.527639
645,conn:0.092954,ttfb:0.190958,total:0.472610
675,conn:0.094487,ttfb:0.193336,total:0.474542
706,conn:0.093254,ttfb:63.240915,total:63.519715  <-- after GC
799,conn:0.103447,ttfb:0.211192,total:0.519018
830,conn:0.102939,ttfb:2.021556,total:2.327899   <-- after GC 
862,conn:0.103380,ttfb:0.211327,total:0.520165
893,conn:0.103518,ttfb:0.211405,total:0.520424
923,conn:0.092240,ttfb:0.189054,total:0.467218
954,conn:0.093287,ttfb:0.190607,total:0.467807
984,conn:0.103094,ttfb:2.554306,total:2.862080    <-- after GC
1017,conn:0.103841,ttfb:0.212136,total:0.522242
1048,conn:0.092973,ttfb:0.190636,total:0.470584

The requests at 154, 583, 830 and 984 seconds, each made just after a garbage collection, show roughly a tenfold increase in time to first byte. The timing at 706 is far worse (over a minute), suggesting that the peer holding the content had probably dropped from the node's peer list.

The default garbage collection strategy is mark and sweep: any block that is not reachable from a pinned root or one of the bestEffortRoots is deleted. bestEffortRoots, by default, only includes the MFS root.
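
For context, a simplified sketch of the mark-and-sweep approach; the Blockstore and DAGWalker types are illustrative stand-ins rather than the real go-ipfs interfaces, and only the shape of the algorithm is intended to match.

```go
package gcsketch

import cid "github.com/ipfs/go-cid"

// Blockstore and DAGWalker are minimal stand-ins for the real go-ipfs
// interfaces; only the structure of the algorithm is the point here.
type Blockstore interface {
	AllKeys() ([]cid.Cid, error)
	DeleteBlock(cid.Cid) error
}

type DAGWalker func(root cid.Cid, visit func(cid.Cid)) error

// markAndSweep deletes every block that is not reachable from a pinned
// root or a best-effort root (by default only the MFS root). Reachability
// is the only criterion: access frequency and fetch cost play no part.
func markAndSweep(pinned, bestEffort []cid.Cid, bs Blockstore, walk DAGWalker) error {
	marked := map[cid.Cid]struct{}{}

	// Mark phase: record every block reachable from a root.
	for _, root := range append(pinned, bestEffort...) {
		if err := walk(root, func(c cid.Cid) { marked[c] = struct{}{} }); err != nil {
			return err
		}
	}

	// Sweep phase: delete everything that was not marked, regardless of
	// how recently or how often it was requested.
	keys, err := bs.AllKeys()
	if err != nil {
		return err
	}
	for _, c := range keys {
		if _, ok := marked[c]; !ok {
			if err := bs.DeleteBlock(c); err != nil {
				return err
			}
		}
	}
	return nil
}
```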

A cache-aware garbage collection algorithm would assign a cost to blocks derived from the frequency of access and the time taken to fetch them from peers. The algorithm would then delete the blocks with the least cost until sufficient space has been freed.
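
A rough sketch of that selection step, assuming per-block access counts and observed fetch times are recorded elsewhere; the BlockStats record and the cost formula are illustrative, not a settled design.

```go
package gcsketch

import (
	"sort"
	"time"

	cid "github.com/ipfs/go-cid"
)

// BlockStats is a hypothetical record kept per unpinned block.
type BlockStats struct {
	Cid       cid.Cid
	Size      uint64        // bytes on disk
	Accesses  uint64        // number of gateway requests that touched the block
	FetchTime time.Duration // observed cost of refetching it from the network
}

// cost estimates how much we lose by deleting the block: frequently
// accessed blocks that were slow to fetch are the most expensive to evict.
func cost(s BlockStats) float64 {
	return float64(s.Accesses) * s.FetchTime.Seconds()
}

// selectForEviction returns the cheapest blocks whose combined size covers
// the space that needs to be freed.
func selectForEviction(stats []BlockStats, bytesNeeded uint64) []cid.Cid {
	sort.Slice(stats, func(i, j int) bool { return cost(stats[i]) < cost(stats[j]) })
	var freed uint64
	var victims []cid.Cid
	for _, s := range stats {
		if freed >= bytesNeeded {
			break
		}
		victims = append(victims, s.Cid)
		freed += s.Size
	}
	return victims
}
```

The returned CIDs would then be deleted in place of the unconditional sweep above, with pinned and best-effort-rooted blocks excluded as before.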

For space efficiency we may want to record metrics only for the roots of DAGs, i.e. directories and files. In this case the garbage collector would treat all descendant blocks of a root as a unit: either all are deleted together or none are. The final scoring formula will probably want to factor in the size of the DAG.
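
Continuing the illustrative sketch above, one possible way (purely hypothetical, not a proposed final formula) to fold DAG size into a per-root score is to rank DAGs by retained value per byte:

```go
package gcsketch

import (
	"time"

	cid "github.com/ipfs/go-cid"
)

// DagStats is a hypothetical per-root record; the blocks beneath Root are
// treated as a unit and retained or evicted together.
type DagStats struct {
	Root      cid.Cid
	TotalSize uint64        // root block plus all descendants, in bytes
	Accesses  uint64        // gateway requests that touched any block in the DAG
	FetchTime time.Duration // observed cost of refetching the whole DAG
}

// dagCost scores a whole DAG per byte of space it occupies, so a large,
// rarely used DAG is evicted before a small, hot one that frees little space.
func dagCost(d DagStats) float64 {
	return float64(d.Accesses) * d.FetchTime.Seconds() / float64(d.TotalSize)
}
```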

Outline of proposed approach:

Tasks