Closed by mxsasha 2 years ago
I think "faster expiry" is the idea to put forward to the working group in the initial draft, simply because it is simpler. If WG participants have appetite for more complexity / efficiency, we can try the 'delta aggregation' route.
Hi Sasha,
Your numbers for the amount of changes per hour in the RIPE DB are roughly in line with the numbers I extracted in past research of mine. Based on the analysis you provided above:
I vote for faster expiry as well, but I am afraid that we will end up implementing both strategies. But let's see what the community will advise us.
I agree. I'll write a PR.
In the wonderful analysis above you write:
"The Snapshot File will be the most often requested - mirror clients request it every time they want to check for new content, so it should remain small."
I guess you wanted to say "The Update Notification File..." since according to the design this is the one that mirror clients will consult every time they want to check for new changes.
Ah yes, indeed. Snapshots should ideally be our least requested file ;)
Actually, since we're fully in agreement I went ahead and committed it to main in 398b156 6dc9240.
Discussed and we are staying with deleting deltas after 24 hrs, so this is finished.
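For illustration, the 24-hour expiry could be sketched roughly as below. This is only a sketch under assumptions: `DeltaFile`, `prune_deltas` and `needs_snapshot` are hypothetical names that do not come from the draft, and the only rule modelled is "drop Delta Files older than 24 hours".

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Expiry window agreed in the discussion: Delta Files are deleted after 24 hours.
MAX_DELTA_AGE = timedelta(hours=24)


@dataclass(frozen=True)
class DeltaFile:
    """Hypothetical record for one published Delta File."""
    version: int
    published: datetime


def prune_deltas(deltas: list[DeltaFile], now: datetime) -> list[DeltaFile]:
    """Keep only the Delta Files published within the last 24 hours."""
    cutoff = now - MAX_DELTA_AGE
    return [d for d in deltas if d.published >= cutoff]


def needs_snapshot(client_version: int, current_version: int,
                   deltas: list[DeltaFile]) -> bool:
    """True if the client cannot catch up from the remaining Delta Files alone."""
    available = {d.version for d in deltas}
    needed = range(client_version + 1, current_version + 1)
    return any(v not in available for v in needed)


if __name__ == "__main__":
    now = datetime(2022, 1, 2, tzinfo=timezone.utc)
    deltas = [
        DeltaFile(1, now - timedelta(hours=30)),  # older than 24 hours, expires
        DeltaFile(2, now - timedelta(hours=10)),
        DeltaFile(3, now - timedelta(hours=1)),
    ]
    kept = prune_deltas(deltas, now)
    print([d.version for d in kept])
    print(needs_snapshot(0, 3, kept))
```

A client that is more than 24 hours behind then finds a gap in the Delta Files and falls back to the Snapshot File, which bounds how many Delta Files any client ever has to download.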
Current design
To make sure we're all on the same page, the draft at time of writing basically says:
(This is also how RRDP works, which is where I took it from.)
Application to IRR
Some assumptions on behaviour:
Some numbers on the last point, volume of updates on November 6-12:
Due to NRTM I can't pin down exactly how many Delta Files would have been produced, but certainly not more than the numbers above.
For RIPE, the volume would be many small files: if exactly equally spread out, one Delta File every minute with 3 IRR object changes each. They're probably more clustered, so not quite as bad. For other IRRs, less of an issue.
RIPE has 6395804 objects in my copy. So if we assume RIPE always changes 5000 objects per day, and all objects are the same size, Delta File expiration will not start until 1279 days of Deltas have been gathered, because only at that point will the Delta Files together be bigger than the Snapshot File.
This means that a client running three years behind on RIPE will still find all Delta Files there, and will use them to catch up rather than reinitialise from the Snapshot File. In that process, the client will download 250.000 Delta Files, optimistically assuming one Delta File per 5 minutes due to clustering. These are pretty rough numbers, but the order of magnitude works.
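The back-of-the-envelope arithmetic above can be reproduced with a small sketch. The inputs are the figures quoted in this thread. Uniform object size and a constant daily change volume are the same simplifying assumptions made above, and the function names are made up for illustration.

```python
# Figures quoted in the discussion; uniform object size is assumed throughout.
TOTAL_OBJECTS = 6_395_804   # RIPE objects in the author's copy
CHANGES_PER_DAY = 5_000     # assumed constant daily change volume
CHANGES_PER_DELTA = 3       # evenly spread case: 3 object changes per Delta File
MINUTES_PER_DAY = 24 * 60


def break_even_days(total_objects: int, changes_per_day: int) -> int:
    """Whole days of Deltas gathered before they match the Snapshot in size."""
    return total_objects // changes_per_day


def delta_files_behind(days_behind: int, minutes_per_delta: int) -> int:
    """Delta Files a client has to fetch when it is `days_behind` days behind."""
    return days_behind * MINUTES_PER_DAY // minutes_per_delta


if __name__ == "__main__":
    # Days until expiry would start under the size-based rule.
    print(break_even_days(TOTAL_OBJECTS, CHANGES_PER_DAY))
    # Minutes between Delta Files if changes were spread perfectly evenly.
    print(MINUTES_PER_DAY / (CHANGES_PER_DAY / CHANGES_PER_DELTA))
    # Delta Files for a client three years behind, one Delta File per 5 minutes.
    print(delta_files_behind(3 * 365, 5))
```

With one Delta File per 5 minutes, three years works out to a few hundred thousand files, the same order of magnitude as the rough figure above.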
(Notable large ones missing in my list are RADB and APNIC, I think my mirror might have broken.)
Impact and possible solutions
Even if the total file size is reasonable, having a client download 250.000 Delta Files is rather impractical, so this requires a solution in the standard. We're still trying to meet a number of needs:
The Update Notification File will be the most often requested - mirror clients request it every time they want to check for new content, so it should remain small.
I had two possible thoughts for now:
@stkonst also mentioned some alternate ideas about Delta Files, but I thought I'd lay out the current goals and issues properly here :)