Open mDuo13 opened 3 years ago
While I really like the idea, I'm a bit worried about tools and other applications working with trust lines. There will be a "hard cut" at some point in time, when the amendment gets enabled (if validators agree to enable it). But no one can really predict when this will be, except during the last two weeks. (And even then it could still be vetoed.)
So, it might make sense, as you described above, to have the ability to receive RippleState objects still in the old format.
But the problem is that there are (at least 3?) commands where you can receive a RippleState object (ledger_entry, ledger_data, account_lines). And I agree that this would be a bit messy. At some point we would need to stop providing the old format, or people would never switch their tools over to the new format.
But once the "fake old RippleState" is implemented, it will be hard to get rid of it.
I'm not sure if there's much benefit to do this optimization throughout the whole system and even out to the API - it might be enough to only optimize the in-memory object in the most common cases.
Clients that fetch raw ledger data (such as through account_objects or ledger_entry) or read transaction metadata would have to update to be able to read the new fields. This is significant because it affects code that interprets balance changes of issued currencies. However, the delivered_amount field would be unchanged, so this would only affect clients doing some relatively detailed processing.
I think we can side step this, at least in the case where a user requests data in expanded JSON format (setting `binary` to `false`). We can just modify the output to be consistent with the old format, just like we plan to do for `account_lines`. For the case where `binary` is set to `true`, the situation is trickier. However, I don't think we want to modify the output in that situation; I think users that request binary data want to see that data exactly as it is on the ledger. We could just include some type of warning or flag that indicates "This object is a trustline in the new format" or some such.
I think it's ok that we would need to do the output modification in multiple places. We can just make a helper function that detects whether an object is a trustline, and do the modification if necessary. We could even embed the logic inside `STObject::getJson()`.
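As a rough illustration of the helper idea, here is a minimal Python sketch that detects a "new-style" trust line in expanded JSON and rewrites it into today's RippleState shape. The new-style field names come from the issue description; their JSON representation, the function names, and the choice to default omitted values to "0" are all assumptions made for this example, not anything rippled implements.

```python
# Neutral issuer ("ACCOUNT_ONE") that today's RippleState JSON uses for Balance.
ACCOUNT_ONE = "rrrrrrrrrrrrrrrrrrrrBZbvji"

# Field names from the proposal; how they would appear in JSON is an assumption.
REQUIRED_NEW_FIELDS = ("Currency", "LineLowAccount", "LineHighAccount")
OPTIONAL_NEW_FIELDS = ("LineLowLimit", "LineHighLimit", "LineBalance")


def is_new_style_trust_line(obj: dict) -> bool:
    """Hypothetical check: a RippleState entry carrying the proposed fields."""
    return obj.get("LedgerEntryType") == "RippleState" and all(
        field in obj for field in REQUIRED_NEW_FIELDS
    )


def to_legacy_ripple_state(obj: dict) -> dict:
    """Return a copy in the legacy JSON shape; other objects pass through unchanged."""
    if not is_new_style_trust_line(obj):
        return obj

    consumed = set(REQUIRED_NEW_FIELDS) | set(OPTIONAL_NEW_FIELDS)
    legacy = {key: value for key, value in obj.items() if key not in consumed}
    currency = obj["Currency"]

    def amount(issuer: str, field: str) -> dict:
        # Omitted optional fields are treated as "0", as the proposal describes.
        return {"currency": currency, "issuer": issuer, "value": obj.get(field, "0")}

    legacy["Balance"] = amount(ACCOUNT_ONE, "LineBalance")
    legacy["LowLimit"] = amount(obj["LineLowAccount"], "LineLowLimit")
    legacy["HighLimit"] = amount(obj["LineHighAccount"], "LineHighLimit")
    # Omitted directory hints are restored as "0".
    legacy.setdefault("LowNode", "0")
    legacy.setdefault("HighNode", "0")
    return legacy
```

A single function like this could be called from each API handler that returns ledger objects in JSON, leaving binary responses untouched.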
I do worry though that this is overly complex, and might not be worth it, since the space savings would only be incremental.
I want to bring this topic up again. Some time has passed since June and we saw some "token craziness" recently (and still ongoing). The number of TrustLines has exploded, and so did the ledger size.
Currently, 2.3 million Trustlines account for 1 GB of ledger data. That is around 48% of the ledger size! (https://xrpldata.com/api/v1/ledgerdata).
As you can see here:
Out of 2.3 million Trustlines, "only" 774k hold an actual Balance. All other Trustlines do not hold any value/token but add a big amount of data / size to the actual ledger. Maybe this could be a starting point to make some improvements. But as @mDuo13 already mentioned, there are many "duplicate" fields inside the RippleState object which could be omitted.
The mechanism to make sure this doesn't get out of hand is the "Owner Reserve" by the way, and this was recently cut down by 60% (from 5 to 2 XRP).
This means the 834479 objects from the issue description were about as expensive as 2.09 million trustlines now. Since the 5 XRP price already was no deterrent to creating ~2 million trust lines, I would expect that this will become a much larger issue with the cheap price now.
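The reserve comparison above can be checked with a few lines of Python. This is only a sketch of the arithmetic, using the figures quoted in the thread; the variable names are made up for the example.

```python
OLD_RESERVE_XRP = 5       # owner reserve per object before the change
NEW_RESERVE_XRP = 2       # owner reserve per object after the change
OBJECTS_IN_ISSUE = 834_479  # RippleState count quoted in the issue description

# Relative reduction of the owner reserve.
reduction = (OLD_RESERVE_XRP - NEW_RESERVE_XRP) / OLD_RESERVE_XRP

# XRP that the original objects locked up at the old price...
locked_xrp = OBJECTS_IN_ISSUE * OLD_RESERVE_XRP
# ...and how many trust lines the same amount of XRP pays for now.
equivalent_lines_now = locked_xrp / NEW_RESERVE_XRP

print(f"reserve reduction: {reduction:.0%}")
print(f"equivalent trust lines at the new price: {equivalent_lines_now / 1e6:.2f} million")
```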
notes:
- `CheckCashMakesTrustline` because of the trust line bloat issue.
- The `STCurrency` type introduced in #4789 likely makes this easier to do.
As of ledger `87716591` (`5/2/2024, 7:06:10 PM UTC`), there were:
- `Flags` value of 0
- `HighNode` value of "0"
- `LowNode` value of "0"
- `0`
- `LowLimit` nor the `HighLimit` is 0)

Resulting savings:

| Type | Count | Savings Each (bytes) | Total Savings (MB) |
|---|---|---|---|
| Total | 5,917,790 | 55 | 325.5 |
| Empty Flags | 22 | 5 | 0.00011 |
| Empty HighNode | 2,075,445 | 9 | 18.68 |
| Empty LowNode | 1,287,759 | 9 | 11.59 |
| 0 Balance | 2,552,138 | 9 | 22.97 |
| Unidirectional | 5,915,733 | 9 | 53.24 |
The total savings would be 432 MB, or 7.5% of the whole ledger.
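For anyone who wants to re-run the numbers, this short Python sketch recomputes the per-row and total savings from the counts in the table. The row labels and counts are copied from the table; treating 1 MB as 1,000,000 bytes is an assumption that matches the published figures.

```python
# (count, bytes saved per object) for each row of the savings table.
rows = {
    "Total": (5_917_790, 55),
    "Empty Flags": (22, 5),
    "Empty HighNode": (2_075_445, 9),
    "Empty LowNode": (1_287_759, 9),
    "0 Balance": (2_552_138, 9),
    "Unidirectional": (5_915_733, 9),
}

# Bytes saved per row, then the grand total.
savings_bytes = {name: count * each for name, (count, each) in rows.items()}
total_bytes = sum(savings_bytes.values())
total_mb = total_bytes / 1_000_000  # assumes decimal megabytes

for name, saved in savings_bytes.items():
    print(f"{name:<15} {saved / 1_000_000:>10.5f} MB")
print(f"{'All rows':<15} {total_mb:>10.1f} MB")
```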
Summary
Looking at the numbers from nixerFFM's ledger data analysis, it seems like we could probably save a lot of space in the ledger by optimizing trust lines (RippleState objects). We could reduce trust lines' size significantly with just these optimizations:
- `Balance`
- (`Balance`/`LowLimit`/`HighLimit`)
- `LowNode`/`HighNode` fields optional (omit when their value would be 0, which is most common)

Motivation
Total ledger size is a key constraint in scaling the XRP Ledger. To sync to the network, a server must download the entire latest state from its peers, an amount that is currently about 1 GB + overhead. This also affects bandwidth usage of servers in the peer-to-peer network, which is one of the more expensive and restrictive factors in running a reliable server, although state data probably contributes less to bandwidth than transactions themselves (aside: we should confirm this empirically). Furthermore, to be useful, a server should store ledger history, which includes all state data that has changed since the previous ledger version. (With de-duplication, storing 10 ledgers takes far less than 10× the space of storing 1 ledger, but the size of individual objects is still significant.)
RippleState objects (trust lines) account for 31% of the data in a given ledger (~350 MB), a surprising amount of which is unnecessary. Here's a breakdown of the size of a single RippleState object:
A significant amount of space is wasted in this representation, especially the use of "Amount" type fields (384 bits each) for `Balance`, `LowLimit`, and `HighLimit`:
- `Balance` field (160 bits) is meaningless.
- `value` (64 bits) of all three fields is included even when it's 0. I believe most trust lines are unidirectional, so we could save 64 bits or more per RippleState by omitting limit values of 0. (384 bits when all three, both limits and balance value, are 0.)
- `LowNode` and `HighNode` fields are "hints" to the owner directory, with a value of 0 for any trust line that appears in the first page of its owner's directory, which is the most common case. (Accounts outnumber trust lines 3 million to 800k.) We could save 128 bits on most trust lines by omitting these fields when their value is 0.

That adds up to a potential savings of between 480 and 992 bits per RippleState entry, with my guess being that on average the savings would be at least 608 bits (76 bytes) each. This would be offset slightly by an additional 1-3 bytes per optional field (when present) to identify it.
As of a recent ledger version, there are 834479 RippleState entries, so average savings of 76 bytes each adds up to about 63 MB or about 5.5% of the ledger's total data. That's comparable in size to everything in the ledger that's not an account, owner directory, or trust line combined.
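The estimate above is simple enough to reproduce. The following Python sketch converts the quoted per-entry savings from bits to bytes and scales them by the entry count; the variable names are illustrative, and 1 MB is assumed to mean 1,000,000 bytes.

```python
BITS_PER_BYTE = 8
RIPPLE_STATE_ENTRIES = 834_479  # count quoted for a recent ledger version

# Per-entry savings in bits: lower bound, author's average guess, upper bound.
savings_bits = {"low": 480, "guess": 608, "high": 992}

# Convert each estimate to bytes per entry and total bytes across the ledger.
per_entry_bytes = {k: bits // BITS_PER_BYTE for k, bits in savings_bits.items()}
total_bytes = {k: b * RIPPLE_STATE_ENTRIES for k, b in per_entry_bytes.items()}

for label in savings_bits:
    print(
        f"{label:>5}: {per_entry_bytes[label]} bytes/entry, "
        f"{total_bytes[label] / 1_000_000:.1f} MB total"
    )
```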
Solution
We should introduce an amendment (proposed name: `OptimizeTrustLines`) that modifies RippleState objects as follows:
- `LowNode` and `HighNode` are optional fields, to be omitted when their value is 0.
- `Balance`, `LowLimit` and `HighLimit` fields are legacy and should be removed whenever a trust line is updated. In their place, "new-style" trust lines have the following fields:
  - `Currency` (internal type Hash160, 160 bits): the full currency code for this trust line.
  - `LineLowAccount` (internal type AccountID, 160 bits): the low account
  - `LineHighAccount` (internal type AccountID, 160 bits): the high account
  - `LineLowLimit` (64 bits): the low account's limit. Omitted if the limit is 0 (the default).
  - `LineHighLimit` (64 bits): the high account's limit. Omitted if the limit is 0 (the default).
  - `LineBalance` (64 bits): the current net balance of the trust line. Omitted if 0.

(Credit to @nbougalis for brainstorming a lot of this.)
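To make the proposed layout concrete, here is a small Python model of a new-style trust line that adds up the field sizes listed above. It is a sketch under stated assumptions: the class and attribute names are invented for illustration, field-identifier overhead and unrelated fields (such as `Flags`) are ignored, and the legacy figure simply counts three 384-bit Amount fields plus two 64-bit node hints.

```python
from dataclasses import dataclass
from typing import Optional

# Legacy payload under the assumptions above: 3 Amount fields + 2 node hints.
LEGACY_BITS = 3 * 384 + 2 * 64


@dataclass
class NewStyleTrustLine:
    """Hypothetical in-memory model of the proposed RippleState fields."""

    currency: str                      # Currency (160 bits)
    low_account: str                   # LineLowAccount (160 bits)
    high_account: str                  # LineHighAccount (160 bits)
    low_limit: Optional[str] = None    # LineLowLimit (64 bits), omitted if 0
    high_limit: Optional[str] = None   # LineHighLimit (64 bits), omitted if 0
    balance: Optional[str] = None      # LineBalance (64 bits), omitted if 0
    low_node: Optional[int] = None     # LowNode (64 bits), omitted if 0
    high_node: Optional[int] = None    # HighNode (64 bits), omitted if 0

    def payload_bits(self) -> int:
        """Sum the sizes of the fields that are actually present."""
        bits = 3 * 160  # currency plus both accounts are always stored
        optional_fields = (
            self.low_limit, self.high_limit, self.balance,
            self.low_node, self.high_node,
        )
        bits += 64 * sum(value is not None for value in optional_fields)
        return bits


# A unidirectional, zero-balance line on the first directory page.
simple = NewStyleTrustLine("USD", "rLow", "rHigh", high_limit="100")
print(simple.payload_bits(), "bits vs", LEGACY_BITS, "bits in the legacy layout")
```

Under these assumptions, a line with every optional field present comes out 480 bits smaller than the legacy layout, which lines up with the lower bound quoted in the Motivation section.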
- `account_lines` would have to be updated to use the new fields (in addition to old fields) to return API responses. The actual response format for `account_lines` could remain the same.
- Clients that fetch raw ledger data (such as through `account_objects` or `ledger_entry`) or read transaction metadata would have to update to be able to read the new fields. This is significant because it affects code that interprets balance changes of issued currencies. However, the `delivered_amount` field would be unchanged, so this would only affect clients doing some relatively detailed processing.
- Since we can't realistically migrate old trust lines to the new format en masse, any space savings would be gradual and incremental, and would probably be more in the form of avoiding future storage increases rather than reducing the present storage needs.
Paths Not Taken
This proposal does not include any changes to the `Flags` field, although I think, realistically, the No Ripple settings are pretty confusing and not that useful, and we should change that. But that's a different issue.

Nik originally suggested a 32-bit "Simple Asset Code" field to be used instead of the full 160-bit currency code for cases where the trust line is for a "standard" 3-character currency code, but I'm wary of making currency code data more confusing than it already is. Even though saving 16 bytes per object is significant, I think the fact that currency codes are always 160 bits under the surface is one of the rare cases of consistency in the XRP Ledger's protocol so I'd rather not ruin it. 😝
To reduce the integration burden for API clients, it would probably be possible to "fake" metadata in the old format. This sounds messy though and I worry the precedent could set us up for doing more of this kind of thing forever, which would make it really hard to introduce new features.
Another alternative would be to introduce an entirely new model of unidirectional trust lines linking back to a token definition object like the one proposed in #2609. While that might be more flexible and more powerful overall, it would be a much bigger migration that would require more action on the part of issuers, client apps, and so on. While the possibilities of a clean break and restart are tempting, I'm not convinced the legacy issued currency functionality is so broken as to warrant throwing it all out like this. Especially payment and offer processing (both highly complex, sensitive parts of the code) would probably require much more extensive rewrites to support such a change.