Closed aaronc closed 4 years ago
Here is one proposed solution:
NaturalKeyTable
doesn't use row ID and AutoUInt64Table
under the hoodIndex
stores keys as concat(indexKeyBytes, primaryKeyBytes, len(primaryKeyBytes)[0])
primaryKeyBytes
are limited to 255 bytes so that length can into a single byte at the end of the keyuint64
or []byte
because the single byte length at the end allows for splitting the indexKeyBytes and primaryKeyBytesUInt64Index
stores keys as concat(indexKeyUInt64, primaryKeyBytes)
so that we know that the index key is always a fixed 8 bytesWhat do you think of this solution @alpe and @ethanfrey?
I think this is a good step, but there are other gas inefficiencies in the design as well.
Such as storing secondary indexes as many (key, primary key) -> nil
entries, rather than one key -> []primary key
Such as storing secondary indexes as many (key, primary key) -> nil entries, rather than one key -> []primary key
IMHO we get away cheaper in terms of gas with @aaronc index design. We pay only the "flat" fees on Read, Write, Delete an not the byte size fees of the payload. For example.
Add new secondary index "foo" A) Storing a new index key "foo00000001" costs = 2000 (WriteCostFlat) B) For a new entry in "[]primary key" "foo" it is "2000 + 8*30 (WriteCostPerByte)" = 2240
Add another to secondary index "foo" for different rowID A) Storing as new index key "foo00000002" costs = 2000 (WriteCostFlat) B) For an additional entry in "[]primary key" "foo" it is "1000 (ReadCostFlat) +8 3(ReadCostPerByte) +2000 + 16 30 (WriteCostPerByte)" = 3504
Delete operations A) 1000 DeleteCost B) 1000 (ReadCostFlat) +8 3(ReadCostPerByte) + 1000 DeleteCost when last element otherwise + 2000 +x 30 (WriteCostPerByte)
Has operations A) 1000 HasCost B) 1000 (ReadCostFlat) +x * 3(ReadCostPerByte)
Iterations over prefix A) 30 IterNextCostFlat x (entries) B) 30 IterNextCostFlat + x 8 * 3(ReadCostPerByte) Here variant B is slightly more efficient. If we would consider pagination with a limited result set then option B is only more efficient with small sets.
Thanks for the detailed analysis @alpe! So it sounds like from what you're saying that the concat(indexKeyBytes, primaryKeyBytes, len(primaryKeyBytes)[0])
is a reasonable approach generally? I would have expected key -> []primary key
to be more efficient generally for small sets but it seems like because of gas, that's only the case for prefix iteration which is a bit surprising.
Anyway, if you think it makes sense, then let's move forward with this design.
I do want to note that we could get past the limitation of 255 bytes for primary keys by encoding a uvarint in reverse starting from the back of the index key, basically concat(indexKeyBytes, primaryKeyBytes, reverse(uvarint(len(primaryKeyBytes))))
. This would be a bit more complex to implement but would prevent surprises in edge cases.
I think concat(indexKeyBytes, primaryKeyBytes, len(primaryKeyBytes)[0])
could be used to optimize natural key table indexes.
I am not sure if saving 33 gas (1123 vs 1090 for a GroupMember
index entry) on Read on a secondary index is worth the effort and complexity. It is of course more with Create, Delete.
After the analysis above I did a small optimization for Read so that we have (153 vs 120 for a GroupMember
index entry)
I am not sure if saving 33 gas (1123 vs 1090 for a
GroupMember
index entry) on Read on a secondary index is worth the effort and complexity. It is of course more with Create, Delete.
I'm sorry, what is the context for 1123 vs 1090?
sorry, context is access via natural key index or by rowID on table:
_, err = k.groupMemberTable.GetOne(gCtx, m.NaturalKey(), &loaded)
= 1123 gas_, err = k.groupMemberTable.autoTable.GetOne(gCtx, 1, &loaded)
= 1090 gasTo be fair, this only about READ. Create (+2000) and Delete(+1000) come with extra costs that we could avoid by storing the natural key in the table and not the index.
See "price list" KVGasConfig
@alpe how does this impact the proposed redesign of Index
keys?
how does this impact the proposed redesign of
Index
keys?
It does not impact the redesign. It was an example that there are potential improvements on gas consumption that can have impact with the current design. Just about priorities.
@alpe is this still being worked on, or can we consider this closed?
The concerns were addressed by the redesign PR. We can close this
The current design forces every table to use a
uint64
row ID regardless of what the underlying primary key actually is for index efficiency. This creates some storage inefficiency which is compounded by high gas costs for single reads and writes beyond the gas cost for bytes.Some WIP work is in #3.