Gas inefficiency of current design

aaronc commented 4 years ago

The current design forces every table to use a uint64 row ID regardless of what the underlying primary key actually is for index efficiency. This creates some storage inefficiency which is compounded by high gas costs for single reads and writes beyond the gas cost for bytes.

Some WIP work is in #3.

aaronc commented 4 years ago

Here is one proposed solution:

NaturalKeyTable doesn't use row ID and AutoUInt64Table under the hood
Index stores keys as concat(indexKeyBytes, primaryKeyBytes, len(primaryKeyBytes)[0])
- primaryKeyBytes are limited to 255 bytes so that length can into a single byte at the end of the key
- this should allow prefix scans to happen efficiently whether or not primary key is actually uint64 or []byte because the single byte length at the end allows for splitting the indexKeyBytes and primaryKeyBytes
UInt64Index stores keys as concat(indexKeyUInt64, primaryKeyBytes) so that we know that the index key is always a fixed 8 bytes

What do you think of this solution @alpe and @ethanfrey?

ethanfrey commented 4 years ago

I think this is a good step, but there are other gas inefficiencies in the design as well.

Such as storing secondary indexes as many (key, primary key) -> nil entries, rather than one key -> []primary key

alpe commented 4 years ago

Such as storing secondary indexes as many (key, primary key) -> nil entries, rather than one key -> []primary key

IMHO we get away cheaper in terms of gas with @aaronc index design. We pay only the "flat" fees on Read, Write, Delete an not the byte size fees of the payload. For example.

Add new secondary index "foo" A) Storing a new index key "foo00000001" costs = 2000 (WriteCostFlat) B) For a new entry in "[]primary key" "foo" it is "2000 + 8*30 (WriteCostPerByte)" = 2240
Add another to secondary index "foo" for different rowID A) Storing as new index key "foo00000002" costs = 2000 (WriteCostFlat) B) For an additional entry in "[]primary key" "foo" it is "1000 (ReadCostFlat) +8 3(ReadCostPerByte) +2000 + 16 30 (WriteCostPerByte)" = 3504
Delete operations A) 1000 DeleteCost B) 1000 (ReadCostFlat) +8 3(ReadCostPerByte) + 1000 DeleteCost when last element otherwise + 2000 +x 30 (WriteCostPerByte)
Has operations A) 1000 HasCost B) 1000 (ReadCostFlat) +x * 3(ReadCostPerByte)
Iterations over prefix A) 30 IterNextCostFlat x (entries) B) 30 IterNextCostFlat + x 8 * 3(ReadCostPerByte) Here variant B is slightly more efficient. If we would consider pagination with a limited result set then option B is only more efficient with small sets.

aaronc commented 4 years ago

Thanks for the detailed analysis @alpe! So it sounds like from what you're saying that the concat(indexKeyBytes, primaryKeyBytes, len(primaryKeyBytes)[0]) is a reasonable approach generally? I would have expected key -> []primary key to be more efficient generally for small sets but it seems like because of gas, that's only the case for prefix iteration which is a bit surprising.

Anyway, if you think it makes sense, then let's move forward with this design.

I do want to note that we could get past the limitation of 255 bytes for primary keys by encoding a uvarint in reverse starting from the back of the index key, basically concat(indexKeyBytes, primaryKeyBytes, reverse(uvarint(len(primaryKeyBytes)))). This would be a bit more complex to implement but would prevent surprises in edge cases.

alpe commented 4 years ago

I think concat(indexKeyBytes, primaryKeyBytes, len(primaryKeyBytes)[0]) could be used to optimize natural key table indexes.

I am not sure if saving 33 gas (1123 vs 1090 for a GroupMember index entry) on Read on a secondary index is worth the effort and complexity. It is of course more with Create, Delete.

After the analysis above I did a small optimization for Read so that we have (153 vs 120 for a GroupMember index entry)

aaronc commented 4 years ago

I am not sure if saving 33 gas (1123 vs 1090 for a GroupMember index entry) on Read on a secondary index is worth the effort and complexity. It is of course more with Create, Delete.

I'm sorry, what is the context for 1123 vs 1090?

alpe commented 4 years ago

sorry, context is access via natural key index or by rowID on table:

_, err = k.groupMemberTable.GetOne(gCtx, m.NaturalKey(), &loaded) = 1123 gas
_, err = k.groupMemberTable.autoTable.GetOne(gCtx, 1, &loaded) = 1090 gas

To be fair, this only about READ. Create (+2000) and Delete(+1000) come with extra costs that we could avoid by storing the natural key in the table and not the index.

See "price list" KVGasConfig

aaronc commented 4 years ago

@alpe how does this impact the proposed redesign of Index keys?

alpe commented 4 years ago

how does this impact the proposed redesign of Index keys?

It does not impact the redesign. It was an example that there are potential improvements on gas consumption that can have impact with the current design. Just about priorities.

clevinson commented 4 years ago

@alpe is this still being worked on, or can we consider this closed?

alpe commented 4 years ago

The concerns were addressed by the redesign PR. We can close this

regen-network / cosmos-modules

Gas inefficiency of current design #6