wwWallet / wallet-frontend

BSD 2-Clause "Simplified" License

Provide error handling mechanism when number of keypairs is exceeded #409

Open kkmanos opened 1 week ago

kkmanos commented 1 week ago

The UserEntity.privateData column has datatype BLOB, which has limited capacity, so its size limit will be exceeded once a user holds a certain number of keypairs.
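As a rough illustration of the kind of error handling this issue asks for, here is a minimal sketch of a size guard that rejects an oversized payload before it reaches the database. It assumes a MySQL-style BLOB limit of 65,535 bytes. The names `check_private_data_size` and `PrivateDataTooLargeError` are hypothetical and not taken from the wwWallet code base.

```python
# Illustrative only: these names are hypothetical, not part of wwWallet.
BLOB_MAX_BYTES = 2**16 - 1  # assumed limit: maximum length of a MySQL BLOB value


class PrivateDataTooLargeError(Exception):
    """Raised when privateData would not fit in the storage column."""


def check_private_data_size(private_data: bytes, limit: int = BLOB_MAX_BYTES) -> None:
    """Raise a descriptive error instead of letting the database write fail."""
    if len(private_data) > limit:
        raise PrivateDataTooLargeError(
            f"privateData is {len(private_data)} bytes; limit is {limit} bytes"
        )
```

A caller could catch this error and tell the user that no more keypairs can be stored, rather than surfacing a raw database error.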

emlun commented 1 week ago

I did some testing to see what our current space consumption is. On commit cf562b88fd3374e4c97029315d569bab106d58ee I created a new account, then created three VID credentials and recorded the size of privateData after each. Since privateData is JSON, I also checked how well it compresses with the default settings of the Python zlib module. Results:

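| Number of VID credentials | privateData size (bytes) | Compressed size (bytes) |
|---------------------------|--------------------------|-------------------------|
| 1                         | 3128                     | 2060                    |
| 2                         | 4444 (+1316)             | 3067 (+1007)            |
| 3                         | 5760 (+1316)             | 4069 (+1002)            |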
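A measurement like the one in the table could be reproduced roughly as follows. This is only a sketch: `measure` is a hypothetical helper, and `sample` is a placeholder payload rather than real wallet data, so the printed numbers will not match the table.

```python
import json
import zlib


def measure(private_data: bytes) -> tuple[int, int]:
    """Return (raw size, zlib-compressed size) in bytes, using zlib's default settings."""
    return len(private_data), len(zlib.compress(private_data))


# Placeholder payload standing in for an exported privateData value.
sample = json.dumps({"keys": ["placeholder"] * 50}).encode("utf-8")
raw_size, compressed_size = measure(sample)
print(raw_size, compressed_size)
```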

So it looks like we're at about 1 kB per credential. I think that individual keypairs may be slightly smaller than that, but that's at least a rough estimate. A BLOB can store up to 65,535 bytes (about 64 kB), so we should expect to exceed the database capacity at around 60 credentials. MEDIUMBLOB holds about 16 MB, which we should expect to run out at around 16,000 credentials, and LONGBLOB (about 4 GB) at around 4 million credentials.
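The estimate can be made slightly more precise with a small calculation. The sketch below assumes that privateData keeps growing linearly, as in the three measurements above, and that the column limits are the usual MySQL ones. `max_credentials` is a hypothetical helper.

```python
# Assumed MySQL column limits in bytes.
BLOB_MAX = 2**16 - 1        # 65,535
MEDIUMBLOB_MAX = 2**24 - 1  # 16,777,215
LONGBLOB_MAX = 2**32 - 1    # 4,294,967,295


def max_credentials(capacity: int, first: int, increment: int) -> int:
    """Largest n such that first + (n - 1) * increment <= capacity."""
    if capacity < first:
        return 0
    return 1 + (capacity - first) // increment


# Uncompressed figures from the measurements: 3128 bytes with one
# credential, then 1316 bytes for each additional credential.
for name, cap in [("BLOB", BLOB_MAX), ("MEDIUMBLOB", MEDIUMBLOB_MAX), ("LONGBLOB", LONGBLOB_MAX)]:
    print(name, max_credentials(cap, first=3128, increment=1316))
```

With the uncompressed figures this gives 48 credentials for BLOB, which is in the same range as the rough estimate above. Using the compressed sizes instead would raise the number somewhat.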

kkmanos commented 1 week ago

That is a great analysis!

I brought this up for discussion so that we can agree on the most appropriate long-term solution. The topic becomes even more relevant once batch issuance is in place.

Since LONGBLOB is a variable-length datatype, switching to it seems like a fine solution that avoids having to change the datatype again in the near future.

Changing the structure of the privateData storage can be a great topic for discussion in later stages. This is not a very simple task because we would like to maintain an anonymous profile as much as possible to the wallet-backend-server component.

What is your view on this?

emlun commented 6 days ago

Sounds good!

> Changing the structure of the privateData storage can be a great topic for discussion in later stages. This is not a very simple task because we would like to maintain an anonymous profile as much as possible to the wallet-backend-server component.

Fortunately, the backend doesn't introspect the privateData in any way; it just dumps it into the database and reads it back out (converting between base64 encoding on the network and raw binary in the DB). In particular, I don't think the frontend would be affected at all by just increasing the size of the database column.

But changing the privateData structure should also be fairly straightforward - for example to introduce compression on the frontend. We'd simply need some way to determine how the payload should be decoded, for backwards compatibility with payloads that haven't been updated yet. This too should be fairly easy: since privateData is JSON encoded as UTF-8 binary, we know the first byte is always 0x7b ({) in the current (uncompressed) format. So we can simply add a prefix byte like 0x00 to indicate that the payload is compressed, for example (and we can even add up to 254 different versions of this in the future - for example if we need to change the compression algorithm - before needing another byte). Or it could be a sibling attribute alongside privateData, I suppose.
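A minimal sketch of the prefix-byte idea, assuming zlib as the compression algorithm (the thread only measured zlib and did not settle on it). The function and constant names are hypothetical.

```python
import json
import zlib

COMPRESSED_V1 = 0x00    # prefix byte proposed above for compressed payloads
JSON_OPEN_BRACE = 0x7B  # "{", first byte of the current uncompressed format


def encode_private_data(obj: dict) -> bytes:
    """Serialize to UTF-8 JSON, compress, and prepend the format byte."""
    raw = json.dumps(obj).encode("utf-8")
    return bytes([COMPRESSED_V1]) + zlib.compress(raw)


def decode_private_data(payload: bytes) -> dict:
    """Decode either the legacy uncompressed format or the prefixed one."""
    if not payload:
        raise ValueError("empty payload")
    tag = payload[0]
    if tag == JSON_OPEN_BRACE:
        # Legacy payload: plain UTF-8 JSON with no prefix byte.
        return json.loads(payload.decode("utf-8"))
    if tag == COMPRESSED_V1:
        return json.loads(zlib.decompress(payload[1:]).decode("utf-8"))
    raise ValueError(f"unknown privateData format tag: {tag:#04x}")
```

Payloads that have not been updated yet still decode, because their first byte is `{`. A later format would claim another unused prefix value and add one more branch in `decode_private_data`.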

Another option is to simply compress the privateData in the backend before writing it to database, and decompress it before sending it back to the frontend. That would reduce complexity and CPU usage on the frontend at the expense of adding the same complexity and CPU usage to the backend, as well as using (very slightly) more network traffic.

There are good arguments in favour of each option: On the one hand, privateData is the frontend's business, so it makes sense that the frontend takes care of encoding (including compressing) it; on the other hand, the backend is what decides the storage limit, so it makes sense that the backend takes care of making the most of that storage space. Either option would work. I think I would personally favour frontend compression since saving network traffic is beneficial for people on slow networks, and client-side compute scales better to millions of users and matches wwWallet's (somewhat) decentralized design aspirations.

kkmanos commented 5 days ago

By "Changing the structure of the privateData...", I meant storing the keypairs in a separate database table; that is where anonymity needs to be maintained. It is a similar problem to storing the encrypted verifiable credentials in their own database table.

I fully agree that changing the compression or the capacity (datatype) of the privateData column will not affect the anonymity or anything else.

> I think I would personally favour frontend compression since saving network traffic is beneficial for people on slow networks, and client-side compute scales better to millions of users and matches wwWallet's (somewhat) decentralized design aspirations.

Yes, compression on the frontend seems to be the better solution, given the arguments you put forward.

nvoutsin commented 5 days ago

The storage backend in production deployments will be quite different, and depending on the case, the wallet provider may choose among several advanced options. There is no need to make decisions on this now.

As a short-term solution, we only need to ensure that we won’t hit the limit after a couple of batches.