Open kkmanos opened 1 week ago
I did some testing to see what our current space consumption is. On commit cf562b88fd3374e4c97029315d569bab106d58ee I created a new account, then created three VID credentials and recorded the size of `privateData` after each. Since `privateData` is JSON, I also checked how well it compresses with the default settings of the Python `zlib` module. Results:
| Number of VID credentials | `privateData` size (bytes) | Compressed size (bytes) |
|---|---|---|
| 1 | 3128 | 2060 |
| 2 | 4444 (+1316) | 3067 (+1007) |
| 3 | 5760 (+1316) | 4069 (+1002) |
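For anyone who wants to repeat the measurement, here is a minimal sketch of how the two columns can be produced. The `sample` payload is a made-up stand-in, not the real `privateData` structure, so the numbers it prints will not match the table.

```python
import json
import zlib


def measure(private_data: dict) -> tuple[int, int]:
    """Return (raw size, zlib-compressed size) in bytes for a JSON payload."""
    raw = json.dumps(private_data).encode("utf-8")
    compressed = zlib.compress(raw)  # default compression level
    return len(raw), len(compressed)


# Hypothetical payload, only for illustration.
sample = {
    "keypairs": [
        {"kid": f"key-{i}", "publicKey": {"kty": "EC", "crv": "P-256"}}
        for i in range(10)
    ]
}

raw_size, compressed_size = measure(sample)
print(f"raw: {raw_size} bytes, compressed: {compressed_size} bytes")
```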
So it looks like we're at about 1 kB per credential when compressed (about 1.3 kB uncompressed). I think that individual keypairs may be slightly smaller than that, but that's at least a rough estimate. A `BLOB` can store up to 65,535 bytes (about 64 kB), so we should expect to exceed the database capacity somewhere around 60 credentials. `MEDIUMBLOB` holds 16 MB, which we should expect to run out somewhere around 16,000 credentials, and `LONGBLOB` (4 GB) somewhere around 4 million credentials.
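The estimate can be reproduced with a few lines of arithmetic. This is a rough sketch that assumes the MySQL/MariaDB limits for these column types and that the payload keeps growing linearly at the rate measured above. The compressed per-credential figure of 1005 bytes is my rounding of the two measured increments.

```python
# Maximum payload sizes of the BLOB column family, in bytes
# (assumed to be the MySQL/MariaDB limits).
BLOB_LIMITS = {
    "BLOB": 2**16 - 1,
    "MEDIUMBLOB": 2**24 - 1,
    "LONGBLOB": 2**32 - 1,
}


def max_credentials(limit: int, first_size: int, per_credential: int) -> int:
    """Largest credential count whose payload still fits in `limit` bytes,
    assuming linear growth after the first credential."""
    if limit < first_size:
        return 0
    return 1 + (limit - first_size) // per_credential


for name, limit in BLOB_LIMITS.items():
    # Sizes taken from the measurements in this thread.
    uncompressed = max_credentials(limit, first_size=3128, per_credential=1316)
    compressed = max_credentials(limit, first_size=2060, per_credential=1005)
    print(f"{name}: {uncompressed} uncompressed, {compressed} compressed")
```

With these inputs a plain `BLOB` fits 48 credentials uncompressed and 64 compressed, which is in line with the rough figure of 60.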
That is a great analysis!
I brought this up for discussion so that we can agree on the most appropriate long-term solution. This topic becomes even more interesting once batch issuance is in place.
Since `LONGBLOB` is a variable-sized datatype, it should be a fine solution that avoids having to change the datatype again in the near future.
Changing the structure of the `privateData` storage can be a great topic for discussion in later stages. This is not a very simple task because we would like users to remain as anonymous as possible to the `wallet-backend-server` component.
What is your view on this?
Sounds good!
> Changing the structure of the `privateData` storage can be a great topic for discussion in later stages. This is not a very simple task because we would like users to remain as anonymous as possible to the `wallet-backend-server` component.
Fortunately, the backend doesn't introspect the `privateData` in any way; it just dumps it into the database and reads it back out (converting between base64 encoding on the network and raw binary in the DB). In particular, I don't think the frontend would be affected at all by just increasing the size of the database column.
But changing the `privateData` structure should also be fairly straightforward - for example to introduce compression on the frontend. We'd simply need some way to determine how the payload should be decoded, for backwards compatibility with payloads that haven't been updated yet. This too should be fairly easy: since `privateData` is JSON encoded as UTF-8 binary, we know the first byte is always `0x7b` (`{`) in the current (uncompressed) format. So we can simply add a prefix byte like `0x00` to indicate that the payload is compressed, for example (and we can even add up to 254 different versions of this in the future - for example if we need to change the compression algorithm - before needing another byte). Or it could be a sibling attribute alongside `privateData`, I suppose.
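To make the prefix-byte idea concrete, here is a minimal sketch in Python (the real frontend would do the equivalent in its own language). The function names and the choice of zlib as the compression algorithm are assumptions for illustration, not something that exists in the codebase.

```python
import json
import zlib

JSON_OPEN_BRACE = 0x7B  # first byte of the current uncompressed JSON format
FORMAT_ZLIB = 0x00      # hypothetical tag: the rest of the payload is zlib-compressed JSON


def encode_private_data(private_data: dict) -> bytes:
    """Serialize to JSON, compress, and prepend the format tag."""
    raw = json.dumps(private_data).encode("utf-8")
    return bytes([FORMAT_ZLIB]) + zlib.compress(raw)


def decode_private_data(payload: bytes) -> dict:
    """Decode either the legacy (plain JSON) or the tagged (compressed) format."""
    if not payload:
        raise ValueError("empty privateData payload")
    tag = payload[0]
    if tag == JSON_OPEN_BRACE:
        # Legacy payload: the whole thing is UTF-8 JSON.
        return json.loads(payload.decode("utf-8"))
    if tag == FORMAT_ZLIB:
        return json.loads(zlib.decompress(payload[1:]).decode("utf-8"))
    raise ValueError(f"unknown privateData format tag: {tag:#04x}")
```

Old payloads keep decoding unchanged, and any future format only needs a new tag value other than `0x7b`.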
Another option is to simply compress the `privateData` in the backend before writing it to the database, and decompress it before sending it back to the frontend. That would reduce complexity and CPU usage on the frontend at the expense of adding the same complexity and CPU usage to the backend, as well as using (very slightly) more network traffic.
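The backend variant would sit between the existing base64 conversion and the database write. A rough sketch, again in Python and with hypothetical function names. It does not handle rows that were stored before compression was introduced, which would need a migration or a format marker.

```python
import base64
import zlib


def to_db_column(network_b64: str) -> bytes:
    """Decode the base64 payload received from the frontend and compress it for storage."""
    raw = base64.b64decode(network_b64)
    return zlib.compress(raw)


def from_db_column(stored: bytes) -> str:
    """Decompress a stored payload and re-encode it as base64 for the frontend."""
    raw = zlib.decompress(stored)
    return base64.b64encode(raw).decode("ascii")
```

The frontend would see exactly the same base64 string it sent, so nothing changes on its side.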
There are good arguments in favour of each option: On the one hand, `privateData` is the frontend's business, so it makes sense that the frontend takes care of encoding (including compressing) it; on the other hand, the backend is what decides the storage limit, so it makes sense that the backend takes care of making the most of that storage space. Either option would work. I think I would personally favour frontend compression since saving network traffic is beneficial for people on slow networks, and client-side compute scales better to millions of users and matches wwWallet's (somewhat) decentralized design aspirations.
By saying "Changing the structure of the privateData...", I meant storing the keypairs in a separate database table; that's where anonymity needs to be maintained. This is a similar problem to encrypting the verifiable credentials in the database in their own database table.
I fully agree that changing the compression or the capacity (datatype) of the privateData column will not affect the anonymity or anything else.
> I think I would personally favour frontend compression since saving network traffic is beneficial for people on slow networks, and client-side compute scales better to millions of users and matches wwWallet's (somewhat) decentralized design aspirations.
Yes, compression on the frontend seems to be the better solution, given the arguments you made.
The storage backend in production deployments will be quite different, and depending on the case, the wallet provider may choose among several advanced options. There is no need to make decisions on this now.
As a short-term solution, we only need to ensure that we won’t hit the limit after a couple of batches.
Due to the limited capacity of the `UserEntity.privateData` column of datatype `blob`, the column's capacity will be exceeded once a user has accumulated a certain number of key pairs.