Open collin opened 11 months ago
Thanks, I didn't know that existed
Looking into it it seems it only supports primitives as values like you said, which would be quite limiting 😔 https://github.com/yjs/y-utility/blob/98204141aead9bde38c6e9b569c2a3d597beaf58/y-keyvalue.js#L11
What if each of your points (or even the whole array of points) were frozen data (https://mobx-keystone.js.org/frozen-data) and the bindings would treat those frozen objects as plain values instead of YMaps/YArrays? would that help?
So for example, an snapshot like
{
$modelType: 'polygon',
points: {
$frozen: true,
data: [{x:1, y:1}, ...]
}
}
would bind into
YMap {
$modelType: 'polygon',
points: plain object (instead of YMap) {
$frozen: true,
data: [{x:1, y:1}, ...] (instead of YArray of YMaps)
}
}
I'm not entirely sure what the implications of that would be.
What I've done is store things based on whether or not it is a simple value.
{
$modelType: 'object',
x: 10,
y: 10,
width: 100,
height: 100,
background: 'red',
color: 'blue',
children: [{
$modelType: 'object',
/* and so on... */
}]
/* and so on... */
}
In my yjs adapter I split out the data depending on whether or not it is a simple type, something like:
Y.Map {
$modelType: 'object',
data: Y.Array (controlled by YKeyValue) [
{ key: 'x', val: 10 },
{ key: 'y', val: 10 },
{ key: 'width', val: 100 },
{ key: 'y', val: 100 },
/* and so on, this structure is maintained by YKeyValue, so I never directly interact with it as an array */
]
children: Y.Array [
Y.Map {
$modelType: 'object',
data: Y.Array (controlled by YKeyValue) [
{ key: 'x', val: 10 },
{ key: 'y', val: 10 },
/* and so on... */
},
children: Y.Array []
]
}]
}
Sort of like that.
Each model is stored in a Y.Map. The model's Y.Map has one Y.KeyValue that stores "primitive" props. Complex props like Models, Maps, Sets, Arrays, and Refs get attached as nested Y.Map/Y.Array instances as neccessary.
( In practice I actually have one root Document model that has a flat map of all the objects in the document. Then all the relationships between the models are maintained as refs that resolve to the root map. That way when I move models around I don't actually have to move large parts of the YJS document around, just lightweight refs. )
Hi, I've been looking into it and doing some tests:
YKeyValue (randomly setting keys)
writing 100k updates on 1000 keys 12410ms
Size of the encoded document V2: 26773
Size of the json document: 27891
YMap (randomly setting keys)
writing 100k updates on 1000 keys 269ms
Size of the encoded document V2: 213587
Size of the json document: 13891
YKeyValue map seems to save quite a bit of storage like you pointed out (at least for plain values, since that's all it supports right now), but at the same time it seems to be 46 times slower :-/
I've been looking into other CRDT libs and they seem to have the same issue though. For example, automerge uses around the same space as V2 (around 200KB) while LORO uses the same space as V1 (around 490KB).
I'll try to examine the source code to figure out the trick that makes it save space as well as what makes it slow (Perhaps there's a way to make it faster and, who knows, to support nested Y.js structures?). I have the feeling this is something that might be really useful if it were done in the Y.js side though, since saving space in plain objects seems like a very basic use-case.
In the meanwhile I released a v1.3 that saves frozen values as plain values in Y.js instead of transforming them deeply into Y.Map and Y.Array (I don't know if that might help in your particular case, but just FYI).
Yeah, that's a pretty 👀 speed difference.
Running a similar shootout, it seems like the number of keys makes a big difference.
I ran with 10_000 random updates:
1000 Keys | 500 Keys | 250 Keys | 125 Keys | 75 Keys | 40 Keys | 20 Keys | 10 Keys | 5 Keys | 1 Key | |
---|---|---|---|---|---|---|---|---|---|---|
Y.Map |
37ms | 43ms | 26ms | 24ms | 23ms | 24ms | 28ms | 22ms | 22ms | 24ms |
YKeyValue |
575ms | 306ms | 171ms | 105ms | 95ms | 71ms | 65ms | 55ms | 64ms | 50ms |
Not very rigorous benchmarking, just one run, but the trend seems credible. YKeyValue
is dominated by the number of keys stored, and Y.Map
is just depends on the number of updates.
looking at the implementation, this makes a lot of sense. It looks like YKeyValue
is converting the internal data into a javascript Array for every change. Looking closer at the implementation of toArray
on Y.Array
, it is indeed looping through the array every time.
For my case, I'm unlikely to do more than tens of updates per frame at 60FPS, and I'm in the range of < 100 keys. At that scale updates always sub-millisecond no matter which data type, so document size matters much more to me.
I was looking at the recent release and saw there are yjs bindings. I've just spent some time building out yjs bindings for a mobx-keystone backed application.
One thing I encountered was using Y.Map for simple props can lead to very inefficient data storage. This is mentioned somewhat out-of-the-way in the yjs/y-utility repository.
The specific case I had was a model storing positional data: ~like
Due to the lack of data pruning in
Y.Map
while doing interleaved writes of plain values, a relatively short editing session on the order of minutes would lead to storing hundreds ok kb of data.When I switched to using the
YKeyValue
utility, the stored size of my documents stays very small. I would be quite happy to use the new official bindings, but would need to find a way to useYKeyValue
for plain values. Is that something you think could be ~easily supported by the official bindings?