y-crdt / yrb

Ruby bindings for yrs.
https://y-crdt.github.io/yrb
MIT License
80 stars 5 forks source link

Hash stored in map causes panick, type inconsistency between JS and Ruby #131

Open wkirby opened 11 months ago

wkirby commented 11 months ago

We are attempting to modify the same ydoc in the browser using yjs, and on our server using yrb. I have a pretty simple setup client side (this is simplified further for the sake of this report, but captures our use case fairly):

const ydoc = new Y.Doc();
const localBinder = bindImmerYjs(ydoc.getMap("data"));

localBinder.update((s) => {
  return {
    testString: "foo",
    testInt: 1,
    testFloat: 3.1415,
    testBool: false,
    testArray: [1, "two", true],
    testHash: { a: 1, b: "2", c: false }
  }
})

This ydoc is encoded as a uint8array, base64'd, then sent to the server where it's saved to a file store. In ruby, we load this content and try to access these values. This is mostly working. Roughly, this looks like:

ydata = Base64.decode64(stored_data)
ydoc = Y::Doc.new
ydoc.transact { |tx| tx.apply(ydata.bytes.to_a) }

ymap = ydoc.get_map("data")

All that is working, but this is where we run into problems:

# Works as expected
ymap[:testString] # => "foo"
ymap[:testBool] # => false
ymap[:testFloat] # => 3.1415

# Works close enough, but still wrong
ymap[:testInt] # => 1.0 (notice this is a Float now)

# Panick!
ymap[:testArray]
ymap[:testHash]

These Panicks are identical, producing:

thread '<unnamed>' panicked at /bundle/ruby/3.1.0/gems/y-rb-0.5.2/ext/yrb/.rb-sys/stable/cargo/registry/src/index.crates.io-6f17d22bba15001f/yrs-0.16.9/src/doc.rs:775:41:
called `Option::unwrap()` on a `None` value

There's an obvious workaround here where we can encode our hashes and arrays as a JSON string, but if we're going to do that, we might as well encode the whole map as a JSON string instead of as a map in the first place.

Note that this is not a problem when we decode these values on the client side.

wkirby commented 11 months ago

@eliias should I open this as an issue upstream with yrs?

eliias commented 11 months ago

@eliias should I open this as an issue upstream with yrs?

@wkirby I am traveling atm. I can take a look next week. If it's really an upstream issue I'll move the issue.

eliias commented 11 months ago

@wkirby I recall having a discussion w/ the other yrs folks about support for nested “complex” data structures (like Array, and Object). I need to check if it's technically possible to do that as long as you know what you do (there will be no conflict-free replication for nested objects). Text editors usually utilize the XML-like structure to achieve proper replication with nested data-structures.

Works close enough, but still wrong

ymap[:testInt] # => 1.0 (notice this is a Float now)

There is technically no Integer in JavaScript, so not sure if there is an easy “fix” for this. Probably best to do value.to_i when you know it should be an Integer in Ruby.

eliias commented 11 months ago

@wkirby I actually can't reproduce the panic, even using your example values. Could you provide me a value for ydata = Base64.decode64(stored_data) -> stored_data to allow me debug this further?

https://github.com/y-crdt/yrb/blob/main/spec/y/map_spec.rb#L179-L192

wkirby commented 11 months ago

@eliias Just got back from some time off myself. I'll send over the encoded document we have and work on setting up a more concise repro. There's always the chance that the error is on the encoding end with y-immer.

wkirby commented 11 months ago

@eliias attached is a file containing a Base64 string that presents the error. Here's how I reproduce with this data:

ydata = File.read('ydoc.base64.txt')
ydata = Base64.decode64(ydata)
ydoc = Y::Doc.new
ydoc.transact { |tx| tx.apply(ydata.bytes.to_a) }

ymap = ydoc.get_map("data")

ymap[:testInt] # => 1.0
ymap[:testFloat] # => 3.1415
ymap[:testString] # => "foo"
ymap[:testBool] # => false

ymap[:testArray] # => Panick!
ymap[:testHash] # => Panick!

ydoc.base64.txt

Because this blob is pulled directly from our application, the Map at data also contains the following keys:

title
fields
fieldValues
agents
signatories

In addition, the ydoc includes an XML fragment at content which is serialized tiptap state. I don't believe either of these are causing the issue, as the same problem was present without these values in the ydoc.

Here are the relevant lines from my Gemfile.lock:

    y-rb (0.5.2)
      rake (~> 13.0)
      rb_sys (~> 0.9.71)
    y-rb (0.5.2-x86_64-linux)
      rake (~> 13.0)
      rb_sys (~> 0.9.71)

And from package.json:

"immer-yjs": "^1.1.0",
...
"yjs": "^13.6.7",

On the client side, the code (roughly) to serialize this data:

const useSyncedData = (doc) => {
  const ydoc = useMemo(() => new Y.Doc(), [doc.id]);
  const [binder, setBinder] = useState(undefined);

  useEffect(() => {
    const localBinder = bindImmerYjs(ydoc.getMap("data"));

    if (localBinder) {
      setBinder(localBinder);

      // optionally set initial data
      localBinder.update((s) => {
        return {
          testString: "foo",
          testInt: 1,
          testFloat: 3.1415,
          testBool: false,
          testArray: [1, "two", true],
          testHash: { a: 1, b: "2", c: false }
        };
      });
    }

    return () => {
      localBinder.unbind();
    };
  }, [ydoc, doc]);

  return ydoc;
}

const MyReactComponent = () => {
  const { doc } = useDoc();
  const ydoc = useSyncedData(doc);

  console.log(ydoc);

  return <p>Oh hey, a component</p>
}
wkirby commented 11 months ago

@eliias hope I'm not bugging you, but were the above resources enough to reproduce? If not I'm happy to schedule a call to work through this together. Just shoot me an email at wyatt@apsis.io if you'd like.

eliias commented 11 months ago

title

Yeah, I am able to reproduce this in a test case with your data. I also found the actual issue, but I do not have an immediate solution that is shippable. The logic that converts the internal CRDT representation for a Map/Array requests a new read-only transaction on the CRDT store, which is usually fine. The Ruby library always maintains a read-write transaction, so it causes a panic in this case. It is also unclear to me why it only fails on a sync. It should fail in both cases.

FWIW… I was able to access your test data after a somewhat hacky fix, so it is definitely doable. https://github.com/y-crdt/yrb/pull/134

wkirby commented 11 months ago

@eliias I'm not super versed in Rust, so I can't comment on the hackiness present here. I really appreciate your attention on this, and do let me know if there's any other way I can assist!

eliias commented 10 months ago

@wkirby I am going to release 0.5.3 later today that includes a fix. The spec passes for your test data. Would be great if you can confirm it works for you.

wkirby commented 7 months ago

@eliias sorry, we moved on in our project to other things, but are returning now. I will confirm within the next week or so whether your implementation fixes our issue. Thank you so much!

Phaengris commented 7 months ago

@eliias sorry for re-using same issue for my problem, but I have feeling that it's same or very close

$ grep y-rb Gemfile.lock
    y-rb (0.5.3-x86_64-linux)
    y-rb_actioncable (0.1.6)
      y-rb (>= 0.4.5)
  y-rb_actioncable (~> 0.1.6)
<script type="module">
  import * as Y from "yjs";
  import {WebsocketProvider} from "@y-rb/actioncable";
  import {createConsumer} from "@rails/actioncable";

  const
    document = new Y.Doc(),
    consumer = createConsumer(),
    provider = new WebsocketProvider(
      document,
      consumer,
      'MyChannel',
      {signed_stream_name: '...'}
    ),
    yArray = document.getArray('array')
  const test = `test${Math.random()}`
  const yMap = new Y.Map()
  yMap.set(test, test)
  yArray.insert(0, [yMap])
  provider.connect()
</script>

Then in the Rails log

MyChannel#receive {"update"=>"AAFKAQK/lKqpBgAHAQVhcnJheQEoAL+UqqkGABZ0ZXN0MC42NjkwOTk1MzkyNDY1NjUyAXcWdGVzdDAuNjY5MDk5NTM5MjQ2NTY1MgA="}

thread '' panicked at 'called Option::unwrap() on a None value', /home/runner/work/yrb/yrb/tmp/cargo-vendor/yrs/src/doc.rs:775:41

and when I'm trying to reproduce it in the console

irb(main):004> require 'y-rb'; update = Base64.decode64("AAFKAQK/lKqpBgAHAQVhcnJheQEoAL+UqqkGABZ0ZXN0MC42NjkwOTk1MzkyNDY1NjUyAXcWdGVzdDAuNjY5MDk5NTM5MjQ2NTY1MgA=").unpack('C*'); doc = Y::Doc.new; doc.sync(update[3..-1]); doc.get_array('array').to_a
thread '<unnamed>' panicked at 'called `Option::unwrap()` on a `None` value', /home/runner/work/yrb/yrb/tmp/cargo-vendor/yrs/src/doc.rs:775:41
/home/fedora/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/y-rb-0.5.3-x86_64-linux/lib/y/array.rb:269:in `yarray_to_a': called `Option::unwrap()` on a `None` value (fatal)

(Why [3..-1] - because of the 1st byte is yrb-actioncable's message type, the 2nd byte is y-protocol message type and the 3rd byte is the length of the update)

If I put not an YMap into the array, but a primitive

  const test = `test${Math.random()}`
  const yMap = new Y.Map()
  yArray.insert(0, [test])
  provider.connect()

then it works like a charm

MyChannel#receive: {"update"=>"AAEpAQHs4JKHAgAIAQVhcnJheQF3FXRlc3QwLjU1NzkyODAyODMzNjEyMQA="}

MyChannel#receive synced: ["test0.557928028336121"]

and in the console

irb(main):005> require 'y-rb'; update = Base64.decode64("AAEpAQHs4JKHAgAIAQVhcnJheQF3FXRlc3QwLjU1NzkyODAyODMzNjEyMQA=").unpack('C*'); doc = Y::Doc.new; doc.sync(update[3..-1]); doc.get_array('array').to_a
=> ["test0.557928028336121"]

May be it's me doing something wrong?

eliias commented 7 months ago

@Phaengris It is possible that Maps in an Array aren't handled properly (still). I am going to add your case as a test and try to fix.

eliias commented 7 months ago

@Phaengris I identified the issue. Map/Array does not accept nested Y data-structures in yrb. It can be fixed, but it might take a while.

Phaengris commented 7 months ago

@eliias thank you for investigating this :)

No rush. If you can find time to fix it, it would make me happy, but if no - no worries.

For now I switched to use xml fragments / elements which also supports data nesting, in it's own way. May be a possible drawback is some performance hit, but I didn't do any benchmarks yet to see if it is actually a problem for my project.

wkirby commented 7 months ago

@eliias came back to note that the most recent published version (0.5.4) is working roughly as expected. I do have a very similar issue to @Phaengris (same error message, different proximal cause), but with a clean workaround.

The structure of our Y:Map is (roughly):

{
  title: "string",
  order: 0,
  data: {},
  dataValues: []
}

I can access all these values individually:

ydoc = Y::Doc.new
ydata = get_file_from_s3()
ydoc.transact { |tx| tx.apply(ydata.bytes.to_a) }

ymap = ydoc.get_xml_map('data')

ymap['title'] # => 'string'
ymap['order'] # => 0.0
ymap['data'] # => {}
ymap['dataValues'] # => []

So far so good! The error arises from trying to get all the values:

ymap.to_h # => Panic!
ymap.to_json # => Panic!

Results in the same error as above:

thread '<unnamed>' panicked at /home/runner/work/yrb/yrb/tmp/cargo-vendor/yrs/src/doc.rs:775:41:
called `Option::unwrap()` on a `None` value

The workaround, though:

ymap.as_json.to_h # => { title: 'string', order: 0.0, data: {}, dataValues: [] }

Anyway, from my perspective this issue can be closed out. Appreciate all your hard work!

writercoder commented 1 month ago

So it looks like nested maps don't work as the only way to create a Y::Map object it to use Y::Doc#get_map as the Y::Map constructor crashes.

In javascript I can create and assign Y.Map instances without errors.

I can work around this by changing how I'm modelling data but I'd love to be told I'm wrong here!