capnproto / capnproto-java

Cap'n Proto in pure Java

Parsing Schemas #136

Open devinrsmith opened 9 months ago

devinrsmith commented 9 months ago

I'm trying to use tooling to parse the code-generated schemas in Java (the org.capnproto.SegmentReader fields with hex names under the nested Schemas class).

I assumed the struct would be Node from https://github.com/capnproto/capnproto/blob/v1.0.1/c%2B%2B/src/capnp/schema.capnp, but that structure doesn't seem to parse correctly (tried both org.capnproto.Serialize and org.capnproto.SerializePacked).

Is there a different schema that is necessary to parse these fields? Or a different way to decode these bytes in java?

Thanks!

dwrensha commented 9 months ago

I suspect the problem is that the functions in Serialize expect a segment header, while the buffers have no header -- they are intended to be read as single-segment messages.

What are you using these schema buffers for? Eventually I'd like to add support for runtime reflection, like I recently did for capnproto-rust: https://dwrensha.github.io/capnproto-rust/2023/05/08/run-time-reflection.html Would that cover your use case?
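The framing mismatch described above can be illustrated with a minimal sketch. Cap'n Proto's stream framing puts a segment table in front of the message: a little-endian 32-bit "segment count minus one", then one little-endian 32-bit size (in 8-byte words) per segment, padded to a word boundary. For a single segment that table is exactly 8 bytes. The `SingleSegmentFraming` class below is a hypothetical helper, not part of capnproto-java; it prepends that table to a raw segment using only `java.nio`. Whether the result can then be handed to `org.capnproto.Serialize.read(ByteBuffer)` to read the generated schema buffers is an assumption that has not been verified here.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper (not part of capnproto-java): wraps a headerless
// single-segment message in Cap'n Proto stream framing.
final class SingleSegmentFraming {
    private static final int BYTES_PER_WORD = 8;

    /**
     * Returns a new buffer containing the 8-byte segment table for a
     * one-segment message, followed by a copy of the segment's bytes.
     */
    static ByteBuffer frame(ByteBuffer segment) {
        int size = segment.remaining();
        if (size % BYTES_PER_WORD != 0) {
            throw new IllegalArgumentException(
                    "segment length must be a multiple of 8 bytes");
        }
        ByteBuffer framed = ByteBuffer.allocate(BYTES_PER_WORD + size)
                .order(ByteOrder.LITTLE_ENDIAN);
        framed.putInt(0);                     // segment count minus one
        framed.putInt(size / BYTES_PER_WORD); // segment size in words
        framed.put(segment.duplicate());      // duplicate() leaves the caller's position untouched
        framed.flip();                        // make the buffer ready for reading
        return framed;
    }

    public static void main(String[] args) {
        // Two zeroed words stand in for a real schema segment.
        ByteBuffer segment = ByteBuffer.allocate(16);
        ByteBuffer framed = frame(segment);
        System.out.println(framed.remaining()); // prints 24
    }
}
```

Copying the segment is the simplest approach but not the cheapest; a reader that accepts segments directly, without a header, would avoid the copy.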

devinrsmith commented 9 months ago

I think runtime reflection would work, but it may be more powerful than we need. Right now, we're considering adding our own Java-based code-generation step that produces Java code on top of the code capnproto-java generates, and ideally we'd like to use the known schema to accomplish that. I might call this "compile-time reflection", since we don't need the schema structures at runtime, but that's just nit-picking on semantics; it's probably the same tooling either way, and the only difference is whether you depend on it at runtime. (BTW, I do like the lean-and-mean approach you've taken with the runtime; I suspect that a separate reflection or schemas package able to parse schemas would let users opt into "runtime reflection" when necessary.)

Of course, it could be argued that we should just hook into capnp compile with our own compiler from the start, but I suspect we'd have to rebuild a lot. It might be more feasible for us to extend the capnproto-java compiler if it were written in Java (https://github.com/capnproto/capnproto-java/issues/111), but regardless, that seems like a larger undertaking.

devinrsmith commented 9 months ago

CodeGeneratorRequest and the technique shown in https://github.com/capnproto/capnproto/issues/673 may be an appropriate workaround to get this working in Java for our needs; I'll try it out and report back.

devinrsmith commented 9 months ago

I was able to read a binary schema in Java; I did need to make a small modification to schema.capnp to make it Java-friendly (I'm not sure whether there's a way to do that natively from the CLI instead).

My notes: https://gist.github.com/devinrsmith/74e0a9b230f6bb3975beebed3ec8c253

It seems like "runtime reflection for capnproto-java" would essentially entail publishing the code-generation from schema.capnp as a new jar (or, part of runtime), and attaching the binary Node data as part of the code-generation process?

For reference, the Java file generated from schema.capnp is ~360kB, the compiled class file is ~9kB, and the zipped class file is ~2.7kB.