phenopackets / phenopacket-format

26 stars 10 forks source link

Should YAML subset be restricted to what is expressible in JSON (meaning exclude the ability to use references) #24

Open cmungall opened 8 years ago

cmungall commented 8 years ago

YAML allows us to reference the same object (first mention with &, future with *).

While we currently use this in the schema, this may be problematic in the packets. It could be confusing for producers: when to reuse vs when to duplicate? The semantics may be subtle here.

Also, this prohibits a translation to JSON, which is strictly trees.

Note that we do allow referencing of some entities via keys (see #22). But this is not at the level of YAML itself, it's at the level of our structural schema. Having two ways of doing this is definitely confusing

harryhoch commented 8 years ago

This would appear to be fundamental design decision. If encouraging adoption is the goal, I vote for simplicity and clarity, at least in initial versions. Perhaps start with the simple tree-version that is strictly compatible with JSON and then allow references in level n+1?

DoctorBud commented 8 years ago

I don’t think we should restrict ourselves to trees; there is no need to do so.

We can express graphs in JSON using the JSON-Graph pseudo-standard. I am most familiar with the Falcor implementation of JSON-graph. Falcor is Netflix’s open-source (Apache 2.0 License) Javascript library:

http://netflix.github.io/falcor/documentation/jsongraph.html <http://netflix.github.io/falcor/documentation/jsongraph.html>

But the great thing about standards is that there are so many to choose from!

https://github.com/jsongraph/json-graph-specification#graph-object <https://github.com/jsongraph/json-graph-specification#graph-object>

The difference between the Falcor version and the proposed specification seems to be in how flexibly you can refer to nodes; Falcor’s JSON-graph is more general and doesn’t assume only integers as indices. It does this by clearly stating that a value is a Reference, rather than making the idea of reference implicit. However, the integer-only node index constraint seems to suit our YAML format. The other value of the Falcor definition of JSON-graph is that it is more likely to be supported as Falcor moves from Developer Preview status to Released status.

Semi-related, I’ve been exploring the use of Falcor for web apps since the tech was released last summer, and have recently started building an experimental Falcor-ized SciGraph implementation so that I could determine whether the caching ability of Falcor will pay off in a better user experience and faster response. So far, it works as advertised. Assuming that Netflix continues to use it and promote Falcor, their definition of JSON-graph may be adopted more widely.

Dan

On Feb 15, 2016, at 4:37 AM, Harry Hochheiser notifications@github.com wrote:

This would appear to be fundamental design decision. If encouraging adoption is the goal, I vote for simplicity and clarity, at least in initial versions. Perhaps start with the simple tree-version that is strictly compatible with JSON and then allow references in level n+1?

— Reply to this email directly or view it on GitHub https://github.com/monarch-initiative/phenopacket-format/issues/24#issuecomment-184189227.

cmungall commented 8 years ago

Note that restricting to trees at the level of surface syntax does not preclude modeling graphs