wolfe-pack / moro

Interactive documentation and programming with Scala, iPython notebook style.
http://wolfe-pack.github.io/moro
BSD 2-Clause "Simplified" License

Notebook file format that makes version control and merging easier #78

Open riedelcastro opened 9 years ago

riedelcastro commented 9 years ago

Notebooks are often edited collaboratively (e.g. the tutorial, or the book project). Conflicting changes to notebooks are very difficult to resolve because of the JSON file format and the fact that newlines in the raw text are escaped in JSON. As a result, conflicts within the same cell (which can be quite large in my case) are almost impossible to resolve.

Possible remedies: allow newlines in JSON (not sure if that's possible, though), or use XML?
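
To make the problem above concrete, here is a minimal sketch of what JSON string escaping does to a multi-line cell. The `escapeJson` helper is hand-rolled for illustration and handles only the common escapes; it is an assumption, not Moro's actual serializer.

```scala
object JsonNewlineDemo {
  // Minimal JSON string escaper (illustrative only; other control
  // characters are not handled, and this is not Moro's serializer).
  def escapeJson(s: String): String = {
    val sb = new StringBuilder("\"")
    s.foreach {
      case '"'  => sb.append("\\\"")
      case '\\' => sb.append("\\\\")
      case '\n' => sb.append("\\n")
      case '\r' => sb.append("\\r")
      case '\t' => sb.append("\\t")
      case c    => sb.append(c)
    }
    sb.append('"').toString
  }

  // A three-line cell, as a user would type it.
  val cell: String = "val x = 1\nval y = 2\nprintln(x + y)"

  // After escaping, the whole cell sits on a single physical line.
  val encoded: String = escapeJson(cell)
}
```

Because line-based diff and merge tools work on physical lines, any edit inside the cell marks that entire encoded line as changed, which is why two edits to the same cell tend to conflict.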

sameersingh commented 9 years ago

Is there existing code for serializing simple case classes to such a format? If so, it's all plug and play.

If not, we'll have to write our own serialization and deserialization, which, well, I don't wanna!

I think all relevant code is in Document.scala.

riedelcastro commented 9 years ago

While I think we can reduce the number of properties that need to be stored, so that writing our own serialization should be super simple, json4s has XML support!

(My dream data format would still be text/markdown only, with Scala cells as markdown code blocks and some extra properties where needed, maybe specified through HTML comments.)
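
As a rough sketch of the markdown-only idea above: code cells are fenced `scala` blocks, everything else is a markdown cell, and an HTML comment such as `<!-- hidden=true -->` attaches properties to the next cell. The `Cell` class, the `key=value` comment syntax, and the `hidden` property are all assumptions made for illustration; Moro's real model in Document.scala may look different.

```scala
import scala.collection.mutable.ListBuffer

object MarkdownNotebook {
  // Hypothetical cell model, not Moro's actual classes.
  final case class Cell(kind: String, props: Map[String, String], content: String)

  private val fence = "```"
  private val PropLine = """<!--\s*(.*?)\s*-->""".r

  // Parse "<!-- k1=v1 k2=v2 -->" into a map; None if the line is not a comment.
  def parseProps(line: String): Option[Map[String, String]] = line.trim match {
    case PropLine(body) =>
      Some(body.split("\\s+").filter(_.contains("=")).map { kv =>
        val Array(k, v) = kv.split("=", 2)
        k -> v
      }.toMap)
    case _ => None
  }

  def parse(text: String): List[Cell] = {
    val cells = ListBuffer.empty[Cell]
    val buffer = ListBuffer.empty[String]
    var pendingProps = Map.empty[String, String] // applies to the next emitted cell
    var inCode = false

    // Emit the buffered lines as one cell; blank buffers emit nothing
    // and leave the pending properties in place.
    def flush(kind: String): Unit = {
      val raw = buffer.mkString("\n")
      val content = if (kind == "markdown") raw.trim else raw
      if (content.trim.nonEmpty) {
        cells += Cell(kind, pendingProps, content)
        pendingProps = Map.empty
      }
      buffer.clear()
    }

    text.split("\n", -1).foreach { line =>
      if (inCode) {
        if (line.trim == fence) { flush("scala"); inCode = false }
        else buffer += line
      } else if (line.trim == fence + "scala") {
        flush("markdown")
        inCode = true
      } else parseProps(line) match {
        case Some(p) => flush("markdown"); pendingProps = p
        case None    => buffer += line
      }
    }
    flush(if (inCode) "scala" else "markdown")
    cells.toList
  }

  val sample: String = List(
    "# Intro",
    "Some explanation.",
    "",
    "<!-- hidden=true -->",
    fence + "scala",
    "val x = 1",
    "println(x)",
    fence,
    "Closing remarks."
  ).mkString("\n")

  val cells: List[Cell] = parse(sample)
}
```

In this layout every line of a cell is a physical line in the file, so ordinary line-based merges apply to cell contents directly.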

sameersingh commented 9 years ago

I can try out the json4s XML serialization, if that's good enough. XML is not very readable, though. A better option might be HOCON, but we'd have to write our own converters for the objects, which will be a pain.

I would be okay with a direct text/markdown format as well.


riedelcastro commented 9 years ago

I don't think XML will be a problem here, because in most cases it's the cell content that will dominate the file. The only thing that could be improved is using XML attributes for cell attributes, as opposed to child elements, but that may not be possible with json4s.

HOCON would be great. Is there an automatic serializer for it?
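
For the XML option discussed above, a hand-written emitter could put cell properties in attributes and keep the cell content as literal text. The sketch below uses only the Scala standard library; the `Cell` class, the `cell` element name, and the `hidden` property are hypothetical, and this is not what json4s produces.

```scala
object XmlCellSketch {
  // Hypothetical cell model, not Moro's actual classes.
  final case class Cell(kind: String, props: Map[String, String], content: String)

  // Escape characters that are unsafe inside a double-quoted attribute value.
  def escapeAttr(s: String): String =
    s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;").replace("\"", "&quot;")

  // Wrap text in CDATA so newlines stay literal; split any "]]>" so it
  // cannot terminate the section early.
  def cdata(s: String): String =
    "<![CDATA[" + s.replace("]]>", "]]]]><![CDATA[>") + "]]>"

  // Properties become attributes (sorted for stable diffs); property names
  // are assumed to be valid XML names and are not checked here.
  def toXml(cell: Cell): String = {
    val attrs = (("kind" -> cell.kind) +: cell.props.toSeq.sortBy(_._1))
      .map { case (k, v) => k + "=\"" + escapeAttr(v) + "\"" }
      .mkString(" ")
    "<cell " + attrs + ">" + cdata(cell.content) + "</cell>"
  }

  val example: String =
    toXml(Cell("scala", Map("hidden" -> "true"), "val x = 1\nprintln(x)"))
}
```

The cell body keeps its real line breaks inside the CDATA section, so a line-based merge sees each line of code separately, while the metadata stays compact on the opening tag.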

sameersingh commented 9 years ago

No, there is no automated I/O for HOCON, but it should be easier than writing one from scratch.
