nodeca / js-yaml

JavaScript YAML parser and dumper. Very fast.
http://nodeca.github.io/js-yaml/
MIT License
6.22k stars 765 forks

Can't dump non-plain objects #132

Open honzajavorek opened 10 years ago

honzajavorek commented 10 years ago

I need to convert "almost plain" objects to YAML. By "almost plain" I mean they're just data containers as if they were created by literals, but with different names. E.g. output of the protagonist library.

However, js-yaml fails on those with unacceptable kind of an object to dump [object Result]. There could be an option to treat such objects as plain ones (json2yaml does this by default).

Workaround for this is to do

var yamlResult = yaml.safeDump(JSON.parse(JSON.stringify(result)));

...but that's not very efficient.
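protagonist's Result is a host object whose toString tag is [object Result], which js-yaml's dumper rejects; the JSON round-trip replaces it with plain objects. A small sketch of what the workaround actually does (Result below is a hypothetical stand-in class, not protagonist's real one):

```javascript
// Hypothetical stand-in for protagonist's output class.
class Result {
  constructor(ast) { this.ast = ast; }
}

const result = new Result({ name: 'My API' });

// The round-trip serializes only enumerable own properties, so the
// resulting value is a plain object with the same data but no prototype
// chain back to Result.
const plain = JSON.parse(JSON.stringify(result));

console.log(plain instanceof Result);                           // false
console.log(Object.getPrototypeOf(plain) === Object.prototype); // true
console.log(plain.ast.name);                                    // 'My API'
```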

puzrin commented 10 years ago

I'm not sure that anything will get better if this feature is added. For custom types, JSON.parse(JSON.stringify(...)) is the simplest and fastest way to get a plain object.

honzajavorek commented 10 years ago

But isn't that unnecessarily memory-consuming for really big (multi-megabyte) documents?

puzrin commented 10 years ago

Are you really serious about memory consumption :) ? I don't like adding features for hypothetical reasons.

honzajavorek commented 10 years ago

I am building an API for http://apiblueprint.org/ (let's say it's a more production-ready version of the Examples section on that site). People can send very, very big blueprints - a real story, we are experiencing this on http://apiary.io :-) I need to convert parsed ASTs into YAML, and I am currently using js-yaml for this job, but with the JSON.parse/JSON.stringify workaround I am afraid it's not bulletproof enough for our future needs.

dervus commented 10 years ago

For non-plain objects you should define custom YAML types. We have no proper documentation for this part of the API yet, but there is a good example: https://github.com/nodeca/js-yaml/blob/master/examples/custom_types.js

puzrin commented 10 years ago

If you care about very big objects and speed, but only need plain types, just write your own dumper and hardcode your own settings. Seriously. That's a somewhat different task.
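A hand-rolled dumper along those lines can be quite small when it only has to support plain data. A sketch (my own code, not part of js-yaml; it handles scalars, arrays, and plain objects, with no anchors, key escaping, or multi-line strings):

```javascript
// true for values that can be emitted inline (numbers, strings, booleans, null)
function isScalar(v) {
  return v === null || typeof v !== 'object';
}

// Minimal block-style YAML dumper for plain data only.
function dumpPlain(value, indent = 0) {
  const pad = '  '.repeat(indent);
  if (isScalar(value)) return JSON.stringify(value);
  if (Array.isArray(value)) {
    if (value.length === 0) return '[]';
    return value
      .map(item => isScalar(item)
        ? pad + '- ' + JSON.stringify(item)
        : pad + '-\n' + dumpPlain(item, indent + 1))
      .join('\n');
  }
  const keys = Object.keys(value);
  if (keys.length === 0) return '{}';
  return keys
    .map(key => {
      const v = value[key];
      if (isScalar(v)) return pad + key + ': ' + JSON.stringify(v);
      const body = dumpPlain(v, indent + 1);
      return body === '[]' || body === '{}'
        ? pad + key + ': ' + body          // empty containers stay inline
        : pad + key + ':\n' + body;        // nested block on the next line
    })
    .join('\n');
}

console.log(dumpPlain({ name: 'My API', tags: ['a', 'b'], meta: { version: 1 } }));
// name: "My API"
// tags:
//   - "a"
//   - "b"
// meta:
//   version: 1
```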

honzajavorek commented 10 years ago

Okay, thanks for pointing me to the example. The docs are very brief on this. The important takeaway for me is that it's possible to avoid JSON.parse/JSON.stringify in the future by writing a custom dumper, using the example you linked as a reference :)

dervus commented 10 years ago

Also, for your case:

var MyType = new yaml.Type('!result', { kind: 'mapping', instanceOf: Result });
var MY_SCHEMA = yaml.Schema.create([ MyType ]);
var dump = yaml.dump(data, { schema: MY_SCHEMA });

It will give JS-YAML all the information to properly load/dump your Result objects without recreating them in such a hacky way.
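For dumping, a custom type typically also carries a represent callback (the linked custom_types.js example uses one) that maps the instance to a plain object the dumper already knows how to serialize. A sketch of what such a callback does, shown independently of js-yaml (Result is a hypothetical stand-in for protagonist's output class):

```javascript
// Hypothetical stand-in for protagonist's output class.
class Result {
  constructor(ast, warnings) {
    this.ast = ast;
    this.warnings = warnings;
  }
}

// The kind of function passed as `represent` on a yaml.Type: it copies
// only the data fields and returns a plain mapping.
function representResult(result) {
  return { ast: result.ast, warnings: result.warnings };
}

const plain = representResult(new Result({ name: 'My API' }, []));
console.log(plain.constructor === Object); // true
console.log(plain.ast.name);               // 'My API'
```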

honzajavorek commented 10 years ago

Whoa, that looks a lot simpler (I noticed the previous example did some more complicated manipulations as a showcase, so it was rather complex). This is great! Thanks!

dervus commented 10 years ago

Ah, one more remark. If you're building a web API, you should probably create the schema like so:

var MY_SCHEMA = yaml.Schema.create(yaml.DEFAULT_SAFE_SCHEMA, [ MyType ]);

so that your schema inherits from the safe one. DEFAULT_SAFE_SCHEMA is the schema used by the safeLoad and safeDump functions.

puzrin commented 10 years ago

@honzajavorek if possible, please publish a few examples of the very big dumps that can occur in your project. We are about to add a circular reference check, and I'd like to test the performance impact of different strategies on something real.

honzajavorek commented 9 years ago

@puzrin I can't disclose the real data, but I guess I could generate something close to reality.

puzrin commented 9 years ago

@honzajavorek That would help. I need a typical big file with structures similar to real-life ones. The actual values don't matter.

honzajavorek commented 9 years ago

@puzrin I think I found a way I can provide something possibly useful.

size:>100000 path:/apiary.apib is a GitHub search query which lists relatively large API Blueprints that are public. We have larger ones, but I can't disclose those. You could easily get much larger API Blueprints by joining them (dropping the headers of the latter ones) or just copy-pasting their internals so they get bigger. I don't think that would affect how realistic they are.

Having such blueprints, you can then use protagonist to convert them into JavaScript objects, and then you can easily feed js-yaml with the result to get YAML output. Let's say it could look roughly like this (CoffeeScript pseudocode):

protagonist = require 'protagonist'
yaml = require 'js-yaml'

protagonist.parse blueprintCode, (err, result) ->
  if err
    ...
  else
    yaml.safeDump JSON.parse JSON.stringify result

...where blueprintCode contains a string with the blueprint's Markdown. It's actually exactly what I did for the API - see this and this.

puzrin commented 9 years ago

@honzajavorek I'd prefer to get a ready-made result without having to think about how to compose it :)

honzajavorek commented 9 years ago

@puzrin So I finally found some time to create it for you. Here it is: https://github.com/honzajavorek/large-blueprint-yaml, especially the blueprint.yaml file.

dervus commented 9 years ago

Thanks for it!

Also, I just realized that the answer I gave you was wrong. :-} There is currently no way to dump custom host objects without reconstructing them into regular objects.
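Until that is supported, the reconstruction can at least be done in a single recursive pass, which avoids materializing the intermediate JSON string of the JSON.parse(JSON.stringify(...)) workaround. A sketch (the function name is mine; it does not handle cycles, Dates, or other special types):

```javascript
// Recursively copy any object graph into plain objects and arrays
// without serializing to a JSON string first.
function toPlain(value) {
  if (value === null || typeof value !== 'object') return value;
  if (Array.isArray(value)) return value.map(toPlain);
  const plain = {};
  for (const key of Object.keys(value)) {
    plain[key] = toPlain(value[key]);
  }
  return plain;
}

// Hypothetical stand-in for protagonist's output class.
class Result {
  constructor(ast) { this.ast = ast; }
}

const plain = toPlain(new Result({ name: 'My API' }));
console.log(plain.constructor === Object); // true
console.log(plain.ast.name);               // 'My API'
```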