raml-org / raml-js-parser-2

(deprecated)

Parsing and validating are slow #729

Closed kevinrenskers closed 7 years ago

kevinrenskers commented 7 years ago

Please see https://github.com/raml2html/raml2html/issues/345 for the original issue and my findings. Basically, @tylercloke has a set of RAML docs that take about 4 minutes to parse in raml2html.

3 of those minutes are spent in raml.loadApi('doc.raml', { rejectOnErrors: true }), and 50 seconds are spent in result.expand(true).toJSON({ serializeMetadata: false }).

I could turn off the validation and shave off 3 minutes, but those 50 seconds are still a problem.

Just to make sure, I checked the new load method, which does the expand and toJSON steps behind the scenes, but that also takes about 50 seconds.

kevinrenskers commented 7 years ago

The numbers above were slightly off. Here are more details:


// raml-js-parser-2 is published on npm as raml-1-parser
const raml = require('raml-1-parser');

console.time('Parsing RAML');
raml.loadApi('delivery-api.yml').then(result => {
  console.timeEnd('Parsing RAML');
});

Most basic version, 0.1 seconds.


console.time('Parsing RAML');
raml.loadApi('delivery-api.yml').then(result => {
  result.expand(true).toJSON({ serializeMetadata: false });
  console.timeEnd('Parsing RAML');
});

Add expand and toJSON, now it goes up to 60 seconds.


console.time('Parsing RAML');
raml.loadApi('delivery-api.yml', { rejectOnErrors: true }).then(result => {
  console.timeEnd('Parsing RAML');
});

Just adding validation brings it to 165 seconds.


console.time('Parsing RAML');
raml.loadApi('delivery-api.yml', { rejectOnErrors: true }).then(result => {
  result.expand(true).toJSON({ serializeMetadata: false });
  console.timeEnd('Parsing RAML');
});

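Validation + expand + toJSON "only" totals 175 seconds. This is a bit weird to me: based on the previous tests (165 seconds with validation, plus roughly 60 seconds for expand and toJSON) I would have expected something closer to 225 seconds.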
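The single 'Parsing RAML' timer in the tests above lumps every step together, which makes it hard to see where the time goes. A minimal sketch of timing the synchronous expand and toJSON steps separately is shown below. Note that timeSync is a hypothetical helper and fakeApi is a stand-in for the object resolved by loadApi, so that the snippet runs without the parser; neither is part of raml-1-parser.

```javascript
const { performance } = require('perf_hooks');

// Hypothetical helper (not part of raml-1-parser): runs fn synchronously,
// stores the elapsed milliseconds under `label` in `timings`,
// and returns whatever fn returned.
function timeSync(label, fn, timings) {
  const start = performance.now();
  const value = fn();
  timings[label] = performance.now() - start;
  return value;
}

// Stand-in for the object resolved by loadApi (an assumption for this sketch);
// it only mimics the expand(...).toJSON(...) call chain used in the tests above.
const fakeApi = {
  expand: () => ({ toJSON: () => ({ title: 'demo' }) }),
};

const timings = {};
const expanded = timeSync('expand', () => fakeApi.expand(true), timings);
const json = timeSync(
  'toJSON',
  () => expanded.toJSON({ serializeMetadata: false }),
  timings
);

// One entry per step, in milliseconds
console.log(timings);
```

With the real parser, the same two timeSync calls would go inside the then callback of raml.loadApi, using result in place of fakeApi.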

kevinrenskers commented 7 years ago

I also tried the brand new load method.

console.time('Parsing RAML');
raml.load('delivery-api.yml').then(result => {
  console.timeEnd('Parsing RAML');
});

That takes 60 seconds.

But if you add validation:

console.time('Parsing RAML');
raml.load('delivery-api.yml', { rejectOnErrors: true }).then(result => {
  console.timeEnd('Parsing RAML');
});

It only takes 5 seconds more, whereas with loadApi it adds almost 3 minutes. It makes me think that load doesn't actually do any validation even with rejectOnErrors?

tylercloke commented 7 years ago

I have a dataset to test on. Let me know and I can email it to the author.

ddenisenko commented 7 years ago

@kevinrenskers you can test that load validates by introducing some errors into the RAML and checking the errors property of the JSON returned by the load method.

Yes, the load method is faster than the old loadApi->expand->toJSON sequence, but not that much faster.

The reason your loadApi tests show several times worse performance than the load method is that, by adding the rejectOnErrors: true flag, you force the parser to validate before full expansion and JSON transformation have finished. In other words, the parser in your case performs the same procedures several times. Please turn off the flag and instead check the errors in the resulting JSON. By the way, you cannot turn off validation: rejectOnErrors only controls whether validation is forced during the initial loadApi (or similar) call, with any errors causing immediate rejection. It does not control whether validation is performed at all.
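A minimal sketch of that approach, checking the errors property of the resulting JSON instead of passing rejectOnErrors: true. Note that collectErrors is a hypothetical helper, and the sample object (including the shape of its error entry) is an assumption standing in for the JSON that load returns, so that the snippet runs without the parser.

```javascript
// Hypothetical helper (not part of raml-1-parser): returns the `errors` array
// of the JSON produced by load, or an empty array when there is none.
function collectErrors(json) {
  return json && Array.isArray(json.errors) ? json.errors : [];
}

// Stand-in for the JSON returned by load; the fields of each error entry
// are an assumption made for illustration.
const sample = { errors: [{ message: 'Unknown node: foo' }] };

const errors = collectErrors(sample);
if (errors.length > 0) {
  console.log(`${errors.length} validation issue(s) found`);
}
```

With the real parser this would presumably look like raml.load('delivery-api.yml').then(json => collectErrors(json)), with no rejectOnErrors flag.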

@tylercloke Did I get it right that you do not want to share the set in the ticket and want to email it instead? @sichvoge could you please provide the email?

sichvoge commented 7 years ago

@tylercloke you can send me an email at christian (dot) vogel (at) mulesoft (dot) com

tylercloke commented 7 years ago

@sichvoge Sent! Thanks

sichvoge commented 7 years ago

Thanks @tylercloke. @ddenisenko I have shared it with you.

sichvoge commented 7 years ago

@KonstantinSviridov @ddenisenko can you explain a little about what you improved and maybe share some new metrics, please?

KonstantinSviridov commented 7 years ago

@sichvoge There was a performance gap that was only visible in particular cases, like the one we are dealing with: multiple large files forming long !include chains.

Processing the example now takes about 5 seconds for me, whereas I started with more than a minute.