apigee-127 / sway

A library that simplifies OpenAPI (fka Swagger) integrations/tooling.
MIT License
190 stars 92 forks source link

Ultra-light version for validation purposes only #68

Closed zanona closed 6 years ago

zanona commented 8 years ago

Hey there, I am currently using sway to validate server requests against swagger definitions which is really great. However, I have noticed that at the moment sway is quite slow due to the fact it is currently using dependencies such as json-shcema-faker having a massive amount of information embedded into the library, while that is great for generating data, it might not be as efficient for those who only need to validate swagger definitions, so I would like to ask if there is any way to reduce sway to an optional lighter version that would provide best performance results?

Thank you in advance

whitlockjc commented 8 years ago

Sway's size shouldn't be an issue on the node.js side of things. We've had concerns with Sway's performance before and most of it was because Sway fully resolves a Swagger document so if you have a lot of nested references and circular references and such, you can have a gargantuan document pretty easily. The easiest way to test this:

console.log(JSON.stringify(sway.documentFullyResolved, null, 2));

This will tell you how many lines the fully resolved Swagger document is.

If you have a document that can reproduce this, I would love to spend some time doing some performance analysis. I am sure size could have something to do with slowness but I wouldn't expect it in node.js land because none of these larger, slower dependencies are in the chain for validation.

I realize this document might be sensitive and if that is the case, I can sign an NDA and we can do this offline or whatever is necessary. I'd love to fix this.

zanona commented 8 years ago

Thanks @whitlockjc, I am not sure if this is the expected loading time, perhaps I might being too picky on this. please let me know if that is normal:

test.js (without sway)

console.time('no-sway');
console.timeEnd('no-sway');

output

sh-3.2$ node test
no-sway: 0.891ms

test.js (with sway)

console.time('sway');
require('sway');
console.timeEnd('sway');

output

sh-3.2$ node test
sway: 483.111ms

Meaning that by just adding the library, without loading any definition, there is a delay of 482ms?

whitlockjc commented 8 years ago

Any way you can tell me the output of the following:

To run this, npm install -g json-refs. Also, if your input is YAML, add the --yaml flag to each json-refs command prior to piping to wc.

This will help me start the process of figuring out what is going on. Behind the scenes, Sway does the following to process a Swagger document:

At one point, I started the process of fixing performance concerns before they were reported. Since you've reported them, I think some of the enhancements I had started but not finished will likely be necessary but I really need to have an idea of what I'm working with.

zanona commented 8 years ago

Actually what I have mentioned on my last response is what is taking more time in my opinion, not the time for parsing the definition itself, but instead the time it takes just for requiring sway which is approx. half second? So if you imagine on the server where I am validating incoming requests against the definition, half second is spent only requiring sway? the parsing/validation process is actually pretty quick so I think it would be ok in this aspect?

whitlockjc commented 8 years ago

Oh, gotcha. I didn't notice that. So just a require('sway') takes half a second? Holy crap. I guess that would be due to the size. I can think of two large libraries that could be the cause: js-yaml and json-schema-faker. I will look into it further.

zanona commented 8 years ago

Yes! :D Thanks Jeremy. That was my point about json-schema-faker, js-yaml seems pretty good and performant actually. but the schema faker has a lot of additional fake data as you can expect :) hopefully there's a way to only require those when necessary or create a separate more performant version of sway perhaps?

Thanks again

whitlockjc commented 8 years ago

I'll figure it out. json-schema-faker has caused a number of contention issue for me, like the bloat added to my browser builds. But rebuilding a mock library is something I didn't want to do. Maybe it's time to do that...

zanona commented 8 years ago

Thanks Jeremy, I might be able to help since this is something I will be implementing on my current project. Could you please let me know which methods the API rely on json-schema-faker? In fact I wasn't really sure what was the dependency reason for validating definitions since it wouldn't need to mock data? perhaps its being used to create new operations instead? thanks again On Mon, 14 Mar 2016 at 17:44, Jeremy Whitlock notifications@github.com wrote:

I'll figure it out. json-schema-faker has caused a number of contention issue for me, like the bloat added to my browser builds. But rebuilding a mock library is something I didn't want to do. Maybe it's time to do that...

— Reply to this email directly or view it on GitHub http://mandrillapp.com/track/click/30070164/github.com?p=eyJzIjoiQWFXd19RWGh0bXU0S1dVOHZVQ1diLV9QZ1JZIiwidiI6MSwicCI6IntcInVcIjozMDA3MDE2NCxcInZcIjoxLFwidXJsXCI6XCJodHRwczpcXFwvXFxcL2dpdGh1Yi5jb21cXFwvYXBpZ2VlLTEyN1xcXC9zd2F5XFxcL2lzc3Vlc1xcXC82OCNpc3N1ZWNvbW1lbnQtMTk2NDMwMDM5XCIsXCJpZFwiOlwiOWFjY2IzYTYwNmFlNDVlMjhkNWYzMTM3MzMwNDM3MmRcIixcInVybF9pZHNcIjpbXCI5YTNjODcwYTZjNTQ3ZjQwZjNkMTI0Mjk1YjZkM2MwOTg5ODk5ZjA0XCJdfSJ9.

whitlockjc commented 8 years ago

Well, Sway isn't just a validation API. I could break the validation out, and I might end up doing that for https://github.com/whitlockjc/swagger-lint, but json-schema-faker is for mocking data. There are a number of places where we do this:

swagger-tools had this feature but it was locked up in the swagger-router middleware and it was a homegrown feature.

As for why json-schema-faker is so bulky, it's because it has a ton of sample mock data that is loaded at runtime. We've already tried to make things cheaper by loading only the en locale but apparently that isn't enough.

IvanGoncharov commented 8 years ago

@whitlockjc I also use only Swagger validation functionality in api-spec-converter and this lib is used in a browser. So I pay a price for functionality that I will never use.

I would suggest creating separate lib just for validation of Swagger(without external $refs). This will keep lib small, synchronous and will give you control over how you resolve external refs. Also, it will result in faster release cycles.

whitlockjc commented 8 years ago

That's likely what we'll do. Right now I have started a project called swagger-lint and pending a rename, we could move the validation parts of sway into it and use that as the lib for simple validation.

I apologize for the longer release cycles but I'm not working much on Swagger tooling at work these days. If anyone has time to help, I'd greatly appreciate it.

IvanGoncharov commented 8 years ago

I imagine that swagger-lint needs to handle external references and also with YAML specs so it will include both json-refs and js-yaml. And potentially other libraries. Right?

In my case, I can't convert spec without resolve external references so the output is always self-contained. I think many other tools already have everything to work with swagger(parsing, resolving, ...) and they just need validation.

If we can extract minimal version of validation library, just validation code without anything else. With one function which accepts only parsed Swagger in the form of JS object(no filenames or strings). And return back an object with warnings and errors attributes. All errors a handled through exceptions.

@whitlockjc What do you think?

I apologize for the longer release cycles but I'm not working much on Swagger tooling at work these days. If anyone has time to help, I'd greatly appreciate it.

This functionality is very important for me. So I can extract code into the library that I described above and support it in a long run. Similar to swagger-converter, just create an empty repo under your's account. And I will start submitting PRs to it.

whitlockjc commented 8 years ago

So we want another library that is just validation, no loading/reference dereferencing/..., and then sway would use that? In all honesty, the size of json-refs and js-yaml is minimal, especially to the problem originally brought up relating to json-schema-faker so yet another library seems painful but if it's required, I'm fine with it.

Before we create any new repository, please let me check with Apigee first. Sway is under their organization right now and I need to make sure I can break code out and if so, how we need to do it. I'm sure it's fine but it will likely have an impact on where it's created. I'll see if I can get sway and this new repository moved to whitlockjc so that we don't run into these issues in the future.

IvanGoncharov commented 8 years ago

You right json-refs is small but it required async API. People use different approaches for resolving refs, for example swagger-parser use it's own but it also implement validation: https://github.com/BigstickCarpet/swagger-parser/blob/master/lib/validate-spec.js And it has checks not supported in sway, so people who use my API collection report validation issues from swagger-parser.

So I need to integrate both validators into my CI and, more importantly, keep them in sync with each other. I have nightmares about some 3rd validator became popular :(

IMHO, I think the simple library is the more adoption it will get as part of 3rd-party party libs/tools. I will add missing checks into this minimal lib and will try to convince swagger-parser author to use it. Moreover, all tools that use Swagger just as output format don't require any parsing or resolving.

I'm sure it's fine but it will likely have an impact on where it's created.

Any account will work for me.

If we can extract minimal version of validation library, just validation code without anything else. With one function which accepts only parsed Swagger in the form of JS object(no filenames or strings). And return back an object with warnings and errors attributes. All errors a handled through exceptions.

@zanona What do you think? Does it meet your requirements?

whitlockjc commented 8 years ago

json-refs uses an async API for async operations, there's no way around that. Muddying the API to solve the perceived need to do things sync doesn't make sense to me because every API that does resolution would require some *Sync equivalent and that's just ugly. As for swagger-parser having validations sway doesn't, I'd be interested in hearing about them but just know, the validations Sway does is based on Swagger's specification and is not based on opinion.

I still don't know why swagger-parser did its own thing when swagger-tools was already out. While I understand competition is good, I struggle to see how anyone benefits from reinventing the wheel and not contributing to something to make it better. But I gave up trying to understand that a while back. ;)

In the end, I just wanted to provide a useful but simple object model and API around Swagger and its validation. Anything you think we can do to make that better is well received.

IvanGoncharov commented 8 years ago

json-refs uses an async API for async operations, there's no way around that.

I can't agree more but only if work with files which can contain external refs. But if you know for sure that you don't have any external refs at all, an async interface is unnecessary complicates code.

See validate function in this example: https://github.com/lucybot/api-spec-converter#advanced-features api-spec-converter always resolve all external refs, without it you can't convert between formats. But users of this lib forced to use to use callback or promise in order to do validation.

As for swagger-parser having validations sway doesn't, I'd be interested in hearing about them but just know, the validations Sway does is based on Swagger's specification and is not based on opinion.

I totally agree that it should stay like that. In particular situation that I experienced it was this check:

If type is "file", the consumes MUST be either "multipart/form-data", " application/x-www-form-urlencoded" or both and the parameter MUST be in "formData".

And I can think of other useful checks for example, check that enum values correct according to the type definition. Since it very frequent mistake that people add null as one of the enum values. https://github.com/swagger-api/swagger-js/issues/507#issuecomment-115322916

But this issue is not about that. An important thing is I want to contribute such PRs but preferably only into one tool.

I still don't know why swagger-parser did its own thing when swagger-tools was already out. While I understand competition is good, I struggle to see how anyone benefits from reinventing the wheel and not contributing to something to make it better. But I gave up trying to understand that a while back. ;)

It's real world people always have their own motives. But important things is to find points where you can collaborate with them and create win-win scenarios.

whitlockjc commented 8 years ago

The missing validations:

  1. Sway should be doing that but maybe I didn't port that over from swagger-tools
  2. swagger-tools use to do that as well but it wasn't ported over as I figured z-schema would handle it but if it's not, then it's likely due to Swagger not really using full JSON Schema and we should fix this
whitlockjc commented 8 years ago

If you know they're missing, file bugs and we'll get it sorted for sure.

IvanGoncharov commented 8 years ago

I believe that current approach is too complicated you need to traverse JSON Schemas with your own code. It's very error-prone and also duplicate things which z-schema already do. So I'm working on https://github.com/zaggino/z-schema/issues/114 As a nice side effect, it will improve performance because Swagger file will be traversed only one time by z-schema. For example, you want to validate parameter you need to check 3 places: parameters on top level on an individual path and inside each method. Or just bind custom validator to #/definitions/parameter.

But this beside the point of this issue, important thing that I want to ensure interoperability for all my clients. That mean I want to contribute my work into a library which is widely used. That was my motives for investing a huge amount of time into swagger-converter and I already see benefits of that decision. And I want to do the same for validation library since my entire project based on idea of interoperability of Swagger files.

IvanGoncharov commented 8 years ago

Hi @whitlockjc Do you have any updates?

whitlockjc commented 8 years ago

No, I'll look at it this weekend. Day job has kept me pretty busy the past few weeks.

IvanGoncharov commented 8 years ago

@whitlockjc I can help. Ping me when you create a repo for it. It would be superb if you setup Gitter for this new repo, so we could discuss things over there.

seriousme commented 8 years ago

I have been playing around with swagger-js and I thought that the 1.5 MB uncrompressed was large, but loading 1.1 Mb compressed in a (mobile) browser to be able to call a swagger service is way to much.

Having a swiss army knife with 50 tricks is nice, but when you need to peel potato's its not really convenient. If sway is to be the official JS library then something more modular would be much appreciated.

A number of features, like mock generation, are brilliant to devs, but won't be used in production. So being able to drop that weight would be a big plus imho.

Btw: https://github.com/signalfx/swagger-client-generator does it in 17Kb minified including client side validation, however that only supports swagger 1.2 :-(

whitlockjc commented 8 years ago

I will be working on this starting today. Once I have the repository setup, I will alert you all and I'll make sure Gitter is enabled so we can collaborate. Thanks for your patience.

IvanGoncharov commented 8 years ago

@whitlockjc Thank you for the update. Yes, it would be great.

IvanGoncharov commented 8 years ago

@whitlockjc Any progress on this one?

whitlockjc commented 8 years ago

I started the work yesterday.

whitlockjc commented 8 years ago

Here is the repo: https://github.com/whitlockjc/oas-validator I've not put anything into it yet but just know I'm working on it. I plan on releasing Sway today, v1.0.0, and after that I will migrate sway to the oas-validator. Feel free to watch that project as well as this issue for updates.

whitlockjc commented 8 years ago

sway@v1.0.0 has been released. I will be working on breaking the validation support out into its own library linked to above.

whitlockjc commented 6 years ago

I'm going to suggest we use #187 to keep track of this for now on. The idea will be to break sway up into smaller, more specific components.