Thanks for the note on Teams. I will investigate further. I've been trying to set up another channel so people can ask questions without necessarily having to post an issue.
I've also seen some people trying to handle security with API versions and I agree that it's the wrong way to go. I suggest that your APIs first enforce strict security for unauthorized access. You likely already have that. The second part is security by obscurity. In other words, you want to filter down your Swagger documents so that APIs that are not available are not visible. This is not a security measure, but a convenience. Your security provisioning should be the thing that prevents clients from using APIs they are not supposed to. The URLs or happenstance discovery of an API need not be secured IMO.
To achieve this filtering, you simply need to trim down the Swagger documents based on the calling client. How that is implemented depends on the Swagger framework you are using. If you're using Swashbuckle, I believe this would be achieved with a custom ISwaggerDocument. You should be able to use the descriptions provided by the API Explorer to determine which actions (i.e. APIs) should be included or excluded. The test criteria are up to you, but they can be based on the AuthorizeAttribute, another type of metadata, or even some type of custom mapping.
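To make that concrete, here is a minimal sketch, assuming classic ASP.NET Web API and the AuthorizeAttribute-based criteria mentioned above, of the kind of visibility test you could run against each `ApiDescription` before emitting it into the document. How you plug this in (document filter, custom provider, etc.) depends on the Swagger library you use:

```csharp
using System.Linq;
using System.Security.Principal;
using System.Web.Http;
using System.Web.Http.Description;

public static class SwaggerVisibility
{
    // Decides whether a discovered action should appear in the Swagger document
    // for the calling user, based on the AuthorizeAttribute applied to the action.
    public static bool IsVisibleTo(ApiDescription description, IPrincipal caller)
    {
        var authorize = description.ActionDescriptor
                                   .GetCustomAttributes<AuthorizeAttribute>()
                                   .FirstOrDefault();

        if (authorize == null)
        {
            return true; // anonymous actions are always visible
        }

        if (string.IsNullOrWhiteSpace(authorize.Roles))
        {
            return caller.Identity.IsAuthenticated;
        }

        var roles = authorize.Roles.Split(',').Select(r => r.Trim());
        return roles.Any(caller.IsInRole);
    }
}
```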
Hopefully, that starts you on the right path.
@commonsensesoftware Hello and thanks for the reply.
One item to note is that the API Explorer takes a lot of time when I call it to get the API descriptions collection to generate the Swagger. I have not timed it precisely, but we have a mixture of Web API 2 and OData 4 that totals about 640 methods, and I can see in Visual Studio that the Swagger creation is very fast while the call to _apiExplorer.ApiDescriptions took all the time, like minutes of time! Most of that was due to our having a large number of OData methods for a large data model. If I comment out the OData config.AddODataApiExplorer(); then the time is instant (running the 3.0 beta).
Security: right now it seems like securing the Swagger JSON is not considered by the Swagger-generating projects I have seen. One comment I got was that adding security would break tools that generate client code because they do not have support to handle it. I think the OpenAPI spec should be updated to address this up front: serving up the JSON may require user authorization / authentication as part of the flow, and the UI may have the same need.
Authorize attribute: yes, I have a custom version of that, and I am in the process of working out how to unify my current use (when the API is called by a client) with the Swagger generator. I am also working out how we will do an API-key style of call to the services and whether I map an API-key caller to a kind of "User" in order to map roles / claims to it.
OData: of the 3 Swagger generators I have seen (Swashbuckle, NSwag, and Swagger-Net), only Swashbuckle has had an OData option. I am working on a copy of Swagger-Net to add OData to it by using the ApiExplorer. It seems like most folks do not use OData and so they do not even try to support it. It would be great if more folks would support it, but I wonder if OData is going to just be an edge case and never really get much use?
Lastly: community. I have been around for a while; I started back in the days of VB 3/4, Microsoft QuickC, and all the old stuff, and I used to be a part of Channel9.msdn.com back when it had forums. Today what I see is that Microsoft now has, in effect, 10,001 different places to post things and no unified way to deal with them. I have seen the new GitHub / NuGet packages as both a blessing and a curse: sometimes package updates have breaking-change impacts that teams are not aware of, and users get stuck, unable to take new packages due to the dependency chains of all the packages we use. Monolithic packages like the old way of doing .NET have issues, but the new model also has issues; I am not sure which is really better. I wonder if anyone at Microsoft is working with all the teams to do some cross-checking on the fragmented landscape we now have.
The API Explorer design in Web API is really substandard. I effectively had to reimplement the whole thing, which was painful in and of itself because so many of the original pieces rely on internal types and methods. I'm not sure what the right balance is for dynamically generating the Swagger document for a large API surface. The API surface should be consistent between versions. An alternative approach could be to use the API Explorer with a Swagger generator to create the document up front and then use a custom generator that merely serves up the document from disk or a data store. Then you only pay the cost at generation time. We saw the same type of problem generating large DISCOs and WSDLs in the ol' SOAP days.
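As a rough sketch of that serve-from-disk idea, assuming classic Web API and a document that was generated at build/deploy time and dropped into App_Data (the path and route below are hypothetical):

```csharp
using System.IO;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Web.Hosting;
using System.Web.Http;

[RoutePrefix("swagger/docs")]
public class SwaggerDocsController : ApiController
{
    // Serves a Swagger document that was pre-generated at build/deploy time,
    // so no API Explorer or generator cost is paid per request.
    [HttpGet, Route("{apiVersion}")]
    public IHttpActionResult Get(string apiVersion)
    {
        var path = HostingEnvironment.MapPath($"~/App_Data/swagger-{apiVersion}.json");

        if (path == null || !File.Exists(path))
        {
            return NotFound();
        }

        var json = File.ReadAllText(path);
        var response = new HttpResponseMessage(HttpStatusCode.OK)
        {
            Content = new StringContent(json, Encoding.UTF8, "application/json"),
        };

        return ResponseMessage(response);
    }
}
```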
I don't profess to be a security expert, but I'm not sure what advertising the authn and authz mechanisms would buy a client. I find it unlikely they'll know what to do simply by looking at a Swagger document. I think this would present more risks than benefits, as you'd be advertising to a potential attacker exactly what authn or authz they need to make a successful attack. Certainly, filtering or trimming down a Swagger document makes sense, but that's purely a convenience. The Swagger document itself doesn't provide any security. If a client can figure out or otherwise discover an API endpoint, they can attempt to call it regardless. It can be argued that even filtering down the Swagger document doesn't buy you much because it's only security by obscurity; the API is what decides whether a client can access an endpoint. Perhaps there's more to your scenario that I don't fully understand.
Sounds like a reasonable approach.
The funny thing is that OData has had strong client generation since the beginning. The EDM is always accessible via the $metadata endpoint and can be used to generate any type of client. I suspect the desire to support OData with Swagger has really been to get the other benefits like the UI and the Try It feature. I know I was initially left scratching my head, as I consider the EDM and Swagger document two different representations of the same thing. Something you might consider is the ODataSwaggerConverter. This actually converts the EDM to a Swagger document. I forget which version of the OData library this was added to, but even if you have to port it, it should be a relatively small task. Since API versioning already creates an EDM per API version, it should be straightforward to create a Swagger generator based on this approach. I suspect that will solve the performance problems of #1 too. You can probably even meet your security concerns or filter this way, albeit with some additional work. You might need to consider using annotations in the EDM to decide whether entities or parts of entities are published.
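If memory serves, the usage is roughly the following; treat the namespace and member names as assumptions to verify against the Web API OData version you are on:

```csharp
using Microsoft.OData.Edm;
using Newtonsoft.Json.Linq;
using System.Web.OData;   // assumption: the namespace ODataSwaggerConverter lives in varies by OData library version

public static class EdmSwagger
{
    // Converts the EDM for one API version straight into a Swagger 2.0 document,
    // bypassing the API Explorer entirely.
    public static JObject ToSwagger(IEdmModel model)
    {
        var converter = new ODataSwaggerConverter(model);
        return converter.GetSwaggerModel();
    }
}
```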
I don't really have answers here, but I have a few opinions. There is governance over released OSS bits, but it's more compliance than anything else. The big-ticket items like ASP.NET tend to be OK. They are owned and managed by a product team; they just happen to be doing all their work transparently on GitHub. The small-ticket items are good ways to get ideas out there so there is feedback, interest, and so on. These projects may never evolve into anything substantial. The enigma is more with projects like this one that start small and become big. I'm an army of one running this project, and while I engage with the ASP.NET team, I'm not part of their team nor have I ever been. While I would welcome help, being left to my own devices has allowed me to drive things the way I see fit (with community feedback, of course). The community response has been positive, so the powers that be seem to be fine with this arrangement. If there were a large amount of customer dissatisfaction, I suspect someone would step in, if only to make the community happy again. Beyond that, I don't know that anyone actively monitors anything. Compliance now ensures that official Microsoft repos are under the Microsoft organization on GitHub. Similarly, NuGet packages must be co-owned by the Microsoft organization. This helps the community identify official Microsoft artifacts and helps them track down where to find their sources.
Some info on the time taken to generate: it returned 717 elements and took 10:19.042. This weekend I might try to find some details on what took most of the time. The same set of APIs using an older Swagger setup is something like 30 seconds to possibly 2 minutes; I have not timed it before because it was not this slow.
And we are looking at static files and other options also...
Were you able to come up with any other information? When you say older Swagger, I assume this is another library or one that isn't versioned. It's certainly possible there are some inefficiencies in the discovery mechanism. The built-in support in Web API is very weak and virtually inextensible.
It's also possible that there is a lot of type generation going on behind the scenes. API versioning for OData supports automatic type substitution when the specified entity model type (e.g. class) doesn't match the EDM type. Swagger libraries like Swashbuckle rely on Reflection to determine which properties are available, which may not necessarily match the EDM. Type substitution generates a dynamic type on the fly that looks exactly the way the EDM says it should. If the specified type already matches the EDM type, then no substitution is necessary. There is obviously a cost to reflecting over whether the substitution is required, but I'd be surprised to see it take something like 10 minutes.
I just added a bunch of fixes and improvements to type substitution that will be available in 3.0.0-beta2. I expect to have that published by next week or sooner. That may afford some performance improvements for you.
@commonsensesoftware Sorry, I've been hip deep in my coding and testing...
Short history: we started the current project just under 3 years ago. When we started, I had Swashbuckle and the Microsoft packages from that time frame to start the code. Some of the updates had breaking changes in different packages, so we were slow to update the packages until just last month, when I was able to update them and be sure that I did not have dependency issues that broke things.
The web services are about 20-30% Web API and about 70% OData. All of the attribute routing is on the Web API controllers, and they are all under /api/ for the base route. All of the OData controllers are under /atrackapi/ and are convention routed, with no attribute routing.
The time I reported was measured with a Stopwatch right at the call to get the data from the API Explorer: Start(), the call for the collection, Stop(), then a Console.WriteLine of the time taken. So the time is very real.
I am getting some work completed to meet a deadline, and then I should be able to make a test case and get more detail on what is going on.
Also a note: we are using full ASP.NET, not Core, at this time. When Entity Framework for Core gets support for SQL Geography sorted out, we will want to see if we can move to Core. Geography and mapping are critical to our work, as the company is a logistics company shipping packages and tracking them. We have zip-code boundary data and customer location data that we use to plan delivery routes, line hauls, and all the planning and tracking for the business. Some of it is done with SQL reports and some with Google Maps and GeoJSON data layers.
So EF on Core has been a key blocker to our moving to ASP.NET Core; it seems like this is now almost resolved, so I hope I can make that move soon.
Also, if you know Barry Dorans (spelling may be slightly off), say hello to him; he was on the ASP.NET security team when I was last in touch with him, and he used to be on channel9.msdn.com as blowdart.
As soon as I can, I will provide details on what I find. I suspect that tracing out the entity models for the OData is the key to the time it has taken. We are not replacing them, but we do have a fairly large set of related tables to model.
The issue is the way routes are matched. Direct Routing injects itself first and takes precedence over OData routes. For example, `api/Orders` will be considered a better match than `api/{odataPath*}`. OData has further weaknesses in its conventions in that it doesn't care about the prefix. For example, from the pure EDM perspective, OData sees `foo/Orders` and `bar/Orders` as both being the Orders entity set. In your case, as long as you don't have overlapping controller names between the two routing styles, you should be OK. For example, I don't think `api/Foo` and `atrackapi/Foo` will work. In fact, with Direct Routing (e.g. attribute routing), I don't know that even `api/Foo` for both sides will work without using only Convention-Based Routing. Switching to and from OData via API versioning is something I thought was an important feature in the very beginning of this project, but I haven't been able to bend Web API exactly to my will.
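For reference, a minimal sketch of the two routing styles coexisting under distinct prefixes, which is roughly what your /api/ and /atrackapi/ split describes (configuration simplified, and with the API versioning package you would use its versioned route-mapping equivalent instead):

```csharp
using System.Web.Http;
using System.Web.OData.Extensions;

public static class WebApiConfig
{
    public static void Register(HttpConfiguration config)
    {
        // Direct (attribute) routing handles the plain Web API controllers under /api/...
        config.MapHttpAttributeRoutes();

        // Convention routing handles the OData controllers under /atrackapi/...
        // The EDM is built elsewhere; with API versioning you would typically
        // have one model per API version.
        var model = BuildEdmModel();
        config.MapODataServiceRoute("odata", "atrackapi", model);
    }

    private static Microsoft.OData.Edm.IEdmModel BuildEdmModel()
    {
        var builder = new System.Web.OData.Builder.ODataConventionModelBuilder();
        // builder.EntitySet<Order>( "Orders" ); etc.
        return builder.GetEdmModel();
    }
}
```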
I believe the Swagger generation time to be very real. I don't know of any API Explorer implementation for Web API that supported OData. In fact, the ODataController decorates itself with `[ApiExplorerSettings( Ignore = true )]`. Since that is the default, I had to come up with my own special configuration option as to whether that setting should be honored. This explains some of the extra time. Had I not already attempted to create an OData-based API Explorer for the ol' Help Pages years ago, I may not have even taken up the challenge, because it was horribly painful to implement (peruse the source - I dare you ;P). There was another library for Swashbuckle that made an attempt to support OData, but it required you to version things the way it wanted, and it was largely only for Swagger rather than a comprehensive API versioning library. I also don't believe that library has ever published support for ASP.NET Core - yet (if ever).
You definitely have one of the largest API surfaces that I've seen, though not many folks give me details as to how many APIs they have. Even if the time is improved from something like 10 minutes to 10 seconds, that's still too long in my book (2-3 secs max). Since the Swagger documents are static for each deployment, I think you'd be better served by pre-generating the documents after each build and adding them as part of your deployment artifacts.
Here are a couple of projects that started inside Microsoft that you might find useful in achieving that:
@commonsensesoftware Thanks for the links! I have been looking for V3-supporting generators and tools. Just the other day I found the MS OpenAPI one but have not yet had time to fully check it out.
OData and the plumbing - yeah, I have some ideas. I have a bit of code that I use to help my authorize logic; it is just a basic catalog of methods, and I am thinking that in time I would like to do that with a common code base. Right now it's a reflection over types in the current assembly that look like controllers, based on the namespace and type name. Not perfect, but it lets me maintain a table of method names. We use that to map user roles and make our own Authorize attribute work.
I tell my other developers that if an API call takes long enough that you notice, then it's too slow; 30 seconds or less, even for a deep API call, is our general rule.
Most are in the 1 to 5 second window for us, I think.
By the way, with regard to the idea of a pre-generated Swagger file...
Any thoughts on a runtime handler that would filter the data to tailor it to the caller? Like user X should only see 10 API methods out of the 200 that are in the master catalog.
It's an interesting idea, but some more input is needed. I suspect the issue is the time being spent in the API Explorer during the discovery process. To that extent, I'm not sure how to achieve your goal. The filter process is based on discovered APIs. If you uncover that the time is being spent elsewhere, then it's potentially worth considering.
No part of API versioning or the API Explorer directly depends on Swagger. While potentially useful, I have to make sure new features are not necessarily specific to Swagger. The API Explorer, for example, can be used for other scenarios, albeit its original motivation was Swagger support.
Thanks. As I get done with the current deadlines, I do want to spend some time getting deeper into how the explorer works and what Swagger wants, to better understand both and see what can be done.
When I started on this project almost 3 years back, OData and Swagger were brand new to me. I had a good background with ASMX, WCF, and classic TCP stuff that helped me figure out a lot of this. My boss loves how I got Swagger working for us. We are a very small team (3 total), so I know how you may feel trying to wrangle this code!
Here is one bit of news that I think also points to the EDM data. I updated my code that uses reflection to list my controllers and methods; it does not get into any return models, parameter models, etc. I get just over 110 controllers (the number of methods varies by controller), and the total time was about half a second.
This, plus what I saw when I ran the explorer code with no OData mappings, really makes me think that the problem is in the EDM data that it's collecting.
I really want to grab a copy of my project and spend time this weekend finding the issues. If I get your source and add it to my project, then I can profile and debug it as a whole and possibly find the places where it's going south.
Will see what happens next...
Interesting...
BTW: I'm not sure if you're aware of or using the built-in Web API services to do this. Even with your current implementation, Web API can do most of the heavy lifting using its built-in services. For example:

```csharp
var services = configuration.Services;
var assembliesResolver = services.GetAssembliesResolver();
var controllerTypeResolver = services.GetHttpControllerTypeResolver();
var controllerTypes = controllerTypeResolver.GetControllerTypes( assembliesResolver );
var odataController = typeof( ODataController );
var odataControllers = controllerTypes.Where( type => odataController.IsAssignableFrom( type ) );
```

That should be more efficient, even if only a little bit, than creating a custom resolution process with Reflection.
My code is basically like this:

```csharp
Assembly a = Assembly.GetExecutingAssembly();
var types = a.GetTypes();
var controllerList = (from tx in types
                      where tx.IsPublic
                         && tx.Namespace.StartsWith("ADL.Web.Services.")
                         && tx.Name.EndsWith("Controller")
                      select tx).ToList();

foreach (Type t in controllerList)
{
    MethodInfo[] methods = t.GetMethods();
    var methodList = (from mx in methods
                      where mx.IsSpecialName == false
                         && mx.DeclaringType.Name == t.Name
                      select mx).ToList();

    foreach (MethodInfo m in methodList)
    {
        // ... details omitted to keep this short ...
    }
}
```

Some details are omitted to keep this short, but as you can see I just have a set of loops, some filters, and checks on type properties and attributes. I will grab your code from above and see what it looks like...
Interesting results... overall time is very close, about half a second.
controllerTypes = count 114, which matches my code. odataControllers - odd, that came back with 114 also... something wrong there.
Not sure if this makes any difference, but most of my OData controllers use a generic base class derived from the ODataController class:
```csharp
public class BaseOdataController<TDbContext, TIDataRecord> : ODataController
    where TDbContext : DbContext, new()
    where TIDataRecord : class, IDataRecord, new()
{
    // ...
}
```
That lets me write all my controllers with the same base code and have them all work in a standard way. We have a record-audit ability that this works with, so that on update the date and user ID are automatically updated, and our Delete is a logical delete with a bit flag in SQL so we can undo a delete if needed and also know who did a delete. That may not have made the filter wrong, but I figured you should know in case this or other code might be affected. The TIDataRecord goes with a simple interface that says the entity has to have a set of known columns.
That is odd. I double-checked and that should work. You can see how I make this test here. Base classes are fine. `Type.IsAssignableFrom` supports inheritance chains. It effectively translates to the equivalent of `T is ODataController`.
I suspected the execution time would be close. The only advantage of my suggestion is not having to maintain a bunch of Reflection code. ;)
Actually, it did work; I made a mistake when I looked at the results! End of the day, and I expanded the enumerable on the wrong property by mistake! I added .ToList() just now and I have 70 OData controllers out of 114 total controllers.
This also tells us that the basic retrieval of the list of controllers is not the problem in the API Explorer; it is something after that.
Regarding the long execution time: when I was trying to generate Swagger myself using the API Explorer, I also noticed that it took very long. It seemed to be because of the GetClrType method, which loops through all assemblies and types each time it tries to match a type.
This takes especially long if you have a decent number of assemblies and EDM types, so it is most likely a big contributor to the execution time.
I got it down to about 1/50th of the time taken by adding caching and adding all the types to a Dictionary of string and Type. I'll have a PR for that later this week so you can check it out.
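Something along these lines is the idea; a minimal sketch of the dictionary caching approach (the names here are illustrative, not the actual implementation in the PR):

```csharp
using System;
using System.Collections.Concurrent;
using Microsoft.OData.Edm;

static class ClrTypeCache
{
    // Caches EDM full-name -> CLR type so the assembly/type scan only happens
    // once per EDM type instead of on every match attempt.
    static readonly ConcurrentDictionary<string, Type> cache =
        new ConcurrentDictionary<string, Type>();

    internal static Type GetClrType(IEdmSchemaType edmType, Func<IEdmSchemaType, Type> slowLookup) =>
        cache.GetOrAdd(edmType.FullName(), _ => slowLookup(edmType));
}
```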
@LuukN2 Sounds about right; my instinct was that something like that was happening in the code, but I had no idea where to look.
When the update is ready, I can run it and get my timing data, and we can see how much it helps. If that 10 minutes can get down to around a minute, that would be huge! Then I would not use a disk-based cache of the JSON at all; the Swagger library I have will do an in-memory cache, and that would make most calls be in the 1 second or less range and just take a hit when IIS kills the worker process.
@LuukN2, nice find! I probably wouldn't have even spent time looking there. As I recall, I hijacked the CLR type mapping code directly from the OData source here. That was quite a while back, and the entire OData team has changed since then. I wouldn't be surprised if things have changed in the implementation. It looks like the critical thing missing for caching is testing and updating the ClrTypeAnnotation associated with an EDM type. This would be the preferred method to optimize the lookup because all EDM-to-CLR type lookups are expected to use this approach, not just API versioning.
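For reference, the annotation-based lookup looks roughly like this; a sketch assuming the classic System.Web.OData types (adjust namespaces for your version, and the slowLookup delegate stands in for whatever fallback scan you use):

```csharp
using System;
using Microsoft.OData.Edm;
using System.Web.OData;   // ClrTypeAnnotation (Microsoft.AspNet.OData in later versions)

static class EdmClrMapping
{
    // Prefer the ClrTypeAnnotation the model builder attaches to each EDM type;
    // fall back to (and then record) a slower lookup only when it is missing.
    internal static Type GetClrType(IEdmModel model, IEdmType edmType, Func<IEdmType, Type> slowLookup)
    {
        var annotation = model.GetAnnotationValue<ClrTypeAnnotation>(edmType);

        if (annotation != null)
        {
            return annotation.ClrType;
        }

        var clrType = slowLookup(edmType);
        model.SetAnnotationValue(edmType, new ClrTypeAnnotation(clrType));
        return clrType;
    }
}
```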
Also, I am not sure if, how, or what the best thing to be done is, but here is a problem I have had that is part of documenting an API:
Let's say I have an entity model with a type called "Package", and that Package has multiple references to related types like "Address" (ship to, ship from, bill to, etc.) and other relations like service type and a status collection (created, dispatched, pickup, in transit, ...).
Now say I create an API like /api/package/track/{trackingNumber:string} and I want to return the "Package" type with its related status history.
I have code that does this, and it works; it tells EF and JSON not to follow all the relations, but I am returning all of the base properties of "Package", so I am not creating a separate POCO class for the return, and that way if I need to include more I can without creating classes that mirror my EDM. That all works, but if I try to attribute the API with [ResponseType(typeof(Package))], then I get a very large JSON model, like 77,361 lines of JSON if it's formatted nicely for reading.
By the way, before, when I used Swashbuckle, this would just hang the Swagger generator in the older versions...
So today is better; it does not hang.
But perhaps we could say, IF the type is an EDM type, to only include the primary object and stub any child collections to foo: [] (an empty array).
Otherwise the developer needs to create a dummy POCO class to limit the return model.
Or have some kind of custom return-type attribute that lets the developer override the model resolution somehow?
Just a thought...
@figuerres what you're describing sounds like a feature that is currently supported. I might not have called it out in the wiki, but it is used in the OData sample projects. I call this feature Model Substitution. The idea is that you have a single POCO that is used for every version of a particular EDM type. In my experience, I've found that the EDM type tends to just add new members per version. It's totally conceivable and supported to create per version CLR types to match the EDM types, but that can be unnecessarily tedious.
What Model Substitution does is look at the source EDM type and compare it to the associated CLR type. If the CLR type is a superset of the EDM type, then a new type is dynamically generated that matches the structure of the EDM type. This also includes retaining any attributes that the CLR type may have defined on its properties. If the EDM and CLR type are structurally equivalent, then no substitution is performed.
The other supported scenario is for ODataActionParameters, which is a simple subclass of Dictionary<string, object>. Swagger generators are unable to produce anything useful from this because they have no idea what the keys and values will be. In this case, Model Substitution generates a dynamic type matching the definition of the action parameters. A client has no idea that it's actually realized as a dictionary on the backend.
I couldn't think of a single scenario where someone wouldn't want this behavior, so it's always on by default. You can see both of these concepts in action by running the OData with Swagger sample project.
The current, published support for Model Substitution does not account for self-references, back-references, and some collection scenarios. @LuukN2 has been helping out in this area and the forthcoming Beta 2 release will see support for these scenarios.
In terms of cutting down your model, unless you have optional members, I don't think OData will be happy with that when someone clicks Try It. You do have the ability to provide your own examples on the Swagger side, so that may be a better approach. The default example generation attempts to compute default values for the entire request and/or response entity, which may be unnecessary.
Slight confusion: my last comment was about returning an entity via a Web API controller, NOT from an OData controller.
Gotcha. I've thought about that case, but there's no auto-magical way to do it; there's no equivalent of the EDM that knows what the model is supposed to look like. Using stand-in POCOs is probably the only way. The Swagger generators work from Reflection, so you can probably even make it work with internal interfaces that represent the shape you want. I haven't tested it, but I don't see why it wouldn't work. Theoretically, this should work even if you use Data Annotations, as long as they are on the interface properties. It's extra work, but doable. #359 was proposing more comprehensive support for a concept like this.
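To illustrate, a minimal sketch of the stand-in shape idea (the interface variant would be analogous), using a hypothetical TrackedPackage type instead of the full Package entity graph from the earlier example:

```csharp
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations;
using System.Web.Http;
using System.Web.Http.Description;

// A stand-in that documents only the members actually returned by the tracking
// call, instead of the full EF entity graph. TrackedPackage is hypothetical.
public class TrackedPackage
{
    [Required]
    public string TrackingNumber { get; set; }
    public string ServiceType { get; set; }
    public IList<string> StatusHistory { get; set; }
}

public class PackageController : ApiController
{
    [HttpGet, Route("api/package/track/{trackingNumber}")]
    [ResponseType(typeof(TrackedPackage))]   // Swagger documents the slim shape
    public IHttpActionResult Track(string trackingNumber)
    {
        // ... load the entity and project it onto the stand-in ...
        return Ok(new TrackedPackage { TrackingNumber = trackingNumber });
    }
}
```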
And I know that for sure there is no automatic way to handle it; I just wanted to mention it, as I have seen what happens by default. I am also thinking more and more that a POCO type will be the way to make it documentable. The only catch for the developers is that if the model changes, you also need to update that "extra" POCO class or the Swagger will not reflect the actual return type.
Hey guys, I don't know if what I did just can't work or what, but here goes:
I got the source from the site and had Visual Studio 2017 build the DLLs. I then took a copy of my main project and removed the references to the pre-built packaged versions, and added references to the new DLLs (two for Web API, two for OData). The build works, but when I try to run, it throws an exception inside the source to the API versioning / explorer code, saying that some factory was used the wrong way...
So is there a fast, easy way for me to test the code changes to see if it's faster? Or do I need to work out how to make NuGet packages from the source to resolve some arcane details?
Swapping out the references shouldn't be a big deal. I do that frequently when troubleshooting repros. You might want to do a full Clean and Rebuild to make sure everything is truly updated.
There are a number of variables at play. I'm assuming you are working against the latest commit to master. It's possible there is a regression. It's also possible that your configuration is incorrect or your change is triggering recursion that it shouldn't.
Without even seeing anything, I suspect this could be related to `Configuration.EnsureInitialized`. Avoid calling this directly and beware that it can trigger reentrancy. Uses of `Lazy<T>` test for that type of scenario. If you have the relevant parts of the configuration to share, I can help iron things out for you.
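For context, here is a tiny repro of the reentrancy behavior referred to; this is standard `Lazy<T>` behavior, not specific to this library:

```csharp
using System;

class LazyReentrancyDemo
{
    static Lazy<int> value;

    static void Main()
    {
        // The value factory accesses the same Lazy<T> instance it belongs to.
        // With the default thread-safety mode this recursive access is detected
        // and throws InvalidOperationException rather than deadlocking.
        value = new Lazy<int>(() => value.Value + 1);

        try
        {
            Console.WriteLine(value.Value);
        }
        catch (InvalidOperationException ex)
        {
            Console.WriteLine($"Reentrancy detected: {ex.Message}");
        }
    }
}
```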
Thanks for the items to check. I have also swapped refs in the past and not had a problem when I did that with other code. I will have to check later and see; I hope it's just a rebuild thing. I'm trying to leave the main project alone and just get the timing value! Then, if it's still slower, I'll use the VS profiling tool and see what it points at.
You'll also want to compare Debug vs Release builds. You probably already knew that, but I wouldn't be surprised to see a 5-10% difference. For a large codebase like yours, I think that could make a significant difference. Looking forward to the results. Thanks.
Has this thread reached its conclusion? The latest changes related to this discussion are merged. Thanks to @LuukN2, there is a significant performance improvement in Model Substitution. This change didn't make it into Beta 2, but it's there nevertheless. Let me know. Thanks.
This issue has been open for some time and I think everything that can be answered in this context has been. If you feel it hasn't, reopen the issue or submit a new one. Thanks, this discussion drove a lot of new features and enhancements.
Hello there! Were you able to find a solution for this? We are in the same situation. Any thoughts on a runtime handler that would filter the data to tailor it to the caller? Like user X should only see 10 API methods out of the 200 that are in the master catalog.
It really depends on your goals and the libraries you're using. If you're using Swashbuckle to produce the documents, you simply need a custom document generator or operation filter that knows how to filter things. I'm not familiar enough with the intricacies of Swashbuckle to tell whether there is any way to access the current request. If not, that means you'll have to set up your own controller or handler to create a document generator that can receive the current request and user and apply the filtering. Regardless, you'll still need your own mechanics, by way of role, identity, or some other piece of information that you can interrogate from an action descriptor discovered by the API Explorer.
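If you happen to be on Swashbuckle.AspNetCore, one possible shape is a document filter; a sketch assuming filter dependencies are resolved from DI in your version (the CallerCanSee mapping is hypothetical):

```csharp
using System.Linq;
using System.Security.Claims;
using Microsoft.AspNetCore.Http;
using Microsoft.OpenApi.Models;
using Swashbuckle.AspNetCore.SwaggerGen;

public class PerCallerDocumentFilter : IDocumentFilter
{
    readonly IHttpContextAccessor httpContextAccessor;

    // Filters are constructed through DI, so the current request can be reached
    // via IHttpContextAccessor (register it with services.AddHttpContextAccessor()).
    public PerCallerDocumentFilter(IHttpContextAccessor httpContextAccessor) =>
        this.httpContextAccessor = httpContextAccessor;

    public void Apply(OpenApiDocument swaggerDoc, DocumentFilterContext context)
    {
        var caller = httpContextAccessor.HttpContext?.User;

        foreach (var path in swaggerDoc.Paths.Keys.ToList())
        {
            if (!CallerCanSee(caller, path))
            {
                swaggerDoc.Paths.Remove(path);
            }
        }
    }

    // Hypothetical mapping from the caller's roles/claims/subscriptions to the
    // paths they are allowed to see in the document.
    static bool CallerCanSee(ClaimsPrincipal caller, string path) =>
        caller != null && caller.Identity.IsAuthenticated;
}
```

It would be registered with `c.DocumentFilter<PerCallerDocumentFilter>()` inside `AddSwaggerGen`; the equivalent idea should translate to an NSwag document processor as well.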
I hope that helps.
Thanks for your response! We are using NSwag. For now, we are thinking of using an operation filter based on the user and their subscriptions to the APIs. I was just curious if someone else had a different way of implementing this. Thanks!
Hello. First, I did try to use the link on the site for a discussion using Teams, and it left me on a page with nothing about this repo, so I am posting this here.
My company has a fairly large set of API methods that we use for our internal application. We want customers to be able to access a few API methods to integrate with us. I am looking for guidance on how to provide a single Swagger / OpenAPI web site to internal users and customers where, after login, the user sees only the APIs that they need to see.
I have seen some work done using versions to do this, but that seems to me to be very flawed. I can see, though, that a developer might need to create code that their role may not grant them full access to. Is there any current work being done in this area?
Are there any examples of how to do this using OpenID / JWT tokens and user claims to generate different Swagger JSON based on them?