Don't require an API Key

seanherron commented 10 years ago

Usability would be seriously improved if you didn't require an API key for requests under maybe 1,000/day per IP. As it stands, I can't do anything entirely client side right now which sucks. Plus I don't want to have to sign up just to see if I would want to use it.

RobertLRead commented 10 years ago

Hear hear! BIS! +1! Vote this up!

MikePulsiferDOL commented 10 years ago

API usage metrics are a great thing, especially when I get asked for them from upper management. The best way to get detailed, per-app metrics is via a key. Now, you'll want to make it stupid-easy. That would mean instant, self-serve registration (meaning, no approval needed). For that self-serve portal, any way you can utilize single-sign-on, the better. In addition, if you provide a robust sampler (think Flickr), people can play with it without having to build anything first.

seanherron commented 10 years ago

@MikePulsiferDOL Which is why it's so great to require a key for high-volume use (which provides meaningful metrics) while not requiring one for someone just wanting to do a few hits to test functionality. By requiring a key right out of the gate you immediately are going to turn away some people no matter how easy the registration process is.

This also doesn't address the other important point of client-side applications.

MikePulsiferDOL commented 10 years ago

@seanherron The problem would arise when those applications that make less than, per your example, 1,000 requests per day, are a statistically significant sample of the calls against your API. You could conceivably have the vast majority of your traffic fly right under the radar.

As for the key being an obstacle: if a process as simple as entering an email address and being handed a key is too onerous or troublesome for someone, they're not going to put the effort into doing any real work with the API.

As for client-side apps, I have a sample iOS app that uses our SDK. I just plug in the basic info (URL, arguments, key) and presto-chango I see the data in the log output. It's my easy-peasy means of playing with any RESTful API without having to exert any real effort (redundant, I know :) ).

seanherron commented 10 years ago

@MikePulsiferDOL You can still capture analytics of those entries. By comparing IP addresses you could even make a reasonable estimate of where requests are coming from. You may just not have as good of metrics over time or have contact information for that user.
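As a rough illustration of the keyless analytics described above, here is a minimal sketch that tallies requests per IP address from access-log records. The record shape (a dict with an `ip` field) and the function name are assumptions made for the example, not anything fbopen or api.data.gov exposes.

```python
from collections import Counter


def requests_per_ip(log_entries):
    """Count how many requests each IP address made.

    `log_entries` is assumed to be an iterable of dicts with an "ip" field.
    """
    return Counter(entry["ip"] for entry in log_entries)


# Example with made-up log records (documentation-range IP addresses).
sample_log = [
    {"ip": "203.0.113.5", "path": "/v0/opps"},
    {"ip": "198.51.100.7", "path": "/v0/opps"},
    {"ip": "203.0.113.5", "path": "/v0/opps"},
]
counts = requests_per_ip(sample_log)
```

This gives volume per source, but as noted, no contact information and no reliable way to tell one application from another behind the same address.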

What would be your recommended way to build an application that queries this API entirely in client-side JavaScript, perhaps a d3 visualization or the like?

arowla commented 10 years ago

As you know, we proxy everything in our API through api.data.gov. This gives us some important benefits, many of which have already been mentioned. That said, I could envision having a rotating demo key available (and publicized) for people to grab and use instantly. It seems like something like this could be pretty easily programmed into the API gateway. Keys could be good for 24 hours/48 hours/a week... but then they'd either have to pull up the new one or register.
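One way a rotating demo key like this might be programmed is to derive the key from the current time period, so the gateway never has to store it. The sketch below is only an assumption about how that could work; the secret, the `demo-` prefix, and both function names are hypothetical and are not part of api.data.gov.

```python
import hashlib
import hmac
from datetime import datetime, timezone

# Hypothetical secret known only to the gateway.
SECRET = b"replace-with-a-real-secret"


def demo_key_for(moment: datetime, period_hours: int = 24) -> str:
    """Derive the demo key valid for the period that contains `moment`."""
    period_seconds = period_hours * 3600
    period_index = int(moment.timestamp()) // period_seconds
    digest = hmac.new(SECRET, str(period_index).encode(), hashlib.sha256)
    return "demo-" + digest.hexdigest()[:16]


def is_current_demo_key(candidate: str, now: datetime, period_hours: int = 24) -> bool:
    """Accept only the key for the current period; earlier keys have expired."""
    return hmac.compare_digest(candidate, demo_key_for(now, period_hours))


morning = datetime(2014, 4, 4, 1, 0, tzinfo=timezone.utc)
evening = datetime(2014, 4, 4, 23, 0, tzinfo=timezone.utc)
next_day = datetime(2014, 4, 5, 1, 0, tzinfo=timezone.utc)
```

With a 24-hour period, the key published in the morning still works that evening and stops working the next day, at which point a user either grabs the new one or registers.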

arowla commented 10 years ago

cc @GUI

seanherron commented 10 years ago

re: api.data.gov, you can set a custom rate limit to have, for instance, a limit of 200 requests per hour per IP address or 10,000 requests per day with an API key.

seanherron commented 10 years ago

Though @GUI can probably chime in here more, I believe these rate limits are OR and not AND, correct?

dwcaraway commented 10 years ago

Demo, severely rate-limited key would make sense as part of http://18f.github.io/fbopen-widget/

dwcaraway commented 10 years ago

As an aside, I share @seanherron's concern about having to use an API key for a simple client-side app (the fbopen widget linked earlier has the same problem of exposing the API key to the client). For read-only actions on non-sensitive, public info, it would be great to just omit a key or to have a key that only supports read actions.

arowla commented 10 years ago

I tend to come down on the side of always being able to have analytics, no matter how small the client. On past projects I've worked on, even when people were doing pretty small-scale/short-lived projects, it was nice to be able to see their activity and have an email address to reach out to them in case there were questions, and also for reporting. As @MikePulsiferDOL said, on some projects, it can be rare to have anything more than really small clients.

@dwcaraway If your simple client-side app starts getting a lot of hits, we'd like to be able to know about it!

I still like the idea of time-limited demo keys. We can publicize them prominently, make them easy to grab, but then nag people to register once they've expired.

And once someone has an API key for api.data.gov, they have one for everything else which is hosted there. It's really not a big deal to register for one.

GUI commented 10 years ago

For custom rate limits on api.data.gov, they are indeed ORed together, so if any of the limits are exceeded, then the requester will be blocked for the appropriate duration (let me know if you have a desire for any different behavior).
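To make the ORed behavior concrete, here is a minimal sketch of a limiter that rejects a request as soon as any one of several limits is exceeded. It is an illustration under assumptions, not the api.data.gov implementation: the `Limit` and `RateLimiter` names are invented, it uses sliding windows, and rejected requests are not counted.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Limit:
    max_requests: int
    window_seconds: int


class RateLimiter:
    """Block a client when ANY configured limit is exceeded (limits are ORed)."""

    def __init__(self, limits):
        self.limits = list(limits)
        self.longest = max(limit.window_seconds for limit in self.limits)
        self.history = defaultdict(deque)  # client id -> accepted timestamps

    def allow(self, client_id, now):
        stamps = self.history[client_id]
        # Forget timestamps older than the longest window.
        while stamps and now - stamps[0] >= self.longest:
            stamps.popleft()
        for limit in self.limits:
            recent = sum(1 for t in stamps if now - t < limit.window_seconds)
            if recent >= limit.max_requests:
                return False  # one exceeded limit is enough to block
        stamps.append(now)
        return True


# Tiny numbers so the behavior is easy to follow: 2 per 10 s OR 3 per 100 s.
limiter = RateLimiter([Limit(2, 10), Limit(3, 100)])
results = [limiter.allow("198.51.100.7", t) for t in (0, 1, 2, 20, 40, 101)]
```

In the example, the third request trips the short window, and the request at `t=40` trips the long window even though the short one has plenty of room.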

Regarding this broader conversation, a couple other notes:

api.data.gov provides a universal "DEMO_KEY" that all our documentation examples use. This demo key acts like any other public API key, but it has significantly lower rate limits (it's subject to change, but right now it's rate limited by IP address to 30 requests per hour and 60 requests per day). The idea is that it gives interested developers enough so that they can play around with the examples on the documentation, but for any real usage, they'll likely run into the limits and need to signup for their own key. The DEMO_KEY can be used on all public api.data.gov APIs.
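For anyone who wants to try this from a script, the sketch below builds a request URL that passes the demo key as an `api_key` query parameter. The base URL and the `q` parameter are placeholders chosen for the example, not the real fbopen endpoint, and `build_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

DEMO_KEY = "DEMO_KEY"


def build_url(base_url, params, api_key=DEMO_KEY):
    """Return `base_url` with `params` plus the API key as a query string."""
    query = dict(params)
    query["api_key"] = api_key
    return base_url + "?" + urlencode(query)


# Placeholder endpoint; substitute the URL from the API's own documentation.
url = build_url("https://api.example.gov/v0/opps", {"q": "software"})
```

Swapping in a registered key later only means passing `api_key=...` to the same helper.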

For client-side javascript apps, I'd like to come up with a more formalized approach to this, but at least here at NREL, we have used our API keys in client-side apps. The only thing we do differently is swap those keys to be rate limited by IP instead of by the overall key. This basically means that each end-user of the client-side app is being rate limited by their own IP, so even if our apps get a lot of traffic, it won't bring the app down for all users. It will only stop working if a single user/IP manages to generate enough hourly traffic themselves to hit our rate limits. This approach seems to have worked well for us, and from an analytics perspective it still allows us to easily segment the traffic from each one of our client-side applications. This obviously means that we do have an API key floating in the public that could be lifted by someone, but since these API keys are easy to get and only give you access to public data, we haven't seen that occur (signing up seems easier than lifting the key from your javascript debugger).
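The difference between limiting by key and limiting by IP comes down to which bucket a request is counted in. The following is a minimal sketch of that idea with invented names (`HourlyCounter`, `limit_by_ip`); it is not how api.data.gov is actually built.

```python
from collections import Counter


class HourlyCounter:
    """Count requests per hour, bucketed by key alone or by key plus IP."""

    def __init__(self, max_per_hour, limit_by_ip):
        self.max_per_hour = max_per_hour
        self.limit_by_ip = limit_by_ip
        self.counts = Counter()

    def allow(self, api_key, ip, hour):
        # A key limited by IP gets a separate bucket for every end-user address.
        owner = (api_key, ip) if self.limit_by_ip else (api_key,)
        bucket = owner + (hour,)
        if self.counts[bucket] >= self.max_per_hour:
            return False
        self.counts[bucket] += 1
        return True


per_ip = HourlyCounter(max_per_hour=2, limit_by_ip=True)
per_ip_results = [
    per_ip.allow("widget-key", "203.0.113.5", hour=0),
    per_ip.allow("widget-key", "203.0.113.5", hour=0),
    per_ip.allow("widget-key", "203.0.113.5", hour=0),   # this user is over
    per_ip.allow("widget-key", "198.51.100.7", hour=0),  # other users unaffected
]

per_key = HourlyCounter(max_per_hour=2, limit_by_ip=False)
per_key_results = [
    per_key.allow("widget-key", "203.0.113.5", hour=0),
    per_key.allow("widget-key", "203.0.113.5", hour=0),
    per_key.allow("widget-key", "198.51.100.7", hour=0),  # shared bucket is full
]
```

Because the key is still part of every bucket, traffic can be segmented per application either way.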

If you wanted to create this type of api key that's rate limited by IP and might be more suited to client-side apps, only api.data.gov admins can do that right now (I'm happy to help if you have any questions). However, as I mentioned, this is something I'd like to better formalize and make self-service for api.data.gov end-users if there's demand. I think Google's API console and their Simple API Access approach provides a pretty nice model for how to expose this. If you haven't used their system before, when you create your keys, you simply choose whether you're creating a server key or a browser key. For browser keys, the one extra thing they have you enter are valid HTTP referrers for your API key. Since HTTP referrers can be spoofed, this isn't exactly foolproof, but I think it does provide a nice bit of extra protection to prevent casual or accidental re-use of api keys in client-side javascript apps.
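A referrer restriction of this kind could be checked roughly as follows. This is a sketch under assumptions: the wildcard pattern syntax and the `referrer_allowed` function are made up for the example and do not reproduce Google's actual matching rules.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit


def referrer_allowed(referrer, allowed_patterns):
    """Return True if the Referer header matches one of the key's patterns.

    Patterns are assumed to look like "host/path*", e.g. "18f.github.io/*".
    The header can be spoofed, so this only stops casual re-use of a key.
    """
    if not referrer:
        return False
    parts = urlsplit(referrer)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    target = parts.hostname + (parts.path or "/")
    return any(fnmatchcase(target, pattern) for pattern in allowed_patterns)


patterns = ["18f.github.io/*"]
ok = referrer_allowed("http://18f.github.io/fbopen-widget/", patterns)
other_site = referrer_allowed("http://evil.example.com/", patterns)
missing = referrer_allowed(None, patterns)
```

Rejecting requests with no referrer at all is a design choice; a server key would skip this check entirely.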

GUI commented 10 years ago

Oh, and just to clarify and ramble a bit more, API keys can be made completely optional in api.data.gov, but that's up to the agency API provider and how they want to expose their API.

For just my own personal opinion, I also slightly favor requiring API keys (as long as they're easy to get), since we've similarly seen quite a bit of value in gathering the extra bit of analytics we get from requiring that. At the same time, we've also seen value in providing something like our DEMO_KEY, where developers can do some very initial experiments with the API before needing to signup for their own key. But I also see some of the benefits of not requiring any keys at all, so that's at least why api.data.gov makes that configurable for each API owner.

@arowla I'm hoping the DEMO_KEY approach might be useful to you and alleviate some of the issues raised here. However, if the fact that this global DEMO_KEY exists for api.data.gov comes as a surprise and this is not what you had wanted for your fbopen api, I'm happy to chat more.

arowla commented 10 years ago

I think this issue is solved with better promotion of DEMO_KEY. It has already been added to the Apiary docs in a20a329, and we'll keep an eye out for other apropos locations to plug it.

18F / fbopen

Don't require an API Key #29