cognoma / core-service

Cognoma Core API

Authentication? #1

Closed dcgoss closed 8 years ago

dcgoss commented 8 years ago

Do we need auth? If so should we use sessions, tokens, JWT, or a combo? The type will depend on what platforms cognoma can be accessed from (browser, app, etc).

awm33 commented 8 years ago

JWT can be used as a token, though some argue they are useless in an environment that requires revocation. They are cool though, particularly with public/private key pairs. I'd avoid server-side session storage if you can.

Two models I've used:

Authentication could be done with usernames and passwords stored in the app backend, single sign-on (Google, SAML, PennKey), or a combo.

dcgoss commented 8 years ago

My app uses JWT. The expiry time is very short, on the order of minutes. As the app is being used, the client transparently refreshes the token before its expiry date. This way when the user is done using the app the token expires and is simply not refreshed. However, this is only possible if you can store auth info client side.
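A minimal sketch of the short-expiry-plus-refresh pattern described above, using only the Python standard library. The secret, the five-minute lifetime, and the function names `issue_token`, `verify_token`, and `refresh_token` are assumptions for illustration, not anything from an existing app; a real service would more likely use a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json

SECRET = b"demo-secret"  # hypothetical signing key
TTL_SECONDS = 300        # hypothetical lifetime, "on the order of minutes"


def _b64(data: bytes) -> str:
    # JWTs use unpadded URL-safe base64.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def _unb64(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str) -> str:
    return _b64(hmac.new(SECRET, signing_input.encode(), hashlib.sha256).digest())


def issue_token(user_id: str, now: float) -> str:
    header = _b64(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64(json.dumps({"sub": user_id, "exp": int(now) + TTL_SECONDS}).encode())
    return f"{header}.{payload}.{_sign(header + '.' + payload)}"


def verify_token(token: str, now: float) -> dict:
    header, payload, signature = token.split(".")
    if not hmac.compare_digest(signature, _sign(header + "." + payload)):
        raise ValueError("bad signature")
    claims = json.loads(_unb64(payload))
    if now >= claims["exp"]:
        raise ValueError("token expired")
    return claims


def refresh_token(token: str, now: float) -> str:
    # Only a still-valid token can be exchanged for a fresh one, so a
    # client that stops refreshing is effectively logged out at expiry.
    claims = verify_token(token, now)
    return issue_token(claims["sub"], now)
```

The client would call `refresh_token` shortly before `exp`; once the user walks away, nothing refreshes the token and it lapses without any server-side revocation list.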

dcgoss commented 8 years ago

Oops that was an accident, stupid touchscreens

awm33 commented 8 years ago

The biggest advantage of JWT is hitting the DB less, since the token can be authenticated using its signature and can carry things like user id, name, and email. I used them for an internal app I wrote, but the main app I work on now checks the token table and joins it to the user table on each request. It does this for tens of thousands of requests a second without needing a heavy SQL server. It's also OAuth based: password grant for our web app, and authorization_code/client_credentials for API users.
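For contrast with signature-only verification, here is a rough sketch of the "check the token table and join it to the user table" approach, using an in-memory SQLite database. The schema, the `revoked` column, and the helper names are hypothetical; the point is only that one indexed join per request also gives you revocation for free.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL);
    CREATE TABLE tokens (
        token   TEXT PRIMARY KEY,              -- opaque random string
        user_id INTEGER NOT NULL REFERENCES users(id),
        revoked INTEGER NOT NULL DEFAULT 0
    );
    INSERT INTO users (id, email) VALUES (1, 'researcher@example.org');
    INSERT INTO tokens (token, user_id) VALUES ('opaque-token-abc', 1);
    """
)


def user_for_token(token):
    """Return (user_id, email) for a live token, or None."""
    return conn.execute(
        "SELECT u.id, u.email FROM tokens t "
        "JOIN users u ON u.id = t.user_id "
        "WHERE t.token = ? AND t.revoked = 0",
        (token,),
    ).fetchone()


def revoke(token):
    # Revocation is a single UPDATE; the next lookup fails immediately.
    conn.execute("UPDATE tokens SET revoked = 1 WHERE token = ?", (token,))
```

The lookup hits the primary key of `tokens`, which is why this can stay cheap at high request rates; the trade-off against JWT is one database round trip per request.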

cgreene commented 8 years ago

From my experience with these types of servers, most people will not register to use them (at least the first time). I think we need to design with unregistered users in mind if we want to have strong uptake. If people have a very solid experience with the unregistered version they may register for convenience features (saved analysis, etc) in the future.

awm33 commented 8 years ago

@cgreene How do you keep abuse from happening? Someone or a bot could just pound the API, particularly if expensive distributed backend processing is involved. Or they could just keep writing rows to the db.

IP-based rate limiting could help alleviate some of the above, though I still think wide-open APIs that can cause writes are asking for trouble.

Single sign on could help maintain conversion. Maybe anonymous users could browse previous stored results and/or a pre-calculated demo. Actually, a guided demo could be cool, a user could then use a predefined input that we have cached results for.

cgreene commented 8 years ago

I'd support limiting the computing resources available to the pool of all unregistered users with a tighter limit on each IP. For the API we can throttle requests by IP if we're worried about writes, though most of the endpoints may end up as read only. We should probably start to nail down detail on the endpoints. I'll try to create an issue for each of those.
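One way to picture the two-level limit (a shared pool for all unregistered users, plus a tighter cap per IP) is a fixed-window counter like the sketch below. The class name `AnonymousLimiter`, its parameters, and the fixed-window strategy are assumptions for illustration; a deployed service would likely keep the counters in a shared store rather than in process memory.

```python
from collections import defaultdict


class AnonymousLimiter:
    """Fixed-window limiter: a shared pool cap plus a tighter per-IP cap."""

    def __init__(self, pool_limit, per_ip_limit, window_seconds):
        self.pool_limit = pool_limit
        self.per_ip_limit = per_ip_limit
        self.window_seconds = window_seconds
        self._window = None
        self._pool_count = 0
        self._ip_counts = defaultdict(int)

    def allow(self, ip, now):
        # Reset all counters whenever we move into a new time window.
        window = int(now // self.window_seconds)
        if window != self._window:
            self._window = window
            self._pool_count = 0
            self._ip_counts.clear()
        # Reject if either the shared pool or this IP is exhausted.
        if self._pool_count >= self.pool_limit:
            return False
        if self._ip_counts[ip] >= self.per_ip_limit:
            return False
        self._pool_count += 1
        self._ip_counts[ip] += 1
        return True
```

With `AnonymousLimiter(pool_limit=3, per_ip_limit=2, window_seconds=60)`, a single address is cut off after two requests per minute, and all anonymous traffic together after three.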

awm33 commented 8 years ago

@cgreene I'm less concerned with genuine anonymous users than with someone (or something) hacking at the API. Limiting the number of data processing jobs a user can submit sounds like it could both incentivize anonymous users to register and provide some security. Maybe "wanna run more than X a day? register here"

Some public APIs offer a small number of requests to unauthenticated users, then increase that for users using a token.
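A rough sketch of that tiering, assuming a daily job quota keyed by IP for anonymous callers and by token for registered ones. The limits, the `submit_job` name, and the message text are made up for illustration, and the token is assumed to have been validated elsewhere.

```python
from collections import Counter

ANON_DAILY_JOBS = 3         # hypothetical "X a day" for unregistered users
REGISTERED_DAILY_JOBS = 50  # hypothetical higher tier for token holders

_jobs_today = Counter()     # (identity, day) -> jobs submitted


def submit_job(client_ip, day, token=None):
    """Return (accepted, message) for one job submission."""
    # Registered callers are counted per token, anonymous ones per IP.
    identity = f"token:{token}" if token else f"ip:{client_ip}"
    limit = REGISTERED_DAILY_JOBS if token else ANON_DAILY_JOBS
    key = (identity, day)
    if _jobs_today[key] >= limit:
        hint = "" if token else " Register for a higher limit."
        return False, f"Daily limit of {limit} jobs reached.{hint}"
    _jobs_today[key] += 1
    return True, "accepted"
```

The rejection message doubles as the registration prompt, so anonymous use stays frictionless until the quota is actually hit.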

For registered user auth and the API, I would recommend implementing OAuth2; it also looks like that REST library helps you out there.

cgreene commented 8 years ago

@awm33: +1 - great point & solution!

dcgoss commented 8 years ago

Somehow I feel like if the service is good and doesn't cost money, researchers won't mind signing up, especially if we have single sign-on. I love the idea of having a few guided and interactive tutorials that simultaneously show what Cognoma can do while teaching the user how to use it.

dcgoss commented 8 years ago

Making every user sign in not only limits bots, but also gives us some valuable data about our users that we can use to improve the service. Having their email also gives us a way to contact them for feedback.

dhimmel commented 8 years ago

Making every user sign in not only limits bots, but also gives us some valuable data about our users that we can use to improve the service.

In our field there is a strong aversion to sign-in-required services. First, lots of users are deterred by a sign-in, because at first they're just trying the service to explore. Also, registration is viewed as a potential barrier to openness -- I've experienced instances where this was the case (example). The stigma in the field against mandatory registration is so strong that some journals have policies such as:

Databases must be freely available to all via the web without the need to register or login.

Regarding,

Having their email also gives us a way to contact them for feedback.

Spam... one reason why scientists don't like making accounts (:

Making every user sign in not only limits bots

I think some basic protections will do the trick -- I don't envision anyone having an incentive to disrupt our services. Let's not erect any usage barriers prematurely. If abuse does become an issue, we could always add a CAPTCHA or an "I'm not a robot" button for submitting a query.

dcgoss commented 8 years ago

Hey, you know the field much better than I do. If the consensus is no registration required, then I'm all for it.

awm33 commented 8 years ago

@dhimmel Do people in the field have an aversion to cookies? A cookie could be used for some usage reporting, and later to associate a person's query history and other stored info with a user registration if they eventually sign up.

cgreene commented 8 years ago

I haven't seen a cookie aversion as long as it is relatively low friction to convert it and we make it clear when things disappear (you clear cookies, switch browsers, etc.). @rzelayafavila (Rene) has done this for our Tribe webserver and gotten positive feedback.

dhimmel commented 8 years ago

@awm33 No, I don't know of any aversion to cookies. Lots of European sites put up cookie banners due to EU legislation. I feel sorry for the European scientists who have their user experience interrupted by cookie notices, but this shouldn't be an issue for Project Cognoma.

I think there could be several ways to associate queries to users. When an unregistered user runs a query, they could be presented with:

Once an account is created, queries could be retroactively associated with the account either

  1. automatically using cookies and the email address or
  2. manually using permalinks.
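The two association routes above can be sketched as follows. This is an in-memory illustration only: the `queries` store and the function names are hypothetical, and the email-address half of option 1 is left out.

```python
import uuid

# query_id -> {"visitor_id": cookie value, "account_id": owner or None}
queries = {}


def run_query(visitor_id):
    """Record a query made by an unregistered visitor (identified by cookie)."""
    query_id = str(uuid.uuid4())
    queries[query_id] = {"visitor_id": visitor_id, "account_id": None}
    return query_id


def claim_by_cookie(account_id, visitor_id):
    """Option 1: attach every unclaimed query made under this cookie."""
    claimed = 0
    for record in queries.values():
        if record["visitor_id"] == visitor_id and record["account_id"] is None:
            record["account_id"] = account_id
            claimed += 1
    return claimed


def claim_by_permalink(account_id, query_id):
    """Option 2: attach one query the user pastes a permalink for."""
    record = queries.get(query_id)
    if record is None or record["account_id"] is not None:
        return False
    record["account_id"] = account_id
    return True
```

The cookie route is automatic but fails if the user cleared cookies or switched browsers; the permalink route is manual but works from any device.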

This is just one possibility. If we can think of an innovative solution that entirely removes the need for account creation and login, that would also be cool.

dcgoss commented 8 years ago

What if we gave each unique user who accessed the site a permalink as their authentication? They copy the permalink and keep it, and then whenever they want to use Cognoma they access the service via their own unique link? You can see this in practice on the social website nightchamber.com. Visit nightchamber.com, scroll to the bottom and it will show you your permalink. For example, to access your feed on nightchamber.com you would visit something like http://nightchamber.com/f/55056a1f-6f2f-4da3-93dc-a242e3079108

dhimmel commented 8 years ago

What if we gave each unique user who accessed the site a permalink as their authentication?

I like that idea a lot. Especially since you could easily add an "email me my permalink" option for users who want to use email to backup their login. The only issue I foresee is people may accumulate many accounts and become frustrated if there's no automatic/easy way to merge the accounts.

awm33 commented 8 years ago

I think @dcgoss's idea could work. Maybe we could make it an optional root for "logged in" users, like /users/55056a1f-6f2f-4da3-93dc-a242e3079108/....

So /query/3245 for people just browsing and /users/55056a1f-6f2f-4da3-93dc-a242e3079108/query/3245 for "logged in" users? Could also just generate a uuid every time for every visitor.

We could look for a cookie containing the uuid before generating a new one, that way if the user doesn't note or bookmark the uuid URL, they can still just type in the website domain. Maybe redirect instantly to the uuid URL.

Could also allow for a user to be associated with multiple uuids/permalinks. Then provide a link dialog / UI.
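A minimal sketch of the cookie-then-generate flow and of linking several uuids to one user. The cookie name, the `uuid_owner` mapping, and both function names are assumptions, and this is not necessarily how the eventual implementation works.

```python
import uuid

COOKIE_NAME = "cognoma_uuid"  # hypothetical cookie name
uuid_owner = {}               # permalink uuid -> canonical owner uuid


def resolve_visitor(cookies):
    """Return (visitor_uuid, redirect_path), reusing the cookie uuid if valid."""
    candidate = cookies.get(COOKIE_NAME)
    try:
        visitor = str(uuid.UUID(candidate)) if candidate else None
    except ValueError:
        visitor = None  # malformed cookie: treat as a new visitor
    if visitor is None:
        visitor = str(uuid.uuid4())
    # A freshly seen uuid starts out owning itself.
    uuid_owner.setdefault(visitor, visitor)
    return visitor, f"/users/{visitor}/"


def link_uuids(primary, other):
    """Merge `other` (and anything already linked to it) into `primary`."""
    target = uuid_owner[primary]
    old = uuid_owner[other]
    for key, owner in uuid_owner.items():
        if owner == old:
            uuid_owner[key] = target
```

A visitor who types the bare domain is redirected to the returned path; a visitor who has accumulated several permalinks can call `link_uuids` from a link dialog so they all resolve to one owner.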

awm33 commented 8 years ago

Implemented in https://github.com/cognoma/core-service/pull/25