KristerV / heliumpay-budgetweb-backend


Users #10

Closed jschr closed 7 years ago

jschr commented 7 years ago

Here's the scaffold for database migrations, the user model and api endpoints. Also added scripts for creating migrations, running them and rolling back.

You can test it out by checking out this PR locally with:

git fetch origin pull/10/head:feat/users
git checkout feat/users

Commands npm run db:migrations:make [name] Creates a new migration file under database/migrations

docker-compose exec api npm run db:migrations:run Runs migrations inside the docker container (run docker-compose up first)

docker-compose exec api npm run db:migrations:rollback Rollback the latest migration from within the docker container

npm run test -- --verbose Runs the test suite (run docker-compose up first)

Authentication Scopes Auth scopes are used to allow JWTs to perform a subset of actions against the api. The set of scopes can be found in scopes.js.

user:* Allows the auth token to create, get and update the user (created by logging in)

user:resetPassword Allows the auth token to reset the user's password (created by the reset password workflow)

user:confirmEmail Allows the auth token to confirm the email address (created by the email confirmation workflow). Added this so we don't have to send the longer-lived user:* token via email, and it doesn't require that the user be logged in to confirm the email.
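A tiny sketch of how such a scope check could work. The names below are illustrative; the actual set lives in scopes.js in this PR and may be checked differently:

```javascript
// Hypothetical sketch of a scope check; the real scopes.js may differ.
const SCOPES = {
  USER_ALL: 'user:*',
  USER_RESET_PASSWORD: 'user:resetPassword',
  USER_CONFIRM_EMAIL: 'user:confirmEmail',
}

// A token carrying 'user:*' satisfies any 'user:...' requirement,
// while the narrow workflow scopes only satisfy themselves.
function hasScope(tokenScopes, required) {
  return tokenScopes.some(scope => {
    if (scope === required) return true
    const [resource, action] = scope.split(':')
    return action === '*' && required.startsWith(resource + ':')
  })
}

console.log(hasScope(['user:*'], 'user:update'))            // true
console.log(hasScope(['user:confirmEmail'], 'user:update')) // false
```

This is why the email token can't touch the regular user endpoints: its only scope is user:confirmEmail.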

Tests Tests are run against a separate test database with the ava test framework. Make sure you run docker-compose up first.

To ORM or not to ORM For simpler projects I've gotten away without really needing one but we may want to look at using something like Bookshelf.


KristerV commented 7 years ago

The migrations approach is awesome. I do think it's kind of overkill for this fairly small project, but since they're here now it's clearly going to make life easier.

I haven't used migrations before so I'm curious how it works.

  1. Is the migrations table keeping track of which migration files have been run and which haven't?
  2. Is npm run db:migrations:make [name] basically just there to create an empty migrations file?
  3. Does the run command automatically know which files it hasn't migrated yet? (basically the same question as 1)
  4. Is migrating automatic or do we need to set up a grunt type process for it?
  5. Rollback doesn't have a [name] so does it just rollback the last change? And if we need to rollback 3 migration files we need to run it 3 times?
  6. Couldn't fully understand what withTransaction() is for.
  7. Did you write all of this manually or was there scaffolding tool involved?

So the whole thing seems legit, and since I haven't used it myself I can't be very critical anyway. I read through it all (multiple times) and I see no problems. Here are a few small tips anyway:

  1. Use moment.js for date formatting (it's already installed). In yyyymmddhhmmss() for example. Don't care if you leave it for now, just know there's an easier way :)
  2. Write a short 1-2-3 on how to use migrations when a new dev comes in.
jschr commented 7 years ago

The migration scripts / strategy is just a small wrapper on top of knex's built-in migration tools: http://knexjs.org/#Migrations. The migration make command just spits out a migration template that's compatible with prettier and uses our getDbDriver function instead of raw knex.

  1. The table just keeps track of which migrations have been run against the database. Knex will diff the migrations directory (database/migrations) against the table to know which migrations have not been run. Storing them in the database is useful if we ever have multiple webservers in the future.

  2. Yep, just a convenience for creating a new one.

  3. Yeah. This is being handled by http://knexjs.org/#Migrations-latest.

  4. Migrating isn't automatic at the moment but we'll probably want to add it to the deploy process (maybe just some script we can set up). Running npm run db:migrations:run against an already migrated database is safe and is a no-op.

  5. Yeah, by default it's just the last change. I'm 90% sure you can roll back a specified number of steps; I just can't find the docs on that yet. Will get back to you.

  6. Sorry, should have documented this (will add). It's not being used at the moment but it's there for when you need to create / destroy within a db transaction: http://knexjs.org/#Transactions. withTransaction appends to the current transaction context if provided, otherwise returns the regular query context. An example of creating a transaction would be something like:

    const db = await getDbDriver()
    // deleting a proposal and user is within a transaction so if
    // User.destroy fails the proposal destroy will be rolled back.
    await db.transaction(async (trx) => {
      await Proposal.destroy(proposalId, trx)
      await User.destroy(userId, trx)
    })
  7. Manually. Using something like Bookshelfjs would reduce a bit of the model boilerplate but may be overkill for now.

  8. Will definitely switch to using moment, good catch. [done]

  9. Will add this to the README before merge.

    The benefit to the migration approach is when we add tests we can recreate the database easily for each run. This could just be accomplished with a script but migration files can be used in production for future changes. Storing the migration state in the db is useful for accountability and if you ever have multiple servers connecting to the db.
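For anyone reading along, questions 1 and 3 can be illustrated with a toy version of the diff knex performs. This is conceptual, not knex's actual implementation:

```javascript
// Conceptual sketch of what knex does under the hood (not its actual
// implementation): diff the migration files on disk against the rows
// recorded in the migrations table to find what still needs to run.
function pendingMigrations(filesOnDisk, appliedRows) {
  const applied = new Set(appliedRows.map(row => row.name))
  // Migrations run in lexicographic order, which is why the filenames
  // start with a yyyymmddhhmmss timestamp.
  return filesOnDisk.filter(file => !applied.has(file)).sort()
}

// Hypothetical filenames, just to show the shape of the diff.
const files = [
  '20170801120000_create_users.js',
  '20170815093000_add_email_confirmation.js',
]
const rows = [{ name: '20170801120000_create_users.js' }]

console.log(pendingMigrations(files, rows))
// logs: ['20170815093000_add_email_confirmation.js']
```

Running it against a fully migrated table returns an empty list, which is why re-running the migrate command is a no-op.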

jschr commented 7 years ago

@KristerV curious on your thoughts between these two variations of the api:

Option 1 /users/:id where id is independent of username. Currently this is the DB id (auto-increment).

Option 2 /users/:username

Option 1 would let users change their username (although currently I am preventing this) and is slightly more REST-like. However, it does expose the db id (which we could hide with hashids). For other resources we add I imagine we would be using db ids anyway, so we would need to address the exposed ids regardless, and this keeps the api consistent.

Option 2 would be more convenient for api consumers since you would only need to know the username in order to fetch / update. If we want to allow changing the username in the future, this would also change the api endpoints for updating that user which could be awkward from the consumer's point of view.

... Option 3?

KristerV commented 7 years ago

Answers

  1. I think hashId for usernames is the best option.
  2. Squashing is a great idea :D Merging all individual commits isn't ideal, but just one commit isn't either. So unless you're willing to create a commit for each topic let's just normal merge.

Questions

  1. What is JWT, why do we need it and why is the first commenter here so against it?
  2. We probably need some rate limiter to protect from DDOS or just API spammers?
  3. I see you bcrypt() in users/update, but shouldn't that really be done on the client side? Or both?

Other comments

  1. Awesome readme :)
  2. Awesome code in general. You've done loads, it's quite incredible really.

Ok time's up for me today, got half way of looking through it. Gonna continue tomorrow.

jschr commented 7 years ago

Thanks for the kind words, I've been learning lots so it's been fun being able to contribute :)


JWTs JWTs are immutable, encoded JSON tokens that can contain non-sensitive information such as date issued, expiration time, owner and other claims. Claims are visible to anyone who has the token but signed so they cannot be modified. They are typically used as a way to provide stateless authentication.

Currently a JWT is issued when a user logs in (which expires in 10 hours) and for the confirm email and password reset workflow (which expire in 5 minutes). I'm using a custom scope claim that makes sure the confirm email and password reset tokens only have access to confirm email / password reset api endpoints respectively.

I think the main argument I'm reading against JWTs is that you can accomplish the same thing without them. That's true, but I chose to avoid creating a tokens table in the db and just leverage JWT's built-in expiry functionality instead. My initial thought was that this was a simpler approach than creating a new table and maintaining our own auth logic.

From a security standpoint I don't think one approach is more secure than the other, and any security issues would likely be in the application logic rather than the underlying spec itself. I believe at one point, when using the default Node jsonwebtoken lib, if you didn't specify which algorithm to use it would allow users to bypass auth by passing none as the algorithm. You can see that I'm explicitly setting the algorithm here: https://github.com/KristerV/heliumpay-budgetweb-backend/pull/10/files#diff-e13573d824d3cc0b6ea6636bce8c776fR25

I'm not an expert by any means so if you feel more comfortable with another auth strategy I'm not opposed to changing! I think either way it might be good to get a security audit done (if you know anyone) before launch.

Rate limiter / DDOS Rate limiting seems easy enough to add with https://www.npmjs.com/package/express-rate-limit. I should add this, but it might make sense to do it after merging this PR since it will likely touch all the endpoints.
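As a rough illustration of what a rate-limiting middleware does under the hood (express-rate-limit handles this for you as Express middleware; this is just a conceptual sketch in plain JS):

```javascript
// Minimal fixed-window rate limiter sketch. A real middleware like
// express-rate-limit keeps roughly this kind of per-client counter.
function makeRateLimiter({ windowMs, max }) {
  const hits = new Map() // key (e.g. client IP) -> { count, windowStart }
  return function allow(key, now = Date.now()) {
    const entry = hits.get(key)
    if (!entry || now - entry.windowStart >= windowMs) {
      hits.set(key, { count: 1, windowStart: now })
      return true
    }
    entry.count += 1
    return entry.count <= max // false means respond with HTTP 429
  }
}

const allow = makeRateLimiter({ windowMs: 60000, max: 3 })
console.log(allow('1.2.3.4', 0)) // true
console.log(allow('1.2.3.4', 1)) // true
console.log(allow('1.2.3.4', 2)) // true
console.log(allow('1.2.3.4', 3)) // false (4th hit inside the window)
```

The window resets per key once windowMs has elapsed, so a well-behaved client is never blocked for long.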

DDOS protection, as far as I know, belongs at the infrastructure layer, because you want to stop traffic before it hits your server. There are a few services that offer this, but they are usually expensive and I have not used one before, so I can't recommend one personally. Probably worth investigating more.

Bcrypt Bcrypt is used to hash passwords so we are not storing them in plain text. This should happen on the server rather than the client. As long as the api is only served over HTTPS we shouldn't need to worry about users' passwords being stolen in transit.


Hope that all makes sense and keep the questions coming if you have them!

KristerV commented 7 years ago

Yeah I think I'm less of a reviewer and more of a student in this case too :D

JWT

So it's basically cookies in JSON form? I couldn't google it properly yesterday as my internet gets way worse in the evening. I'm reading up on it now, and it seems that instead of a cookie hash that is validated by the server, you have the server's signature serving the same purpose. Instead of the server saying "ah, you've got this key, you're an admin" it's saying "ah, that is my signature, so whatever info you've got is valid", and I can see why that may be a theoretical security concern. But if the JSON really is immutable (only the server can change it), then it makes no difference.

This brings me to the next question: is the reason for using this basically that you can have useful information right in the "cookie" (the JSON)? So, for example, instead of sending a PHPSESSID=xxx plus {userType: admin}, we would just send one JWT with both inside?

Rate limiter / DDOS

I think a package would be fine (later) too. And CloudFlare we can get when the need arises.

Bcrypt

Okay. However could we still use a salt to protect from rainbow tables?

jschr commented 7 years ago

JWT Yeah, this is my understanding as well. The JWT avoids needing to persist user roles and expiration in the database somewhere. While a cookie can be used for the same thing, it's more of a delivery mechanism, since you could decide to store the JWT in a cookie as well. Instead of using a cookie to send the JWT we are using the Authorization: Bearer <JWT> header pattern (example).

Rate limiter / DDOS 👍

Bcrypt I'm under the impression that it is using a salt, according to the docs:

bcrypt.hash(myPlaintextPassword, saltRounds, function(err, hash) {
  // Store hash in your password DB.
});

That will auto-generate a salt: https://github.com/kelektiv/node.bcrypt.js#to-hash-a-password

And here's where I'm doing it: https://github.com/KristerV/heliumpay-budgetweb-backend/pull/10/files#diff-a7bdf3a410ea3a94964ea84c60feb287R44

Does that seem right to you?

KristerV commented 7 years ago

hmm, so where is the salt taken from then or if it's generated where is it saved? On top of the hash maybe?

KristerV commented 7 years ago

ah I see. And the salt is saved within the hash itself. That's cool. Should up the cost factor to about 15 as per recommendation from that comment :D
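For reference, here's the anatomy of a bcrypt output string. The value below is example-shaped, not a real hash from this project:

```javascript
// A bcrypt hash is one string carrying everything a later compare()
// needs: algorithm version, cost factor, salt and hash.
// Example-shaped value (not a real account hash):
const stored = '$2b$10$N9qo8uLOickgx2ZMRZoMyeIjZAgcfl7p92ldGxad68LJZdL17lhWy'

// Fields: $<version>$<cost>$<22-char salt><31-char hash>
const [, version, cost, saltAndHash] = stored.split('$')
console.log(version)                  // '2b'
console.log(Number(cost))             // 10
console.log(saltAndHash.slice(0, 22)) // the embedded salt
```

So nothing extra has to be stored: compare() re-reads the cost and salt straight out of the stored string.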

jschr commented 7 years ago

Worth noting that the rounds are exponential (i.e. 2^10 = 1024, 2^16 ≈ 65k). One commenter on that post said it took 5-10 seconds to compare a password with 16 rounds, which IMO is too long. I'll experiment with the number tonight and see how long it takes on my machine.
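The arithmetic behind that, with the timing figures below being rough illustrations rather than measurements:

```javascript
// The cost factor is the log2 of the number of key-expansion
// iterations, so each +1 roughly doubles the hashing time.
const iterations = rounds => 2 ** rounds

console.log(iterations(10)) // 1024
console.log(iterations(16)) // 65536

// If cost 10 took ~100ms on some machine, cost 16 would take roughly
// 2 ** 6 = 64x longer (~6.4s), consistent with the "5-10 seconds"
// report mentioned above.
console.log(iterations(16) / iterations(10)) // 64
```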

jschr commented 7 years ago

BTW the only thing left is hashids and I'm close to being done with that :D

KristerV commented 7 years ago

Yeah just after commenting I thought it probably is exponential, otherwise it would be 1000 by default :D