Disallow reserved chars on id fields

nearform / udaru

Open source Access Manager for node.js

https://nearform.github.io/udaru

MIT License

124 stars 19 forks source link

Disallow reserved chars on id fields #529

Closed cianfoley-nearform closed 6 years ago

cianfoley-nearform commented 6 years ago

I've enforced all ids to take the format [A-Za-z0-9-]+ enforced in both database (to enforce for imports) and interface

It was problematic with ltree column as that only allows [A-Za-z0-9_]+

Our autogenerated ids returned back uuids with - and so that lead to inconsistencies between entities (teams autogenerted ids were replacing - with _ before insertion)

Rather than having inconsistencies i've enforced all ids to one format, and before insertion it translates - to _ for ltree paths, and when returning paths to interface translates back to - (done in mapping)

The reason I did not allow both characters on input is because ids could clash if we convert all - to and also allow on input

This makes for nice looking urls and protects from sql injection (a lot of sql injection tests were nullified with these input constraints)

mihaidma commented 6 years ago

Regarding the ids that i saw used in solutions: string ids, Auth0 has ids in the form auth0|5ad9b4708036f90f3cf2929f, Okta has string ids, there are also UUIDs in the form 9c5cbc3a-f7e9-4400-7f90-5d47e9a27a8e

mihaidma commented 6 years ago

@cianfoley-nearform so i spot one issue with the Auth0 ids that contain a | I would go with the minimal needed restrictions on characters

cianfoley-nearform commented 6 years ago

Fair points guys, but we restrict team IDs already to uuid with _ instead of dashes (nothing else can go in there because of the way paths work.

If we have a restriction on team ids why not on other entities? should they not be consistent?

cianfoley-nearform commented 6 years ago

See david's comment on best practice here: https://github.com/nearform/udaru/issues/523

Here's the link https://www.owasp.org/index.php/Input_Validation_Cheat_Sheet#Whitelisting_vs_blacklisting

based on this we should look to see what characters we should allow... i've often seen {} around uuids, and I in oath are 2 candidates, bear in mind that we won't be able to use these on team_ids

So, the question is do we we allow a free for all (minus url reserved characters that would affect the route), or do we whitelist and then have inconsistency between teams and other entities?

If there is a link to id in an external system, the metadata field could be used for the external id, or the external system could maintain a mapping... we currently don't store metadata on policies however, which I restricted in this PR

mihaidma commented 6 years ago

I would go with accepting all chars excepting the ones that are restricted by html or ltree but would not expand the limitations just for consistency reasons:

on teams limit to ltree allowed characters
on others limit to characters with no html meaning

cianfoley-nearform commented 6 years ago

ok if we go with whitelisting, which is the correct approach, we need to define what is allowed, it could be a long list, unless we stick with the RFCs unreserved characters

From RFC: http://www.ietf.org/rfc/rfc3986.txt

2.3.  Unreserved Characters

   Characters that are allowed in a URI but do not have a reserved
   purpose are called unreserved.  These include uppercase and lowercase
   letters, decimal digits, hyphen, period, underscore, and tilde.

      unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~"

except on team we only allow _

this does not allow for '|', google does recommend having pipe

do we want to allow others that require encoding?

coveralls commented 6 years ago

Coverage increased (+0.2%) to 93.174% when pulling 4f4b0336a65998fec3a20050de9ce87fc4c59250 on disallow-reserved-chars-on-id-fields into 28414a3067a2534f03c1fd4b3641e53f95b294cb on master.