Closed cianfoley-nearform closed 6 years ago
Regarding the ids that i saw used in solutions: string ids, Auth0 has ids in the form auth0|5ad9b4708036f90f3cf2929f
, Okta has string ids, there are also UUIDs in the form 9c5cbc3a-f7e9-4400-7f90-5d47e9a27a8e
@cianfoley-nearform so i spot one issue with the Auth0 ids that contain a | I would go with the minimal needed restrictions on characters
Fair points guys, but we restrict team IDs already to uuid with _ instead of dashes (nothing else can go in there because of the way paths work.
If we have a restriction on team ids why not on other entities? should they not be consistent?
See david's comment on best practice here: https://github.com/nearform/udaru/issues/523
Here's the link https://www.owasp.org/index.php/Input_Validation_Cheat_Sheet#Whitelisting_vs_blacklisting
based on this we should look to see what characters we should allow... i've often seen {} around uuids, and I in oath are 2 candidates, bear in mind that we won't be able to use these on team_ids
So, the question is do we we allow a free for all (minus url reserved characters that would affect the route), or do we whitelist and then have inconsistency between teams and other entities?
If there is a link to id in an external system, the metadata field could be used for the external id, or the external system could maintain a mapping... we currently don't store metadata on policies however, which I restricted in this PR
I would go with accepting all chars excepting the ones that are restricted by html or ltree but would not expand the limitations just for consistency reasons:
ok if we go with whitelisting, which is the correct approach, we need to define what is allowed, it could be a long list, unless we stick with the RFCs unreserved characters
From RFC: http://www.ietf.org/rfc/rfc3986.txt
2.3. Unreserved Characters
Characters that are allowed in a URI but do not have a reserved
purpose are called unreserved. These include uppercase and lowercase
letters, decimal digits, hyphen, period, underscore, and tilde.
unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
except on team we only allow _
this does not allow for '|', google does recommend having pipe
do we want to allow others that require encoding?
I've enforced all ids to take the format [A-Za-z0-9-]+ enforced in both database (to enforce for imports) and interface
It was problematic with ltree column as that only allows [A-Za-z0-9_]+
Our autogenerated ids returned back uuids with - and so that lead to inconsistencies between entities (teams autogenerted ids were replacing - with _ before insertion)
Rather than having inconsistencies i've enforced all ids to one format, and before insertion it translates - to _ for ltree paths, and when returning paths to interface translates back to - (done in mapping)
The reason I did not allow both characters on input is because ids could clash if we convert all - to and also allow on input
This makes for nice looking urls and protects from sql injection (a lot of sql injection tests were nullified with these input constraints)