opentdf / platform

Persistent data centric security that extends owner control wherever data travels
BSD 3-Clause Clear License

Policy API: FQN lookup should be normalized to lower case to match create/update of policy objects #669

Closed jakedoublev closed 6 months ago

jakedoublev commented 6 months ago

Background

Some PDPs and PEPs may store FQNs whose casing is out of sync with the lowercase FQNs expected by the platform and stored in the DB. Since FQNs are normalized to lowercase when policy objects are created or updated, lookup by FQN should be normalized as well.

Acceptance Criteria

  1. FQN lookup is normalized to lowercase
  2. Proto comments document the behavior
  3. Integration test to prove this behavior is satisfied
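A minimal sketch of the first criterion, assuming lookup applies the same lowercase normalization as create/update. The `attributeStore` type, `normalizeFQN`, and `GetByFQN` are hypothetical names for illustration only; they are not the platform's actual API, and the map stands in for the DB.

```go
package main

import (
	"fmt"
	"strings"
)

// attributeStore is a hypothetical in-memory stand-in for the policy DB.
// Keys are FQNs as persisted, i.e. already lowercase.
type attributeStore map[string]string

// normalizeFQN lowercases the whole FQN. Assumption: this mirrors the
// normalization applied when policy objects are created or updated.
func normalizeFQN(fqn string) string {
	return strings.ToLower(fqn)
}

// Create stores an attribute ID under the normalized FQN.
func (s attributeStore) Create(fqn, id string) {
	s[normalizeFQN(fqn)] = id
}

// GetByFQN normalizes the caller's FQN before the lookup, so a PEP or PDP
// that kept mixed-case FQNs still resolves to the stored object.
func (s attributeStore) GetByFQN(fqn string) (string, bool) {
	id, ok := s[normalizeFQN(fqn)]
	return id, ok
}

func main() {
	store := attributeStore{}
	store.Create("https://Example.com/attr/Classification/value/Secret", "attr-1")

	id, ok := store.GetByFQN("HTTPS://example.com/attr/classification/value/SECRET")
	fmt.Println(id, ok) // prints "attr-1 true"
}
```

An integration test for the third criterion could follow the same shape: create with one casing, look up with another, and assert that the same object comes back.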

pflynn-virtru commented 6 months ago

If the attributes are in the form of a URL (specifically https://), then it would be nice to uphold the rules of URL casing.

IIRC, the scheme and domain name are case-insensitive. The path (attribute name and value) and query parameters are case-sensitive.
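For comparison, here is a rough sketch of what URL-style casing would look like: only the scheme and host are lowercased, while the path and query keep their original case. It uses Go's standard `net/url` package; `normalizeURLCase` is a hypothetical helper, not something the platform provides.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// normalizeURLCase lowercases only the case-insensitive URL components
// (scheme and host) and leaves the path and query untouched.
func normalizeURLCase(raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", err
	}
	u.Scheme = strings.ToLower(u.Scheme)
	u.Host = strings.ToLower(u.Host)
	return u.String(), nil
}

func main() {
	out, err := normalizeURLCase("HTTPS://Example.COM/attr/Classification/value/Secret")
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // prints "https://example.com/attr/Classification/value/Secret"
}
```

Under this rule, `.../attr/Classification/value/Secret` and `.../attr/classification/value/secret` would remain two distinct attribute values.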

jakedoublev commented 6 months ago

> If the attributes are in the form of a URL (specifically https://), then it would be nice to uphold the rules of URL casing.
>
> IIRC, the scheme and domain name are case-insensitive. The path (attribute name and value) and query parameters are case-sensitive.

Thanks for the callout @pflynn-virtru. Here is the relevant ADR about allowed characters and the documented decision, which is enforced on create/update within the protos, with values normalized to lowercase on save to the DB.

jrschumacher commented 6 months ago

@pflynn-virtru I agree with your assessment about keeping with standards. This really calls into question whether using an FQN was the ideal solution, but I believe that ship sailed almost a decade ago.

At this point I believe we're going to need to take some strong positions that will conflict with RFC 3986. This might be similar to how Google's Gmail product diverges from email standards by aliasing all letter cases, period usage (jrschumacher = j.r.schumacher), and plus addressing (jrschumacher+abc123) to the same email address.

We have found that, from a developer-experience (DX) perspective, case sensitivity is far too complex when:

  1. data access resolution is handled at a different time from data encryption, so it is very easy to create inaccessible data without realizing it
  2. clients do not always have access to the available attributes, which leads to divergent client-side implementations
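The first failure mode can be illustrated with a small sketch. It assumes a hypothetical setup in which an FQN is written into the data's policy with mixed case at encryption time, and access is later decided against lowercase FQNs returned by the platform. `hasExact` and `hasFold` are illustrative helpers only.

```go
package main

import (
	"fmt"
	"strings"
)

// hasExact reports whether want is in granted using case-sensitive matching.
func hasExact(granted []string, want string) bool {
	for _, g := range granted {
		if g == want {
			return true
		}
	}
	return false
}

// hasFold reports whether want is in granted ignoring case.
// strings.EqualFold uses Unicode simple case folding, which agrees with
// lowercasing for ASCII FQNs.
func hasFold(granted []string, want string) bool {
	for _, g := range granted {
		if strings.EqualFold(g, want) {
			return true
		}
	}
	return false
}

func main() {
	// FQN attached to the data at encryption time (client kept mixed case).
	onData := "https://example.com/attr/Classification/value/Secret"
	// FQNs granted to the user at access time (lowercase, as stored).
	granted := []string{"https://example.com/attr/classification/value/secret"}

	fmt.Println(hasExact(granted, onData)) // prints "false": data silently inaccessible
	fmt.Println(hasFold(granted, onData))  // prints "true"
}
```

Nothing fails at encryption time in this sketch; the mismatch only surfaces later, at access time, which is what makes it hard to trace back to casing.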

In my own experience, this cost me multiple wasted hours spent thinking the failure was within my application.