The current implementation of GDPR FORGET does not permit anonymized values of NULL (seemingly, for owner names). We need to either enable this, or create a different means of expressing an "anonymized value".
Leaving these here as a reminder of potential cleanups:
[ ] Anonymization is done through the UpdateContext API. Currently, we work around it starting its own transaction by passing a boolean function argument. UpdateContext still updates the dataflow on its own, though.
[ ] We may anonymize records that end up being deleted (or were already deleted), and then run another quick round of deletion at the end to ensure they are gone.
[ ] Perhaps the code paths for owners and accessors can be combined a little more (for FORGET).
A potential cleaner design:
[ ] All operations for FORGET, including anonymization, are carried out as native operations in the RocksdbSession API. This no longer uses UpdateContext, and thus no longer updates flows on its own (and we can get rid of the boolean flag workaround).
[ ] Combine the records to be deleted and records to be anonymized. Apply anonymization first, and then delete everything that needs to be deleted in one shot (including if it was anonymized).
[ ] Single traversal, starting from the root data subject table and going into dependencies, as is done for accessors. Then, along ownership edges, we can still use the quick DeleteShard API, while using a more manual approach (as now) for accessors. Perhaps do only the anonymization in that traversal, then do a separate loop to just blindly delete the shard?
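A rough sketch of the "anonymize first, then delete in one shot" idea, to make the ordering concrete. This is Python against a hypothetical in-memory stand-in, not the real RocksdbSession API: `FakeSession`, `anonymize`, `delete_many`, `forget`, and the `"<anonymized>"` placeholder are all made-up names for illustration. The placeholder is a plain string only because NULL is currently not accepted as an anonymized value.

```python
from dataclasses import dataclass, field


@dataclass
class FakeSession:
    """Hypothetical stand-in for the storage session (NOT the real API)."""
    # Maps (table, primary key) -> dict of column values.
    rows: dict = field(default_factory=dict)

    def anonymize(self, key, columns, placeholder):
        # Overwrite the given columns of one row in place.
        row = self.rows[key]
        for col in columns:
            row[col] = placeholder

    def delete_many(self, keys):
        # Delete every listed row; rows that are already gone are ignored.
        for key in keys:
            self.rows.pop(key, None)


def forget(session, to_anonymize, to_delete, placeholder="<anonymized>"):
    """Two-phase FORGET sketch: anonymize everything, then delete once.

    to_anonymize: dict mapping row key -> list of columns to anonymize.
    to_delete: iterable of row keys to delete (may overlap to_anonymize).
    """
    # Phase 1: anonymize, without caring whether the row is deleted later.
    for key, columns in to_anonymize.items():
        if key in session.rows:
            session.anonymize(key, columns, placeholder)
    # Phase 2: a single deletion pass, including rows anonymized above.
    session.delete_many(to_delete)


session = FakeSession(rows={
    ("msg", 1): {"sender": "alice", "body": "hi"},
    ("msg", 2): {"sender": "alice", "body": "yo"},
    ("user", "alice"): {"name": "Alice"},
})
forget(
    session,
    to_anonymize={("msg", 1): ["sender"], ("msg", 2): ["sender"]},
    to_delete=[("msg", 2), ("user", "alice")],
)
```

Here `("msg", 2)` is anonymized and then deleted anyway, which wastes one write but keeps the logic to a single deletion pass with no follow-up cleanup round. Whether that trade-off holds for the real storage layer is an open question.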