mogilvie / EncryptBundle

Encryption bundle
82 stars 27 forks source link

Update doctrine entity changeset data onLoad, preFlush and postFlush. #34

Open achterin opened 1 year ago

achterin commented 1 year ago

When using this bundle in conjunction with DamienHarper/auditor-bundle one can observe that the entity changeset produced is plain wrong.

What I expect to see on insert: Old value: null New value: encrypted new value

Instead you get this: Old value: encrypted new value New value: decrypted new value

What I expect to see on change: Old value: encrypted old value New value: encrypted new value

Instead you get this: Old value: decrypted new value New value: encrypted new value

I've spend quite some time to figure out how this can be fixed and tracked it down to two problems: 1) onFlush is the wrong event to encrypt on change as the changeset is already created by doctrine with decrypted values. preFlush would be better as this allows to populate the entity-properties with the encrypted values and doctrine will create the changeset just fine. 2) onLoad needs to set the encrypted value as original data, not the decrypted one.

Right now I'm stuck sadly because it seems that MySQL and Postgres work differently. With MySQL I can rely on new entities not being in the doctrine entity-map, but the get inserted there when pre-flush is fired when using Postgres.

I think the current behavior is not desired as this will result in sensitive data being leaked as they get written to the audit tables in plaintext. On top of that the changes written to audit tables are plain wrong and are more or less useless.

I am looking forward to your opinion on this matter.

mogilvie commented 1 year ago

Hi @achterin, which version of the bundle are you using? There was a merge done a few days ago into master which changed the events, or do I need to look at the taged versions?

I'll have a look at the auditor bundle. We don't use that ourselves. and see if that help me replicate your analysis.

achterin commented 1 year ago

Hi @mogilvie , thanks for your response. I was using last release tag as well as latest master for my test. I'm looking forward to your feedback on this matter.

mogilvie commented 1 year ago

Hi @achterin

Doctrine calculates it's change set by comparing the flushed entities in the entityMap against the originalEntityData. If we leave, or set, the originalEntityData to the encrypted value, then it always triggers a change state because the decrypted value is always different from the original encrypted value.

We cannot simply re-encrypt the value during preFush, because the encryption process is not repeatable. Re-encrypting the same value creates a different encrypted value, even if the decrypted data is exactly the same. This is an desired outcome with the encryption process.

The above two suggestions would result in every encrypted property being re-persisted for all entities, along with unnecessary re-encryption. This imposes additional workload on the persistance process.

So the reason that the onLoad subscriber sets the original data as the decrypted value, is to avoid triggering a change event in the unitOfWork.

However, you could achieve your desired outcome by creating your own DoctrineEncryptSubscriber, and implementing an array of initial onLoad decrypted properties and values. Later, during onPreFlush compare the entitySet against the onLoad decrypted array of properties. If the decrypted values are the same, then remove the property from the changeSet.

This method also caries an overhead with it, depending on how many encrypted properties you might have. I don't want to impose this overhead on every user of the bundle.

The bundle already allows you to define your own DoctrineEncryptsubscriber in your Config\Packages directory.

spec_shaper_encrypt:
  subscriber_class: App\EventSubscriber\YourDoctrineEncryptSubscriber

If you do find a low overhead way of achieving the desired outcome then please submit a pull request.

mogilvie commented 1 year ago

I'm still interested in doing this. So I might try and measure what that overhead might be, and test if it's significant.