makinacorpus / DbToolsBundle

A PHP library to backup, restore and anonymize databases
https://dbtoolsbundle.readthedocs.io
MIT License

Filtering #138

Closed bertdk closed 5 months ago

bertdk commented 7 months ago

Is it possible to apply the anonymization to a specific set of entities? I would like to anonymize only entities that have a deletedAt timestamp greater than or equal to now, or to filter them by specifying a list of ids. Use case: running a daily anonymize command on my production database to mask deleted entities.

SimonMellerin commented 7 months ago

For now, it's not easy to do at all.

But what you are asking for is somewhat related to what is asked in #136.

We'll definitely think about adding this kind of feature in the future.

maxhelias commented 5 months ago

I do something similar, but to avoid anonymizing our own user accounts. I extended the email anonymizer by simply adding these instructions after the call to the parent anonymize method:

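public function anonymize(Update $update): void
{
    // Let the parent email anonymizer set up the column update first.
    parent::anonymize($update);

    // Note: $expr is not defined in this snippet. It is presumably the query
    // builder's expression factory (for example obtained via $update->expression()).
    $where = $update->getWhere();
    // Exclude addresses from our own domains from the update.
    $where
        ->isNotLike($expr::column($this->columnName, $this->tableName), '%@mycompany.com')
        ->isNotLike($expr::column($this->columnName, $this->tableName), '%@myclient.com')
    ;
}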

But these only apply to the column, not the entire row. In my case it's still ok.

I haven't done any more research, but could a PHP attribute be added on the entity to apply a WHERE condition to the whole row? Something like:

#[Where(
    operator: Where::AND,
    raw: [
        new Not(new Like('email', '%@mycompany.com', true)),
        new Not(new Like('email', '%@myclient.com', true)),
    ]
)]

pounard commented 5 months ago

Well, this tool wasn't designed with this use case in mind.

There is a solution that is easy for us but would require some code on your side: inject the Symfony event dispatcher into the anonymizator and create a few event classes (before update, after update, etc.), so that you can alter the query at the right moment.

That said, you should read my comment at https://github.com/makinacorpus/DbToolsBundle/issues/136#issuecomment-2072345062, which explains why this is not a feature right now.
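
To illustrate the event-based idea above, here is a minimal, self-contained sketch. It does not use the real DbToolsBundle or Symfony classes: `UpdateQuery`, `BeforeUpdateEvent`, `SimpleDispatcher` and `buildAnonymizeQuery()` are hypothetical stand-ins, and the `customer` table and `deleted_at` column are assumptions made up for the example. It only shows how a listener could add a row filter before the update query is built.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for the query builder's UPDATE object.
final class UpdateQuery
{
    /** @var list<string> */
    private array $conditions = [];

    public function __construct(public readonly string $table)
    {
    }

    // Add a raw condition; all conditions are joined with AND.
    public function where(string $condition): void
    {
        $this->conditions[] = $condition;
    }

    public function toSql(): string
    {
        $sql = 'UPDATE ' . $this->table . ' SET email = md5(email)';
        if ($this->conditions !== []) {
            $sql .= ' WHERE ' . implode(' AND ', $this->conditions);
        }

        return $sql;
    }
}

// Hypothetical "before update" event that exposes the query to listeners.
final class BeforeUpdateEvent
{
    public function __construct(public readonly UpdateQuery $update)
    {
    }
}

// Minimal dispatcher, standing in for Symfony's event dispatcher.
final class SimpleDispatcher
{
    /** @var array<string, list<callable>> */
    private array $listeners = [];

    public function addListener(string $eventClass, callable $listener): void
    {
        $this->listeners[$eventClass][] = $listener;
    }

    public function dispatch(object $event): object
    {
        foreach ($this->listeners[$event::class] ?? [] as $listener) {
            $listener($event);
        }

        return $event;
    }
}

// Hypothetical anonymizator step: build the query, let listeners alter it.
function buildAnonymizeQuery(SimpleDispatcher $dispatcher, string $table): string
{
    $update = new UpdateQuery($table);
    $dispatcher->dispatch(new BeforeUpdateEvent($update));

    return $update->toSql();
}

// User-side code: only anonymize soft-deleted rows of the customer table.
$dispatcher = new SimpleDispatcher();
$dispatcher->addListener(
    BeforeUpdateEvent::class,
    static function (BeforeUpdateEvent $event): void {
        if ($event->update->table === 'customer') {
            $event->update->where('deleted_at IS NOT NULL');
        }
    }
);

$sql = buildAnonymizeQuery($dispatcher, 'customer');
$unfiltered = buildAnonymizeQuery($dispatcher, 'invoice');
echo $sql, PHP_EOL;
```

In this sketch the filter lives entirely in user code, so the library would only need to dispatch the event at the right moment. Tables without a matching listener condition are updated without any filter.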

pounard commented 5 months ago

@maxhelias we discussed it internally and I opened https://github.com/makinacorpus/DbToolsBundle/issues/171 for your use case; it seems legitimate after all.

@bertdk regarding the original topic of this issue: altering production data by using this API directly on your production database is not a use case we intend to support. Some future features may help you do that, but it remains a non-goal for us.