Open peterjaap opened 3 years ago
The way I tackled this issue before issuing #61 was to code a simple app that would query the MySQL database for its list of tables and then search the base configuration directory for YAMLs that mention those tables. The YAMLs that were needed were then bundled into a separate 'final' directory, and Masquerade was run against that final configuration.
This resolved the errors that occurred when a table didn't exist, as I was only running Masquerade against tables I was sure existed. The downside: I had to split the custom config YAMLs into one YAML per table, because not every table in a group necessarily exists. For example, email_table1 and email_table2 are both configured in one YAML (email.yaml), but only email_table1 exists in the database - Masquerade would still try to run anonymization for the non-existent email_table2.
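A minimal sketch of the filtering step described above, in Python rather than as part of Masquerade itself. It assumes a parsed config has the shape group -> table -> columns, and the function name and sample data are hypothetical. Filtering per table instead of per file would also avoid having to split email.yaml into one YAML per table.

```python
from typing import Dict, Set

# Assumed shape of a parsed config YAML: {group: {table: {"columns": {...}}}}
Config = Dict[str, Dict[str, dict]]


def prune_to_existing_tables(config: Config, existing_tables: Set[str]) -> Config:
    """Return a copy of `config` that only contains tables present in the database."""
    pruned: Config = {}
    for group, tables in config.items():
        kept = {name: spec for name, spec in tables.items() if name in existing_tables}
        if kept:  # drop groups that end up with no tables at all
            pruned[group] = kept
    return pruned


# Hypothetical data mirroring the email.yaml example above.
email_config = {
    "email": {
        "email_table1": {"columns": {"address": {"formatter": "email"}}},
        "email_table2": {"columns": {"address": {"formatter": "email"}}},
    }
}
# Stand-in for the table list you would get from the database (e.g. SHOW TABLES).
existing = {"email_table1", "customer_entity"}

final_config = prune_to_existing_tables(email_config, existing)
print(sorted(final_config["email"]))  # prints ['email_table1']
```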
As for how to share these YAMLs: they should be easy to access and maintain. Storing them in a separate repo may introduce more management issues - how would users retrieve them easily? How would the app itself obtain them?
It would be nice if the app could provide default configuration YAMLs on demand, so users can choose which ones they want to use, or even offer only the configs that suit the user's database. That way the configs could be stored and maintained in the app's repo and be built into the application, while users could either run them as-is or customize how the anonymization process runs.
Those are just my thoughts though, and I'm not a PHP dev myself - let me know if they make any sense, or if some implementation issues would be a problem.
I have some ideas! I have two criteria for this:
I suggest one of these:
- .masquerade/config/ or similar
- vendor/*/*/composer.json files which match certain criteria
- git submodule add git@github.com:INITECH/MASQUERADE_MODULE.git .masquerade/config/initech
- wget -O - https://github.com/xxxxx/xxxxx/releases/xxxx.zip | unzip - .masquerade/config/
Maybe there's even another option - the masquerade phar file could 'require' the vendor/autoload.php from the current folder, and scan it for classes - but believe me, scanning all possible classes causes various problems and requires composer dumpautoload --optimize, which isn't the default.
I like the first one - simple and can be used with "require-dev" to ensure unnecessary modules don't go into production environments.
@johnorourke We could introduce a --strict-mode flag to throw an exception on missing configs / missing tables / missing columns. Seems easy enough.
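To make the strict-mode idea concrete, here is a rough Python sketch (not Masquerade's actual code). It assumes the config shape group -> table -> columns and a schema given as table -> set of column names. `check_config`, `MissingSchemaError` and the sample data are hypothetical names, and it only covers missing tables and columns, not missing configs.

```python
from typing import Dict, List, Set


class MissingSchemaError(Exception):
    """Raised in strict mode when a configured table or column does not exist."""


def check_config(
    config: Dict[str, Dict[str, dict]],
    schema: Dict[str, Set[str]],
    strict: bool = False,
) -> List[str]:
    """Compare the config against the schema; return the problems, or raise if strict."""
    problems: List[str] = []
    for group, tables in config.items():
        for table, spec in tables.items():
            if table not in schema:
                problems.append(f"{group}: table '{table}' does not exist")
                continue  # no point checking columns of a missing table
            for column in spec.get("columns", {}):
                if column not in schema[table]:
                    problems.append(f"{group}: column '{table}.{column}' does not exist")
    if strict and problems:
        raise MissingSchemaError("; ".join(problems))
    return problems


# Hypothetical config and schema: one missing column and one missing table.
config = {
    "email": {
        "email_table1": {"columns": {"address": {}, "name": {}}},
        "email_table2": {"columns": {"address": {}}},
    }
}
schema = {"email_table1": {"address"}}

# Default (lenient) mode: problems are reported, and the caller can skip them.
for problem in check_config(config, schema):
    print("skipping -", problem)
```

With strict=True the same call raises MissingSchemaError instead of returning the list, which is roughly what the flag would toggle.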
I'd be in favor of the composer repo as well. I'd suggest elgentos/masquerade-configs. Then we could add a console command to this repo that can be run with composer's post-install-cmd (when the config package is present) to ask which files should be copied from that repository. It could then create a .masquerade-installed file to make sure this isn't run automatically on each install (and assume it is when --no-interaction is passed).
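The marker-file logic could look roughly like this. It is a Python sketch of the idea only, not the proposed console command. `install_configs` and the directory layout in the demo are hypothetical, the `chosen` argument stands in for the user's answers to the prompt, and I'm reading the --no-interaction remark as "behave as if the marker were already there".

```python
import shutil
import tempfile
from pathlib import Path
from typing import Iterable, List

MARKER = ".masquerade-installed"  # marker file name taken from the comment above


def install_configs(
    source_dir: Path,
    target_dir: Path,
    project_root: Path,
    chosen: Iterable[str],
    interactive: bool = True,
) -> List[str]:
    """Copy the chosen YAML files once; later runs are no-ops."""
    marker = project_root / MARKER
    # Non-interactive runs are treated as if the marker already existed.
    if marker.exists() or not interactive:
        return []
    target_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in chosen:
        shutil.copy2(source_dir / name, target_dir / name)
        copied.append(name)
    # Record what was installed so the next install skips the prompt.
    marker.write_text("\n".join(copied) + "\n")
    return copied


# Demo in a throwaway directory; all paths below are assumptions.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    source = root / "vendor" / "elgentos" / "masquerade-configs"
    source.mkdir(parents=True)
    (source / "email.yaml").write_text("email: {}\n")
    target = root / "config"
    first = install_configs(source, target, root, ["email.yaml"])
    second = install_configs(source, target, root, ["email.yaml"])
    print(first, second)  # prints ['email.yaml'] []
```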
That's a great idea @peterjaap - the single repo would keep them all tidy, easy to fork, allow management of PRs and issues etc, and the post-install-cmd hook would make it really simple to use.
It would need to know the 'platform' config folder the user wants them in - in masquerade core there are several config file locations - perhaps auto-detect to see if any are in use, and/or let the user choose that too?
We'd need to consider updates too - eg. you run and install it, but then later an update to one of the vendor-specific files is released in the composer module - perhaps just warn the user during the post-update-cmd hook if they might be running out of date files?
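One way the out-of-date warning could work is to compare content hashes of the installed files against the copies shipped in the package. This is a Python sketch under that assumption. `find_outdated` and the directory names are hypothetical, and a hash mismatch cannot tell an upstream update apart from a local customization, so it can only support a "might be out of date" warning.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import List


def file_digest(path: Path) -> str:
    """SHA-256 of the file contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def find_outdated(installed_dir: Path, package_dir: Path) -> List[str]:
    """Names of installed YAML files whose content differs from the package copy."""
    outdated = []
    for installed in sorted(installed_dir.glob("*.yaml")):
        upstream = package_dir / installed.name
        # Files that do not come from the package are ignored.
        if upstream.exists() and file_digest(installed) != file_digest(upstream):
            outdated.append(installed.name)
    return outdated


# Demo in a throwaway directory with one changed and one unchanged file.
with tempfile.TemporaryDirectory() as tmp:
    installed_dir = Path(tmp) / "installed"
    package_dir = Path(tmp) / "package"
    installed_dir.mkdir()
    package_dir.mkdir()
    (installed_dir / "email.yaml").write_text("email: {version: 1}\n")
    (package_dir / "email.yaml").write_text("email: {version: 2}\n")
    (installed_dir / "customer.yaml").write_text("customer: {}\n")
    (package_dir / "customer.yaml").write_text("customer: {}\n")
    outdated = find_outdated(installed_dir, package_dir)

for name in outdated:
    print(f"warning: {name} differs from the packaged version")
```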
Trying to move this to Discussions but can't find the option? https://docs.github.com/en/discussions/managing-discussions-for-your-community/managing-discussions-in-your-repository#converting-issues-based-on-labels
Now that we've added a try/catch block we can add YAML definitions for tables that don't necessarily have to be present in an install.
In our projects, we include YAML files for all possible extensions we use. If a project doesn't have that table, it'll just skip it now.
I've put a few of those YAML files in the Wiki, see:
What would be the best way to share these? I don't think adding them to Masquerade itself would be wise, since that'll clutter stuff and maybe even introduce unexpected behavior. A separate repo maybe? Keep placing them in the wiki? Any other ideas?
cc @tdgroot @johnorourke @Tjitse-E @erikhansen