Open thislg opened 7 months ago
@thislg
Suggestions:
on_invalid: abort # abort|ignore
Should the on_invalid setting be set on copy? Then the copy_condition: valid IS TRUE setting on copy would be redundant.
Hello, is this piece of code from the documentation? I ask because nothing there indicates that we can define a custom configuration, especially with "post_load".
The load and copy options are documented (see https://github.com/le-phare/import-bundle/blob/master/docs/configure/load.md and https://github.com/le-phare/import-bundle/blob/master/docs/configure/copy.md). You can't add arbitrary options, so the post_load option does not exist, but in this issue I suggest adding it so we can add validation constraints.
A lot of the time we have to add custom code to validate loaded data before copying it. A common use case is to ignore duplicate lines but still continue the import.
It could be set like this:
A subscriber on ImportEvents::POST_LOAD would then execute an UPDATE on the temporary table to set the "valid" field to false on failing rows. In case of a validation error, when on_invalid is set to "ignore", it would log "Unique code validation constraint failed. Skipping duplicate my_resource_name (code: 12345) at lines 4, 5, 6" and the import would continue without copying the invalid lines. If on_invalid is set to "abort", it would stop the import without copying the data.

Other validation constraints could be added, like format validation (regex), etc. A simpler option would be to skip the validation config and instead add an option to run an arbitrary SQL query on post_load to set the "valid" flag.
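
To make the proposed behaviour concrete, here is a minimal sketch of the post_load validation step. It is not the bundle's API: the table name temp_my_resource, the line_number column and the validate_unique_code helper are all hypothetical, and Python's sqlite3 stands in for the real database only so the example is runnable. It also assumes the first occurrence of a code stays valid and only later duplicates are flagged, which the proposal does not specify.

```python
import sqlite3
from itertools import groupby

# Hypothetical query a post_load step could run: flag every row whose
# code already appeared on an earlier line (assumption: the first
# occurrence stays valid, later ones are duplicates).
FLAG_DUPLICATES_SQL = """
UPDATE temp_my_resource
SET valid = 0
WHERE EXISTS (
    SELECT 1
    FROM temp_my_resource AS earlier
    WHERE earlier.code = temp_my_resource.code
      AND earlier.line_number < temp_my_resource.line_number
)
"""


def validate_unique_code(conn, on_invalid="abort"):
    """Flag duplicate rows, then apply the on_invalid policy (abort|ignore)."""
    conn.execute(FLAG_DUPLICATES_SQL)
    invalid = conn.execute(
        "SELECT code, line_number FROM temp_my_resource "
        "WHERE valid = 0 ORDER BY code, line_number"
    ).fetchall()
    # One log message per duplicated code, listing the skipped lines.
    messages = [
        "Unique code validation constraint failed. "
        "Skipping duplicate my_resource_name (code: %s) at lines %s"
        % (code, ", ".join(str(line) for _, line in rows))
        for code, rows in groupby(invalid, key=lambda row: row[0])
    ]
    if invalid and on_invalid == "abort":
        # "abort": stop before anything is copied.
        raise RuntimeError("Import aborted: " + "; ".join(messages))
    # "ignore": the caller logs the messages and copies only valid rows,
    # e.g. with copy_condition: valid IS TRUE.
    return messages


def build_demo_table():
    """Create an in-memory stand-in for the temporary import table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE temp_my_resource "
        "(line_number INTEGER, code TEXT, valid INTEGER DEFAULT 1)"
    )
    conn.executemany(
        "INSERT INTO temp_my_resource (line_number, code) VALUES (?, ?)",
        [(1, "A1"), (2, "B2"), (3, "12345"),
         (4, "12345"), (5, "12345"), (6, "12345")],
    )
    return conn
```

With on_invalid set to "ignore", the helper returns the log messages and leaves the flagged rows in place, so the copy step can filter on the valid flag. With "abort", it raises before any copy would happen.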