Closed staabm closed 8 months ago
Fixed by #110
@staabm I forgot to warn you about something
You mentioned creating a dump of a big database, and in another issue, you mentioned using the faker converter. In some situations it can have a noticeable impact on performance, because a lot of faker formatters are not optimized for being used billions of time.
This is why the default templates mainly use custom converters instead of faker.
However, it only starts being problematic when you're using it to convert at least hundred of millions of values.
thanks for the heads up. I am already in the perf optimizing process and looking for bottlenecks.
see e.g.
thanks for the heads up. I am already in the perf optimizing process and looking for bottlenecks.
see e.g.
* [PDO quote bottleneck php/php-src#13440](https://github.com/php/php-src/issues/13440) * [Inline escape() method to improve performance ifsnop/mysqldump-php#277](https://github.com/ifsnop/mysqldump-php/pull/277)
Nice!
FYI we switched to druidfi/mysqldump a while ago because the original repo was inactive, you might also need to create a PR on their repo.
noted, thanks ;-)
40-50% faster dump with https://github.com/druidfi/mysqldump-php/pull/37
I combined 3 performance related PRs to one for testing: https://github.com/druidfi/mysqldump-php/pull/38
@guvra There is also commands if you wanna test it with this library. My first test was good as time went down from 40s to 5s when creating a dump.
@back-2-95 I made a quick test on a medium size magento database.
So it looks pretty good :+1:
On another note, I used #113 to monitor the total execution time (this PR adds a progress bar to gdpr-dump when using the -v
option). The progress bar uses setInfoHook
from mysqldump-php to make the progress bar advance. And it doesn't behave like expected, because this hook is triggered after a table was dumped, not before. So when the progress bar displays a table name, the table was actually already dumped, whereas I would expect to see the name of the table that is currently being dumped.
Preconditions
GdprDump Version: 4.0.3
PHP Version: any
Database Version: any
Steps to reproduce
when running with a complex schema and a big dump the process can take hours. when unqiue-value rules are not properly configured "dump-time" errors can occur like:
the error message itself is not very helpful when a big config file is involved which contains several unique rules.
Expected result
a error message, which denotes which column in which table could not be uniquely randomized
Actual result