rwnx / pynonymizer

A universal tool for translating sensitive production database dumps into anonymized copies.
https://pypi.org/project/pynonymizer/
MIT License
102 stars 38 forks source link

Close batch_processor stdin stream permanently when dump restoration is finished #87

Closed dosera closed 2 years ago

dosera commented 2 years ago

Description

I am running pynonymizer programmatically and noticed that after each run there is a dangling MySQL connection (same is expected for PostgreSQL):

mysql -h database-host -P 3306 -u user -px xx --ssl my_db

This piles up quite quickly.

Fix

I narrowed it down to restore_database in pynonymizer/database/mysql/__init__py where a batch_processor is created via subprocess.Popen(..).stdin but never closed.

Setup

rwnx commented 2 years ago

Hi @dosera ! Thanks for your contribution ✨

Would you mind adding an entry to the CHANGELOG.md under the [Unreleased] section describing the bug you've fixed? This would help a lot.

Just for my curiosity: presumably this is more important when you're calling pynonymizer programmatically? Or does it also happen when you're using the cli?

dosera commented 2 years ago

Hi @jerometwell ,

Would you mind adding an entry to the CHANGELOG.md under the [Unreleased] section describing the bug you've fixed? This would help a lot.

Done

Just for my curiosity: presumably this is more important when you're calling pynonymizer programmatically? Or does it also happen when you're using the cli?

I have no troubles with this when running via cli - my best guess (yet I am not too familiar with POpen in python) is, that when the python process stops any (sub)process opened by it is closed as well. I have a continuously running python module that calls the pynonymizer.run() repeatedly based on some cron schedule. Without this patch every call left one connection to the mysql database running.

rwnx commented 2 years ago

That makes perfect sense! Thanks for explaining - i don't get a lot of insight into how people are using it 😇

I'll merge this now and then it should go as a patch release over the next few days. Roll on v1.21.3!