techouse / mysql-to-sqlite3

Transfer data from MySQL to SQLite
https://techouse.github.io/mysql-to-sqlite3/
MIT License
217 stars 31 forks source link

Exclude tables from copying. #27

Closed AceScottie closed 2 years ago

AceScottie commented 2 years ago

Please add the ability to exclude tables from being downloaded (using phpmyadmin so there are a lot of pma_* tables)

The fix i implemented from pypi download so i can use:

    def __init__(self, **kwargs):
        #...
        self._exclude = kwargs.get("exclude") or []
        #...
    def transfer(self):
        """The primary and only method with which we transfer all the data."""
        filtered = [] # ensure there is a valid list object
        if len(self._exclude) > 0:
            self._mysql_cur_prepared.execute(
                """
                SELECT TABLE_NAME
                FROM information_schema.TABLES
                WHERE TABLE_SCHEMA = SCHEMA()
                {exclude_format}
            """.format(
                exclude_format = ('AND ({phrase})').format(phrase=' OR '.join(['TABLE_NAME LIKE "{e}"'.format(e="%{exclude}%".format(exclude=x)) for x in self._exclude])) # this line may need cleaning up a bit, it just formats the list into sql string.
                )
            ) # SELECT ... AND (TABLE LIKE "%pma%" OR TABLE_NAME LIKE "%admin%") 
            filtered = ("'%s'"%row[0] for row in self._mysql_cur_prepared.fetchall()) # returns list of exluded tables
        if len(self._mysql_tables) > 0:
            # transfer only specific tables

            self._mysql_cur_prepared.execute(
                """
                SELECT TABLE_NAME
                FROM information_schema.TABLES
                WHERE TABLE_SCHEMA = SCHEMA()
                AND TABLE_NAME IN ({placeholders})
                AND TABLE_NAME NOT IN ({filtered})
            """.format(
                    placeholders=', '.join(self._mysql_tables), # quick fix
                    filtered = ','.join(filtered) # added filter option 
                )
            )
            tables = (row[0] for row in self._mysql_cur_prepared.fetchall())
        else:
            # transfer all tables
            self._mysql_cur.execute(
                """
                SELECT TABLE_NAME
                FROM information_schema.TABLES
                WHERE TABLE_SCHEMA = SCHEMA()
                AND TABLE_NAME NOT IN ({filtered})
            """.format(filtered = ', '.join(filtered)) # added filter option 
            )
            tables = (row[0].decode() for row in self._mysql_cur.fetchall())

This solution can probably be merged with the existing SELECT TABLE_NAME statements, i made it separate to keep track of it.

Also: https://github.com/techouse/mysql-to-sqlite3/blob/8558f1c78cd27e99407da81e976e6b7666829c4e/mysql_to_sqlite3/transporter.py#L506 can be replace with placeholders=', '.join(self._mysql_tables) for a bit easier reading (included in example above as well).

techouse commented 2 years ago

Hi,

If I understand you correctly, you basically you want the reverse of the already existing option

-t, --mysql-tables TUPLE        Transfer only these specific tables (space
                                  separated table names). Implies --without-
                                  foreign-keys which inhibits the transfer of
                                  foreign keys.

Can't you just use this one?

Can't you simply remove the subset of tables from your MySQL database before transferring it to SQLite using some proper MySQL DB admin?

If the answer to both of these questions is no, then please submit a PR (with tests 🤓), however, using wildcards like pma_% here is very dangerous and I strongly advise against it.

AceScottie commented 2 years ago

In my DB I have quite a few tables and they may be added to or removed so using hardcoded mysql_tables would be inefficient (having to created new connections to find the list of tables). I have removed the wildcards from the query and left it up to user input, it can be removed completely if needed by changing 'TABLE_NAME LIKE' to 'TABLE_NAME =' (line 508 in PR)

This is a live DB and using phpmyadmin is the quickest way to make changes, so removing pma_% tables would break phpmyadmin.

I have created a PR per your suggestion but some code I cannot test (I haven't implemented foreign keys yet) so it will need a look over. I switched around the query a bit to format it so only 1 query is needed rather than 2. I am bad with formatting so please look through to make sure its ok :)

techouse commented 2 years ago

Hey,

Who says that they have to be hardcoded. You could simply list them all and exclude them using some smart filtering. Also, have you considered using some other tool except PHPMyAdmin? I know it's convenient but there are other more powerful free cross-platform desktop apps like DBeaver.

techouse commented 2 years ago

Version 1.4.13 now has the feature

-e, --exclude-mysql-tables TUPLE
                                  Transfer all tables except these specific
                                  tables (space separated table names).
                                  Implies --without-foreign-keys which
                                  inhibits the transfer of foreign keys. Can
                                  not be used together with --mysql-tables.