HASEL-UZH / PersonalAnalytics

Personal Analytics project to increase knowledge workers' awareness about work and productivity.
https://hasel.dev/project/individual-productivity/
Other
56 stars 18 forks source link

Test and create Guide for Reading encrypted sqlite file in Python #299

Open casaout opened 8 months ago

casaout commented 8 months ago

Note that work in progress is described below and in this branch.

casaout commented 8 months ago

As briefly discussed yesterday, we think that it's better to not work with encrypted sqlite-databases to avoid making it too complicated for other researchers for their data analysis. Hence, we decided to create an encrypted zip-file.

I've started testing how reading an encrypted zip-file with a non-encrypted sqlite database works in Python, but am experiencing character encoding issues as summarized below. For time reasons, I'd like to ask you @SRichner to try and see if you can make it work.

What I tried:

Thoughts:

import sqlite3
import pyzipper
import io

subject = "32XSB1"
zipPath = "C:\\TEMP\\PA\\PA_32XSB1.encrypted.zip"
pwd = "PersonalAnalytics_32XSB1"
dbfile = "database.sqlite"
tableName = "user_input"

charset = "iso-8859-1" # error: 'near "SQLite": syntax error'
# charset = "utf-8" # error: 'utf-8' codec can't decode byte 0xe7 in position 27: invalid continuation byte

with pyzipper.AESZipFile(zipPath) as zf:
    # try 1: extracting then opening the file (in explorer) works
    # zf.extractall(path='C:\\TEMP\\PA\\PA_32XSB1_decrypted', pwd = bytes(pwd, 'utf-8'))
    # print(zf.infolist())

    # try 2: read log/txt file from encrypted zip file works

    # try 3: read sqlite file from encrypted zip file results in charset issues
    for file_info in zf.infolist():
            print(file_info)
            if file_info.filename == dbfile:
                with zf.open(file_info, pwd=bytes(pwd, charset)) as sqlite_file:

                    sqlite_data = sqlite_file.read()
                    conn = sqlite3.connect(':memory:')
                    cursor = conn.cursor()
                    sqlite_data_str = sqlite_data.decode(charset)

                    # print(sqlite_data_str) # printing the string actually shows decrypted content

                    # bug: 'near "SQLite": syntax error'
                    cursor.executescript(sqlite_data_str) # conn.executescript(sqlite_data_str) # same issue

                    # Query to list all tables in the database
                    cursor.execute("SELECT name FROM external.sqlite_master WHERE type='table';")

                    # Fetch and print the table names
                    tables = cursor.fetchall()
                    for table in tables:
                        print(f"Table: {table[0]}")
                        cursor.execute(f"PRAGMA table_info(external.{table[0]});")
                        columns = cursor.fetchall()
                        print("Columns:")
                        for column in columns:
                            print(column)

                    conn.close()