PolicyEngine / policyengine-api

PolicyEngine's REST API for computing policy impacts.
GNU Affero General Public License v3.0
7 stars 19 forks source link

Replace `WIDOW` with `SURVIVING_SPOUSE` in policy database #1400

Open MaxGhenis opened 2 months ago

MaxGhenis commented 2 months ago

To fix https://github.com/PolicyEngine/policyengine-app/issues/1599

MaxGhenis commented 2 months ago

Just hit this again, can someone take a look?

abhcs commented 1 month ago

Running the following script with the correct database user and password can fix the problem:

from copy import deepcopy
from google.cloud.sql.connector import Connector
import sqlalchemy # assuming 2.0, code is written in "commit as you go" style
import json

# initialize Connector object
connector = Connector()

# function to return the database connection
def getconn():
    conn = connector.connect(
        "policyengine-api:us-central1:policyengine-api-data",
        "pymysql",
        user="my-user",
        password="my-password",
        db="policyengine",
    )
    return conn

# create connection pool
pool = sqlalchemy.create_engine(
    "mysql+pymysql://",
    creator=getconn,
)

with pool.connect() as db_conn:
    # query database
    result = db_conn.execute(
        sqlalchemy.text("SELECT * from policy WHERE country_id = us")
    ).fetchall()

    # scan the results for .WIDOW key in policy_json and replace with .SURVIVING_SPOUSE
    for row in result:
        policy = dict(row)
        policy_json = json.loads(policy["policy_json"])
        modification_needed = any(".WIDOW" in key for key in policy_json.keys())
        if modification_needed:
            # create copy of policy_json but with new keys
            new_json = deepcopy(policy_json)
            for key in policy_json.keys():
                if ".WIDOW" in key:
                    new_key = key.replace(".WIDOW", ".SURVIVING_SPOUSE")
                    new_json[new_key] = policy_json[key]
                    del new_json[key]

            # update statement
            update_stmt = sqlalchemy.text(
                "UPDATE policy SET policy_json=:new_json WHERE policy_hash=:policy_hash",
            )
            db_conn.execute(
                update_stmt,
                parameters={
                    "new_json": json.dumps(new_json),
                    "policy_hash": policy["policy_hash"],
                },
            )

    # uncomment to commit transaction
    # db_conn.commit()

connector.close()

This is an untested script, so please use it cautiously.

While writing it I realized that a policy row in the database also contains an api_version. So while running the above script is expected to modify the keys in policy_json correctly, it would put the row in an inconsistent state because the key .SURVIVING_SPOUSE did not exist for the api_version for the problematic policies. Perhaps we should also update the api_version here to the version where .SURVIVING_SPOUSE was introduced, or maybe use a different solution?