sergiogcx closed this issue 2 years ago
This is the query I've been using to find the foreign keys:
SELECT
    pcon.connamespace,
    pcon.conname AS constraint_name,
    c.relname AS child_table,
    p.relname AS parent_table,
    -- confdeltype holds a one-letter code for the ON DELETE action
    (
        CASE
            WHEN pcon.confdeltype = 'a' THEN 'NO ACTION'
            WHEN pcon.confdeltype = 'r' THEN 'RESTRICT'
            WHEN pcon.confdeltype = 'c' THEN 'CASCADE'
            WHEN pcon.confdeltype = 'n' THEN 'SET NULL'
            WHEN pcon.confdeltype = 'd' THEN 'SET DEFAULT'
            ELSE 'n/a'
        END
    ) AS const_delete_type,
    -- confupdtype holds the same codes for the ON UPDATE action
    (
        CASE
            WHEN pcon.confupdtype = 'a' THEN 'NO ACTION'
            WHEN pcon.confupdtype = 'r' THEN 'RESTRICT'
            WHEN pcon.confupdtype = 'c' THEN 'CASCADE'
            WHEN pcon.confupdtype = 'n' THEN 'SET NULL'
            WHEN pcon.confupdtype = 'd' THEN 'SET DEFAULT'
            ELSE 'n/a'
        END
    ) AS const_update_type
FROM pg_constraint AS pcon
-- conrelid is the table that owns the constraint (the child)
JOIN pg_class c ON c.oid = pcon.conrelid
-- confrelid is the referenced table (the parent); it is 0 for non-foreign-key
-- constraints, so this inner join keeps only foreign keys
JOIN pg_class p ON p.oid = pcon.confrelid
WHERE pcon.conname LIKE 'moped_%';
@zappfyllc - please set up a time to align database approaches with @sergiogcx and @mateoclarke to discuss this and #7036 ...
@zappfyllc will provide a list of unnecessary tables we can delete (I'll check prod, staging, and schema)
For the foreign keys that are necessary, we will move forward with ON UPDATE CASCADE ON DELETE CASCADE
@sergiogcx Confirmed via prod/staging/SchemaSpy and my initial pass on creating atd_moped, we should be safe to delete the following tables (context is given in parentheses):
moped_fund_opp (not used in current funding tab/functionality and was originally meant for unfunded needs to identify funding opportunities for projects that lacked funding)
moped_group (meant for grouping project-to-project relationships into hierarchies, not needed atm)
moped_proj_communication (original placeholder for indexing project messages/notes but irrelevant now: contains 1:1 relationship with moped_project and likely a few other relationships with some tables actually in use, but there are no rows for this table and it can be removed)
moped_proj_dates (original placeholder for indexing important project dates: we already have moped_proj_milestones for this)
moped_proj_fund_opp (see moped_fund_opp above)
moped_proj_groups (see moped_group above)
moped_proj_location (original placeholder for maintaining current geographies/coordinates, but that was implemented without this table)
moped_proj_status_history (made obsolete by activity log)
moped_proj_status_notes (made obsolete by project comments)
moped_proj_timeline (original placeholder for index of phases/statuses and enhanced timeline functionality involving linear progression of phases/statuses)
Hopefully that helps with cutting down on some of the foreign keys that need to be reworked. @sergiogcx please don't hesitate to reach out if you run into any foreign keys where you're unsure whether they follow the CASCADE pattern: happy to talk through it since I know sometimes just saying it aloud helps confirm the right approach on a case-by-case basis.
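For anyone picking up the cleanup, a dry run of the table removal might look like the sketch below. It assumes the table names are spelled exactly as in the list above, and it deliberately ends in ROLLBACK so nothing is changed until someone swaps in COMMIT inside a proper migration.

```sql
-- Dry-run sketch of the cleanup; table names are copied from the list above
-- and may need adjusting to match the actual schema.
BEGIN;

-- No CASCADE keyword on purpose: if a table that is still in use holds a
-- foreign key to one of these, the statement fails loudly instead of
-- silently dropping that key. IF EXISTS turns a missing table into a notice.
DROP TABLE IF EXISTS
    moped_fund_opp,
    moped_group,
    moped_proj_communication,
    moped_proj_dates,
    moped_proj_fund_opp,
    moped_proj_groups,
    moped_proj_location,
    moped_proj_status_history,
    moped_proj_status_notes,
    moped_proj_timeline;

-- Undo everything; replace with COMMIT only in a reviewed migration.
ROLLBACK;
```

Listing all the tables in one statement lets PostgreSQL resolve foreign keys among the dropped tables themselves. Any error that remains points at a key held by a table we are keeping, which has to be dropped explicitly first.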
cc: @sergiogcx I've thought about this issue and reread your initial post, which I think I misread. I agree that CASCADE is the much more dangerous setting but that RESTRICT can be frustrating to developers in some cases.
Let's examine the main use cases for DELETE here:
Deleting a row in moped_project, effectively the parent table of the entire database. moped_project is indeed a child of what are essentially join tables in many-to-many relationships, but overall we would expect that if a row is deleted in moped_project, we would CASCADE the delete to everywhere where there is a foreign key involving the project_id. There may be one exception to this use case, which is our next use case.
Deleting a row in moped_project that shares data with another moped_project. For the most part, our project-related data is siloed to a single project (projects don't share files, teams, comments/notes, phases, or milestones). The exception here is project components (the mapping entities). Projects will absolutely share components moving forward and should be able to do so. However, because we have the many-to-many relationship set up, we would still expect to be able to delete rows in the join table without impacting another project's use of that same exact component. The only tricky part here is when we bring in project relationships, but we'll sideline that for the time being. I think we can still CASCADE the delete here because of how we've set up the structure, but would love for someone else to confirm this (@johnclary @sergiogcx maybe we can walk through the components data structure together, although I'll take a look at prod/staging to get an idea of what's going on here).
Deleting a row in one of the entities that is one-to-many or many-to-many with the project (e.g. moped_workgroup as Sergio describes in the initial description of this issue, moped_phases, moped_partners, moped_sponsors, etc.). In this case, these keys are referenced in either the moped_project table directly (one-to-many) or in a join table in the many-to-many relationship. In these situations, I would expect to CASCADE the delete in the many-to-many (we don't need the row in the join table any longer) while SET NULL might be advantageous in the one-to-many relationship with moped_project as users would expect a blank value if that entity no longer exists.
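To make the use cases above concrete, here is a small sketch that can be run in a scratch PostgreSQL database. The demo_* tables and their columns are hypothetical stand-ins, not the real Moped schema. They only illustrate CASCADE on a join table versus SET NULL on a direct one-to-many reference.

```sql
-- Hypothetical scratch tables; names are illustrative, not the Moped schema.
BEGIN;

CREATE TABLE demo_entity (
    entity_id integer PRIMARY KEY,
    entity_name text NOT NULL
);

CREATE TABLE demo_project (
    project_id integer PRIMARY KEY,
    project_name text NOT NULL,
    -- One-to-many: the project survives and the reference is blanked out.
    lead_entity_id integer REFERENCES demo_entity (entity_id)
        ON UPDATE CASCADE ON DELETE SET NULL
);

-- Many-to-many join table: a row is meaningless once either side is gone.
CREATE TABLE demo_proj_entity (
    project_id integer NOT NULL REFERENCES demo_project (project_id)
        ON UPDATE CASCADE ON DELETE CASCADE,
    entity_id integer NOT NULL REFERENCES demo_entity (entity_id)
        ON UPDATE CASCADE ON DELETE CASCADE,
    PRIMARY KEY (project_id, entity_id)
);

INSERT INTO demo_entity VALUES (1, 'Entity A'), (2, 'Entity B');
INSERT INTO demo_project VALUES (10, 'Project X', 1), (11, 'Project Y', 1);
INSERT INTO demo_proj_entity VALUES (10, 2), (11, 2);

-- Deleting the directly referenced entity keeps both projects and sets
-- lead_entity_id to NULL.
DELETE FROM demo_entity WHERE entity_id = 1;

-- Deleting a project removes only that project's join rows; project 11
-- still shares Entity B.
DELETE FROM demo_project WHERE project_id = 10;

SELECT * FROM demo_project;      -- one row: (11, 'Project Y', NULL)
SELECT * FROM demo_proj_entity;  -- one row: (11, 2)

ROLLBACK;
```

Note that SET NULL only works if the referencing column is nullable. On a NOT NULL column, the delete would fail with a constraint violation.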
@sergiogcx @mateoclarke
I am very open to taking on this task in a future sprint (just not the next one) as I'm comfortable with SQL and should be able to sort this out. If there are any pressing foreign keys to deal with though (for example, to make sure the example @sergiogcx gave up in the description doesn't happen where deleting a workgroup accidentally cascades deleting the rest of the projects), could you take a look sooner and potentially fix those? We can probably just target the foreign keys that currently CASCADE.
@sergiogcx you gave enough examples here that I feel really comfortable taking a first pass on this the following sprint, so thank you. Would love to take this tedious work off your plate. Will just want your thorough review on the eventual PR.
Let me know what you both think. I will definitely resize this to a 5 though because it's a lot of foreign keys and relationships to consider so it's going to take some time to think through it (actual code should be the easiest part since it's a lot of repetition).
@zappfyllc - I let @sergiogcx know that you were hoping to get this started the second week of this sprint.
@zappfyllc @sergiogcx thanks for your research on this. i'm inclined to set this as a blocker to #7775, since the most likely issues with the migration will be situations where we want to delete projects and try again.
i am fine with using set null on relationship tables where it makes life easier. and i'm all for entirely deleting tables which are not implemented in the UI.
thanks @zappfyllc for offering to do this. i will defer to @sergiogcx on the code review, but if you have time in the next two weeks to open a PR, please do!
cc @amenity
@johnclary will do, I'll open up a PR this week.
More often than not, we end up unable to update or delete records in the database because of our restrictive foreign keys. We need to evaluate which foreign keys should cascade and which should keep a restrictive policy.
Feel free to test using your SQL client on Local or on the Heroku instance, but for Staging and Production please generate a proper migration.
There is no way to simply change a foreign key; it has to be dropped and re-created, which is an inexpensive task. For example:
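A minimal sketch of that drop-and-re-create step, using the moped_users_workgroup_id_fkey constraint as the illustration. The table and column names (moped_users.workgroup_id referencing moped_workgroup.workgroup_id) are assumptions inferred from the constraint name, and ON DELETE RESTRICT is only one possible choice. Everything runs in a throwaway schema (fk_demo, also hypothetical) and is rolled back.

```sql
-- Sketch only: table and column names are assumed from the constraint name.
BEGIN;

CREATE SCHEMA fk_demo;
SET LOCAL search_path TO fk_demo;

CREATE TABLE moped_workgroup (
    workgroup_id integer PRIMARY KEY,
    workgroup_name text NOT NULL
);

-- An inline REFERENCES on moped_users.workgroup_id gets the default
-- constraint name moped_users_workgroup_id_fkey.
CREATE TABLE moped_users (
    user_id integer PRIMARY KEY,
    workgroup_id integer REFERENCES moped_workgroup (workgroup_id)
        ON UPDATE CASCADE ON DELETE CASCADE
);

-- Step 1: drop the existing key.
ALTER TABLE moped_users
    DROP CONSTRAINT moped_users_workgroup_id_fkey;

-- Step 2: re-create it under the same name with the desired actions.
ALTER TABLE moped_users
    ADD CONSTRAINT moped_users_workgroup_id_fkey
        FOREIGN KEY (workgroup_id)
        REFERENCES moped_workgroup (workgroup_id)
        ON UPDATE CASCADE
        ON DELETE RESTRICT;

ROLLBACK;
```

In a real migration only the two ALTER TABLE statements would be needed, run in one transaction so the table is never left without the key. ON DELETE SET NULL is an alternative to RESTRICT if the column is nullable.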
We may run into situations such as moped_users_workgroup_id_fkey, where if we delete a workgroup id we may end up deleting all related projects via cascade. We do not want that; for this specific key we would like to avoid the record deletion if the workgroup is deleted, and to update the record if the workgroup is updated.
From the docs:
This is the list of constraints we currently have in the database: