edx / edx-arch-experiments

A plugin to include applications under development by the architecture team at edx
GNU Affero General Public License v3.0
0 stars 3 forks source link

Discovery and clean-up of unused mysql tables #447

Open robrap opened 1 year ago

robrap commented 1 year ago

The following list of tables comes from comparing mysql tables in devstack, stage, and prod. Some tables such as askbot_* were found to no longer be needed, and were causing issues for the MySQL 8 upgrade. Cleaning out more unused tables may help.

Important notes:

  1. Some tables may be required by Data Engineering (e.g. shoppingcart_*). Data Engineering should be asked before cleaning out any tables.
  2. Some private dependencies might be deployed to stage/prod only, and would not show up in devstack.
  3. https://github.com/edx/edx-arch-experiments/issues/331 links to a DEPR that names many auth related tables that can be removed. Hopefully these same tables appear in the the "not in devstack" list below.
  4. I did not yet look at Edge tables. Simplest may be to compare Edge to Prod.
Devstack only:
save_for_later_savedcourse
save_for_later_savedprogram

Not in Devstack (Prod/Stage only):
ai_aside_aiasidecourseenabled
ai_aside_aiasideunitenabled
assessment_aiclassifier
assessment_aiclassifierset
assessment_aigradingworkflow
assessment_aitrainingworkflow
assessment_aitrainingworkflow_training_examples
auth_message
auth_user_groups_copy
auth_user_groups_copy_post_ok
ccx_ccxfieldoverride
ccx_ccxfuturemembership
ccx_ccxmembership
ccx_customcourseforedx
ccxcon_ccxcon
celery_taskmeta
celery_tasksetmeta
celery_utils_chorddata
celery_utils_chorddata_completed_results
coaching_historicalusercoaching
coaching_usercoaching
comment
coursetalk_coursetalkwidgetconfiguration
django_openid_auth_association
django_openid_auth_nonce
django_openid_auth_useropenid
djcelery_crontabschedule
djcelery_intervalschedule
djcelery_periodictask
djcelery_periodictasks
djcelery_taskstate
djcelery_workerstate
djkombu_message
djkombu_queue
event_routing_backends_routerconfiguration
external_auth_externalauthmap
federated_content_connector_coursedetails
federated_content_connector_coursedetailsimportstatus
foldit_puzzlecomplete
foldit_score
followit_followuser
learning_assistant_courseprompt
licenses_coursesoftware   
licenses_userlicense
linkedin_linkedin
lx_pathway_plugin_pathway
mentoring_answer
notices_acknowledgednotice
notices_historicalacknowledgednotice
notices_historicalnotice
notices_historicaltranslatednoticecontent
notices_notice
notices_translatednoticecontent
notifications_articlesubscription
oauth2_accesstoken
oauth2_client
oauth2_grant
oauth2_provider_trustedclient
oauth2_refreshtoken
oauth_provider_consumer
oauth_provider_nonce
oauth_provider_scope
oauth_provider_token
problem_builder_answer
problem_builder_share
robots_rule
robots_rule_allowed
robots_rule_disallowed
robots_rule_sites
robots_url
self_paced_selfpacedconfiguration
simplewiki_article
simplewiki_article_related
simplewiki_articleattachment
simplewiki_namespace
simplewiki_permission
simplewiki_permission_can_read
simplewiki_permission_can_write
simplewiki_revision
student_courseaccessrole_copy
third_party_auth_providerapipermissions
verified_track_content_migrateverifiedtrackcohortssetting
verified_track_content_verifiedtrackcohortedcourse

Stage only:
askbot_badgedata
askbot_emailfeedsetting
askbot_post
askbot_thread
backup_coursemode
backup_coursemodesarchive
backup_historicalverificationdeadline
backup_verificationdeadline
lti_provider_gradedassignment
lti_provider_lticonsumer
lti_provider_ltiuser
lti_provider_outcomeservice
student_anonymoususerid_temp_archive_cale
temp_username_userid

Prod only:
django_openid_auth_useropenid_removed
shoppingcart_certificateitem                             # shoppingcart tables required by Data Engineering
shoppingcart_coupon
shoppingcart_couponredemption
shoppingcart_courseregcodeitem
shoppingcart_donation
shoppingcart_order
shoppingcart_orderitem
shoppingcart_paidcourseregistration
adzuci commented 1 year ago

Thanks for putting this list together @robrap!!

@farhanumar, please review this list in case you see any tables that you think will cause additional MySQL 8 road blocks. Jeremy and I think the wiki tables are empty and not used so we can drop those along with the askbot tables in stage and/or prod (still need to check edge.)

I think the rest of the tables need to be reviewed one by one as some things should get dropped and some are in stage/prod intentionally (e.g. the ccx tables.)

Given the context I have and the fact we have done a test upgrade of a stage and prod cloned cluster I think most of this work can wait till after the MySQL 8 upgrades, but seeing what is inconsistent in edge could help Farhan out ahead of the edge upgrades.