datajoint / datajoint-python

Relational data pipelines for the science lab
https://datajoint.com/docs
GNU Lesser General Public License v2.1
169 stars 85 forks source link

Drop schema error specificity #1032

Open CBroz1 opened 2 years ago

CBroz1 commented 2 years ago

Feature Request

Problem

If I attempt to drop a schema referenced by a foreign key constraint, DataJoint returns a pymysql error that is difficult to parse for the average user not familiar with reading the backend snake case. In discussing other code that would catch this error, @dimitri-yatsenko suggested that "external code should not rely on pymysql errors because we may replace pymysql with another database connector"

dj.schema('prefix_abc')
OperationalError: (3730, "Cannot drop table '#a_b_c' referenced by a foreign key constraint '_table_position_ibfk_2' on table '_table_position'.")

Requirements

This feature should catch the underlying OperationalError and translate to camel case for the user as a DataJointError

Justification

Adding this feature would prevent a break in external code if DataJoint replaced pymysql

Alternative Considerations

Current workaround is to try/except the pymyql error

guzman-raphael commented 2 years ago

@CBroz1 Thanks for the feedback. In reviewing the ask and the snippet you shared, I'd recommend having your drop_schemas feature actually iterate through the tables in the diagram instead; bottom up. That way, you can process through where there would be less foreign key issues and remove the schema once all tables within the schema have been removed. schema.list_tables() should return in topological order which could be useful for this kind of iteration. You could also get a sense of other schemas/tables that are involved.

Basically, do what you can to avoid deleting a table before removing other tables that have foreign keys into it. :smile:

dimitri-yatsenko commented 2 years ago

I agree that Table.drop should raise a DataJointError, not a pymysql.Exception.

With that, cascading drop for schemas should not be supported in any form -- it's a hazardous operation. Perhaps we should prevent Schema.drop from dropping a schema that contains any tables other than the auxiliary tables ~jobs and ~external

The tables inside schemas should be dropped first. Then their schemas can be dropped in any order.