ispyb / ispyb-database-modeling

4 stars 3 forks source link

Deleted column for Shipping, Dewar, Container and BLSample #15

Closed KarlLevik closed 6 years ago

KarlLevik commented 7 years ago

We'd like the users to be able to tag a shipment, dewar, container or sample as 'deleted' without actually deleting the row from the database as there could be other things elsewhere in the database referring to that row. For example, there could have been data collected against a BLSample already.

The effect of tagging one of these as deleted should be that it is hidden from the UI, the BCM and anything else that might make use of it.

To be able to tag one of these as deleted, we think the best solution is to have a 'deleted' column in each of the tables:

ALTER TABLE Shipping 
  ADD deleted boolean DEFAULT False COMMENT 'Flag to indicate the user has discarded this Shipping';

ALTER TABLE Dewar 
  ADD deleted boolean DEFAULT False COMMENT 'Flag to indicate the user has discarded this Dewar';

ALTER TABLE Container
  ADD deleted boolean DEFAULT False COMMENT 'Flag to indicate the user has discarded this Container';

ALTER TABLE BLSample
  ADD deleted boolean DEFAULT False COMMENT 'Flag to indicate the user has discarded this BLSample';
antolinos commented 6 years ago

Hi @KarlLevik,

For shipments, dewars and containers I wonder if we could use the column status instead of adding a new extra column.

KarlLevik commented 6 years ago

@antolinos - I think we'd prefer new, dedicated columns. If we use the status columns for this, then the 'deleted' flag will either have to overwrite the current status value (which is not good as we'd lose data), or it will have to co-exist with existing status values (maybe comma-separated), thereby complicating searching and filtering.

antolinos commented 6 years ago

Hi, Well, we are storing the status of dewars on DewarTransportHistory. Not sure about if we are using the status at other levels correctly.

My worries are:

  1. Deleted is not generic at all. To be honest I don't know other use cases but sure they will appear in the future. I could imagine: 'lost', 'contaminated', 'destroyed', 'invalid'. I don't know if they have got sense but It would mean that we would need to add flags everywhere.

  2. If we want to keep data consistent we need to perform many operations. For instance, if I "delete" a shipment I guess I should flag as deleted all children in cascade (?)

Do you know if there is other way to make these flags more generic? Do we really need these flags to exist at all the levels Shipment-Dewar-Container-Sample?

I agree that comma-separated is not an option.

drnasmith commented 6 years ago

Yes, Dewars and Containers both have history tables (ContainerHistory and DewarTransportHistory). They can be used to preserve data, so the status column could be used in those examples. Shipping and BLSample don't have any status though...

KarlLevik commented 6 years ago

@antolinos Yes, we may need other statuses in the future. Some mutually exclusive, some not. Maybe it makes sense that the mutually exclusive statuses go into the same status column, whereas the others that can exist simultaneously should have their own columns?

Also yes, I think we need a deleted flag on all levels of Shipment - Dewar - Container - Sample. For example, a user may have made a mistake when creating a sample (e.g. created a duplicate or an extra sample somehow), so they would want to delete that, but not the rest of the shipment. The same could happen at other levels: Accidentally created an extra container, dewar, or even an entire shipment.

KarlLevik commented 6 years ago

@antolinos If we do want to have a status column that can hold multiple, simultaneous statuses, then it would be possible to use the SET data type. It's similar to ENUM, except you can have more than one value at a time. I think MySQL has the best documentation page for this, though it works exactly the same in MariaDB: https://dev.mysql.com/doc/refman/5.7/en/set.html . (This is not a new feature, it has been around since at least MySQL / MariaDB 5.5.)

stufisher commented 6 years ago

1.Deleted is not generic at all. To be honest I don't know other use cases but sure they will appear in the future. I could imagine: 'lost', 'contaminated', 'destroyed', 'invalid'. I don't know if they have got sense but It would mean that we would need to add flags everywhere.

This is solely to tell the UI not to display this container, this is not to be used to described the physical status of the item in question. Maybe "hidden" is a better identifier.

The status column should be used for the other use cases you mention. "lost", "damaged", etc.

Please do not make the status column hold multiple values, this will make querying and relying on these statuses much more complicated.

delageniere commented 6 years ago

Hi, I am not sure what are the use cases here: If you just want to hide a container from the display, then only a "hidden" flag attached to Container would be OK. If you want to handle "user deletion" but with keeping data in the database (for what purpose ?) then if we go for this we need to handle the "cascade user deletion" as well on children and I am afraid it may become a "usine à gaz" as we say in French, or a nightmare.

For me : either collections have been done and then no deletion possible, we keep data in database, and no "user deletion" possible, or no collections have been done and then "user deletion" possible but then why not do a real deletion ? My concerns are as I said during the meeting, we used to do this in the ESRF User Portal, and we decided not to do it any longer, and we are happy to get rid of these "fake" deletions.

Or I missed some other use cases ?

antolinos commented 6 years ago

I agree @stufisher

Please do not make the status column hold multiple values, this will make querying and relying on these statuses much more complicated.

If it is the best solution ok for having theses hidden columns in all levels. However we consider it complicate the logic specially when passing the samples to the BCM (I guess at some point the hidden samples, containers, dewars and shipments will need to be filtered).

We will most likely only support hidden status at level of shipment.

KarlLevik commented 6 years ago

I agree with @delageniere - let's "hard" delete. We just need to make sure the database schema doesn't allow cascading deletes, and then the applications need to handle the error whenever the database says you're not allowed to delete because child rows exist. Or the application needs to deliberately delete those child rows as well.

There was agreement at the ISPyB meeting in Trieste to use this solution.

stufisher commented 5 years ago

Do not agree. Onus is on you guys to check all foreign keys for cascade. I will not take responsibility for loss of data through database issues, or exploitation of the UI

KarlLevik commented 5 years ago

That's fine, @stufisher , just warn us before you implement delete features so I can go over the relevant FK definitions and make sure they're sane.