Open brindakv opened 8 months ago
Add two new unique constraints to Entry_Related_File
:
File_Type
is not null --> verifiedFile_md5
is not null. File_URL
, File_Name
, and File_md5
mandatory (currently not mandatory; but annotation for the File_URL
column is tagged with tag:isrd.isi.edu,2018:required
)ihm_starting_model_details
can remain as is.
There are currently conflicts for the (RCB, structure_id, File_Type)
UNIQUE CONSTRAINT:
_ermrest_catalog_1=> select "RID", "File_Name", "RCB", "structure_id", "File_Type" from "Entry_Related_File" where ("RCB", "structure_id", "File_Type") in
(select "RCB", "structure_id", "File_Type" from "Entry_Related_File" group by "RCB", "structure_id", "File_Type" having count(*) > 1)
order by "structure_id"
;
RID | File_Name | RCB | structure_id | File_Type
--------+--------------------------------------------------+--------------------------------------------------------------+--------------+--------------------------------------
2-FH0W | Chemical-Crosslinks-from-Experiments_DMTMMv2.csv | https://auth.globus.org/6b934fbb-1e96-4664-a02f-f328711dda8d | D_2-6W26 | Chemical Crosslinks from Experiments
2-FH0T | Chemical-Crosslinks-from-Experiments_BS3.csv | https://auth.globus.org/6b934fbb-1e96-4664-a02f-f328711dda8d | D_2-6W26 | Chemical Crosslinks from Experiments
2-FGZY | Chemical-Crosslinks-from-Experiments_DMTMM.csv | https://auth.globus.org/6b934fbb-1e96-4664-a02f-f328711dda8d | D_2-6W28 | Chemical Crosslinks from Experiments
2-FGZW | Chemical-Crosslinks-from-Experiments_BS3.csv | https://auth.globus.org/6b934fbb-1e96-4664-a02f-f328711dda8d | D_2-6W28 | Chemical Crosslinks from Experiments
(4 rows)
In other words, we have multiple files for the same (RCB, structure_id, File_Type)
.
@svoinea to review the backend code to find how deletion is carried out before insertion to understand why the error happens.
@brindakv to update tool tip "Only one file of a particular type can be uploaded".
@brindakv to review RCT and content of Entry_Related_File
and corresponding ermrest table in entries D_2-6W26
and D_2-6W28
on production and delete duplicate instances.
As I have anticipated, the backend code is deleting just the entries of the table. I am pasting the sequence of code:
"""
Empty first the tname table
"""
self.logger.debug('Checking if PDB:{}/structure_id={} is empty'.format(tname, entry_id))
url = '/entity/PDB:{}/structure_id={}'.format(tname, entry_id)
resp = self.catalog.get(
url
)
resp.raise_for_status()
deleted_rows = resp.json()
if len(deleted_rows) > 0:
try:
"""
The table is not empty
"""
url = '/entity/PDB:{}/structure_id={}'.format(tname, entry_id)
resp = self.catalog.delete(
url
)
Deleting the older records of the Entry_Related_File
table is OK.
I have already done it on catalog 99
and 50
.
Thanks @svoinea.
I fixed the duplicate records with the same File_Type
in D_2-6W26
and D_2-6W28
.
Added the following constrains on all the catalogs:
Created Key Entry_Related_File_RCB_structure_id_File_Type_key for table PDB:Entry_Related_File
Created Key Entry_Related_File_RCB_structure_id_File_MD5_key for table PDB:Entry_Related_File
Set nullok=False in column PDB:Entry_Related_File:File_Name
Set nullok=False in column PDB:Entry_Related_File:File_URL
Set nullok=False in column PDB:Entry_Related_File:File_MD5
Tooltip updated on dev
.
@svoinea to update if the unique constraints are applied on dev and production.
@monica81567 to test on production.
I am mvoing this to Review.
@monica81567 please test this on production.
Entry_Related_File
andihm_starting_model_details
are tables where users can upload files belonging to an entry, in addition to the primary mmCIF and image files.Unique constraints in these tables need to reviewed.
Current for
Entry_Related_File
, multiple records with the sameFile_Type
is allowed. However, when a new record is created using an already existingFile_Type
that is onRecord_Ready
(i.e., data is already populated in the corresponding table), then the data is deleted before loading data from the new file.Entry_Related_File
will show two records with the sameFile_Type
with data from only one loaded into the database.To properly enforce the correct behavior, (RCB, Structure_Id, File_Type) unique constraint is required so that multiple records with the same
File_Type
are not created. User can edit the existing record to fix any errors (ERROR status) or to update data (RECORD READY status).Note: Although data is deleted before loading data from new file of the same
File_Type
, ermrest still throws an error regarding duplicate primary keys if the a new record is created with a file that has the sameid
in the uploaded files as the data in ermrest. If this is done by editing an existing record, there is no error. This behavior also suggests the need for (RCB, Structure_Id, File_Type) unique constraint.