Open blakesweeney opened 2 years ago
There are two tables that look like they do the same (ish) thing: rnc_secondary_structure contains the structure, md5 and accession, but rnc_secondary_structure_layout contains all the stuff with the model hits etc. I'm guessing we want to empty the ..layout table? The structures don't appear to match between the two tables either.
So this is a good chance to fix an issue with our database naming. The rnc_secondary_structure table is the result of getting 2D structures when parsing. I think only 1 or 2 databases (GtRNAdb, CRW) provide them. This table is also likely not to be updated now that we have r2dt. I'd have to check the pipeline to confirm though. This one does not need to be deleted for this task.
The layout table is the one that is a result of r2dt and is the one to be emptied. It would probably be worthwhile to rename the tables to reflect the differences between them. Maybe some name prefixed with r2dt? I'll leave it up to you and @carlosribas to decide on naming.
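For reference, a rename of this kind could look roughly like the sketch below. It uses Python's built-in sqlite3 module with an in-memory database as a stand-in, since the thread does not show the real database setup. The new name r2dt_results and the single urs column are purely hypothetical placeholders; the actual name is still to be decided.

```python
import sqlite3


def rename_table(conn, old, new):
    """Rename a table. Identifiers cannot be bound as SQL parameters,
    so only call this with trusted, hard-coded names."""
    with conn:
        conn.execute(f'ALTER TABLE "{old}" RENAME TO "{new}"')


def table_names(conn):
    """Return the names of all tables in the database, sorted."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]


# In-memory stand-in; the column is a hypothetical placeholder.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rnc_secondary_structure_layout (urs TEXT)")

# "r2dt_results" is an illustrative name only, not an agreed one.
rename_table(conn, "rnc_secondary_structure_layout", "r2dt_results")
names = table_names(conn)
```

Any pipeline code or queries that mention the old name would need updating at the same time as the rename.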
rnc_secondary_structure_layout has been backed up in rnc_secondary_structure_layout_backup and truncated.
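The backup-then-empty step described above can be sketched as follows. This is only an illustration: the thread does not show the actual commands or the database engine, so the sketch uses Python's built-in sqlite3 module with an in-memory database, a plain DELETE in place of TRUNCATE (which SQLite does not have), and made-up columns and rows.

```python
import sqlite3


def backup_and_empty(conn, table, backup):
    """Copy every row of `table` into a new table `backup`, then empty `table`.

    Identifiers cannot be bound as SQL parameters, so only call this
    with trusted, hard-coded table names.
    Returns the number of rows now held in the backup table.
    """
    with conn:  # commits the changes if no exception is raised
        conn.execute(f'CREATE TABLE "{backup}" AS SELECT * FROM "{table}"')
        conn.execute(f'DELETE FROM "{table}"')
    return conn.execute(f'SELECT COUNT(*) FROM "{backup}"').fetchone()[0]


# In-memory stand-in; columns and rows are hypothetical placeholders.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE rnc_secondary_structure_layout (urs TEXT, model_id INTEGER)"
)
conn.executemany(
    "INSERT INTO rnc_secondary_structure_layout VALUES (?, ?)",
    [("URS_A", 1), ("URS_B", 2)],
)
conn.commit()

copied = backup_and_empty(
    conn,
    "rnc_secondary_structure_layout",
    "rnc_secondary_structure_layout_backup",
)
remaining = conn.execute(
    "SELECT COUNT(*) FROM rnc_secondary_structure_layout"
).fetchone()[0]
```

Comparing the backup row count with the original count before emptying the source table is a cheap sanity check that nothing was lost.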
I'll leave the issue open so we can figure out how to rename things once the rescan is done.
This table stores the metadata about results (2D pairs, score, model, coordinates, etc) and it should be emptied out. R2DT has changed and we may (and likely will) have some sequences without hits. The only way to ensure we don't have mixed data is to remove all the old data.