RNAcentral / rnacentral-import-pipeline

RNAcentral data import pipeline
Apache License 2.0

Removes updating the rnc_sequence_regions.providing_databases column #188

Closed blakesweeney closed 9 months ago

blakesweeney commented 9 months ago

This change removes the rnc_sequence_regions.providing_databases column and all the logic that updated it. A side effect is a subtle change in how we need to fetch and display known coordinates: inactive coordinates will no longer be deleted. This is OK because we map coordinates to accessions (via rnc_accession_sequence_region) and track which accessions are active (via xref), so we can always tell which provided coordinates are active.

This does mean that we need to be careful when extracting active coordinates, as we have to check whether the xrefs are active. This is a major change from before, when we always assumed that all entries in the table were active. I think this is OK: the logic becomes simpler, and the databases that provide the majority of our coordinates don't obsolete that many coordinates, I hope.
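
As a rough illustration of the check described above, here is a minimal sketch of selecting only regions that are backed by at least one active xref. It runs against an in-memory SQLite database rather than the real one, and the column names (`id`, `region_id`, `accession`, `ac`, `deleted` with `'Y'`/`'N'` values) as well as the helper names `active_region_ids` and `build_demo_db` are assumptions made for the example, not taken from this PR.

```python
import sqlite3


def active_region_ids(conn):
    """Return ids of regions linked to at least one non-deleted xref.

    Assumed (hypothetical) schema: regions map to accessions through
    rnc_accession_sequence_region, and xref.deleted = 'N' marks an
    active accession.
    """
    query = """
        SELECT DISTINCT regions.id
        FROM rnc_sequence_regions regions
        JOIN rnc_accession_sequence_region mapping
          ON mapping.region_id = regions.id
        JOIN xref
          ON xref.ac = mapping.accession
        WHERE xref.deleted = 'N'
        ORDER BY regions.id
    """
    return [row[0] for row in conn.execute(query)]


def build_demo_db():
    """Build a tiny in-memory database with made-up rows for the demo."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE rnc_sequence_regions (id INTEGER PRIMARY KEY, region_name TEXT);
        CREATE TABLE rnc_accession_sequence_region (accession TEXT, region_id INTEGER);
        CREATE TABLE xref (ac TEXT, deleted TEXT);

        INSERT INTO rnc_sequence_regions VALUES
            (1, 'region-a'), (2, 'region-b'), (3, 'region-c');
        -- Region 2 has one deleted and one active accession; region 3 only a deleted one.
        INSERT INTO rnc_accession_sequence_region VALUES
            ('ACC1', 1), ('ACC2', 2), ('ACC3', 2), ('ACC4', 3);
        INSERT INTO xref VALUES
            ('ACC1', 'N'), ('ACC2', 'Y'), ('ACC3', 'N'), ('ACC4', 'Y');
        """
    )
    return conn


if __name__ == "__main__":
    print(active_region_ids(build_demo_db()))
```

In this sketch, region 3 stays in the table but is filtered out because its only accession is marked deleted, which is the behaviour the description relies on once inactive coordinates are no longer removed.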