Closed blakesweeney closed 1 year ago
It looks like the bug you mentioned has indeed been fixed: I haven't found any sequences that contain a U
and have an active xref. To check this, I ran:
SELECT r.upi FROM rna r JOIN xref x ON r.upi=x.upi WHERE r.seq_short like '%U%' and x.deleted='N' LIMIT 10;
I also performed this query using the seq_long field.
Looks like it is fixed to me then. Thanks!
At one point, I introduced a bug in the import where I was importing RNA sequences (containing U's, not T's) instead of DNA ones (T's, not U's). This put invalid data into the rna table. That bug should be fixed, for Rfam at least, but we should verify that this is true. To do this we need to check that all sequences in the rna table which contain a U have no active xrefs. If some do, we need to look into which databases are causing this. I think there may be some cases where the U is present but the sequence is DNA, but those should be few and we will have to manually verify each one.
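
If the check does turn up active xrefs, a per-database count makes it easier to see which sources are responsible. The following is only a rough sketch against an in-memory SQLite stand-in, not the production database. The rna and xref tables and the upi, seq_short, seq_long and deleted columns come from the query in this thread; the dbid column on xref, the made-up identifiers and the helper names are assumptions for illustration. It uses instr() because LIKE is case-insensitive in SQLite, and COALESCE so that sequences stored in seq_long are covered in the same pass.

```python
import sqlite3


def active_u_xrefs_by_db(conn):
    """Count sequences containing 'U' that still have active xrefs, per database.

    Assumes xref has a `dbid` column identifying the source database
    (hypothetical here; adjust to the real schema).
    """
    query = """
        SELECT x.dbid, COUNT(DISTINCT r.upi) AS n_sequences
        FROM rna r
        JOIN xref x ON r.upi = x.upi
        WHERE x.deleted = 'N'
          -- instr() is case-sensitive in SQLite, unlike LIKE
          AND instr(COALESCE(r.seq_short, r.seq_long), 'U') > 0
        GROUP BY x.dbid
        ORDER BY n_sequences DESC, x.dbid
    """
    return conn.execute(query).fetchall()


def build_demo_db():
    """Build a tiny in-memory database with invented rows for demonstration."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE rna (upi TEXT PRIMARY KEY, seq_short TEXT, seq_long TEXT);
        CREATE TABLE xref (upi TEXT, dbid INTEGER, deleted TEXT);
    """)
    conn.executemany("INSERT INTO rna VALUES (?, ?, ?)", [
        ("URS1", "ACGT", None),   # valid DNA, no U
        ("URS2", "ACGU", None),   # contains U, stored in seq_short
        ("URS3", None, "GGUUA"),  # contains U, stored in seq_long
        ("URS4", "AUAU", None),   # contains U, but its only xref is deleted
    ])
    conn.executemany("INSERT INTO xref VALUES (?, ?, ?)", [
        ("URS1", 1, "N"),
        ("URS2", 1, "N"),
        ("URS2", 2, "N"),
        ("URS3", 2, "N"),
        ("URS4", 2, "Y"),
    ])
    return conn


if __name__ == "__main__":
    for dbid, n_sequences in active_u_xrefs_by_db(build_demo_db()):
        print(dbid, n_sequences)
```

An empty result would match what was reported above. Any rows that do come back point at the databases to inspect by hand, including the cases where a U is legitimately present.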