Open AntonPetrov opened 7 years ago
At least 23 xrefs/accessions have this issue: the species recorded on the accession does not match the taxid on the xref. They can be found with:
select
*
from xref, rnc_accessions acc
where
xref.ac = acc.accession
and acc.database = 'MODOMICS'
and (
(species = 'Xenopus laevis' and taxid != 8355)
or (species = 'Rattus norvegicus' and taxid != 10116)
or (species = 'Zea mays' and taxid != 4577)
or (species = 'Thermus thermophilus' and taxid != 274)
or (species = 'Salmonella typhimurium' and taxid != 90371)
or (species = 'Phaseolus vulgaris' and taxid != 3885)
or (species = 'Oryctolagus cuniculus' and taxid != 9986)
or (species = 'Triticum aestivum' and taxid != 4565)
)
;
The issue appears to be limited to Modomics, since running the search without the Modomics constraint gives the same results.
Fixing the accessions can be done with:
-- Update Xenopus
update xref
set taxid = 8355
where
ac in ('dd7318229bd33f71098d491b437b97dd_modomics',
'5b638f7a6fb817e74ea1fc05eb7aca6a_modomics')
and taxid != 8355
;
-- Update rat
update xref
set taxid = 10116
where
ac in ('b5f224875fe4b55c0f8c79ff9e1c4b96_modomics')
and taxid = 10090
;
-- Update maize
update xref
set taxid = 4577
where
ac in ('331f68e0cd1ed4d69e6ce052f24d432c_modomics')
and taxid = 3562
;
-- Update thermus
update xref
set taxid = 274
where
ac in ('367fff4928ff6e45035eccd25315ae9d_modomics',
'3fe73d3ce3932d8042cec7866b809ac0_modomics')
and taxid = 300852
;
-- Update Salmonella
update xref
set taxid = 90371
where
ac in ('3c9f4214774de3fd21eb099235728829_modomics')
and taxid = 562
;
-- Update kidney bean
update xref
set taxid = 3885
where
ac in ('ae35295009e8a43132a40112333bbcc1_modomics')
and taxid = 3847
;
-- Update rabbit
update xref
set taxid = 9986
where
ac in ('484a32153536ff19c456a10b99106d82_modomics',
'2b7215b44be5fa48144480e645e1d4b1_modomics')
and taxid != 9986
;
-- Update wheat
update xref
set taxid = 4565
where
ac in ('1cc28fdd9201cb2ab75c52dd846b649f_modomics',
'851cde158600ec1bb7cb41827451d795_modomics',
'512c6fee503aae504bbab970efa856a1_modomics',
'9499216fa6efa3c4f2b9a7bbb8dc1548_modomics',
'1dd5a0399007891f1ec154143a319d21_modomics')
and taxid != 4565
;
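To check that the updates worked, the detection query can be re-run and should return no rows. Below is a minimal sketch of that find, fix, verify loop, assuming an in-memory SQLite database and made-up toy accessions (`toy_rat_modomics`, `toy_maize_modomics`); only the table names, column names and expected taxids come from the queries in this issue. The production database and its SQL dialect may differ.

```python
import sqlite3

# Expected taxid per species, copied from the detection query in this issue.
EXPECTED_TAXIDS = {
    "Xenopus laevis": 8355,
    "Rattus norvegicus": 10116,
    "Zea mays": 4577,
    "Thermus thermophilus": 274,
    "Salmonella typhimurium": 90371,
    "Phaseolus vulgaris": 3885,
    "Oryctolagus cuniculus": 9986,
    "Triticum aestivum": 4565,
}


def find_mismatches(conn):
    """Return (ac, species, taxid) rows where the xref taxid disagrees with the species."""
    clauses = " or ".join(
        "(acc.species = ? and xref.taxid != ?)" for _ in EXPECTED_TAXIDS
    )
    # Flatten {species: taxid} into [species, taxid, species, taxid, ...].
    params = [value for pair in EXPECTED_TAXIDS.items() for value in pair]
    sql = f"""
        select xref.ac, acc.species, xref.taxid
        from xref
        join rnc_accessions acc on xref.ac = acc.accession
        where acc."database" = 'MODOMICS'
          and ({clauses})
        order by xref.ac
    """
    return conn.execute(sql, params).fetchall()


def demo():
    """Build toy tables (hypothetical data), fix one row, and report before/after."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        create table xref (ac text, taxid integer);
        create table rnc_accessions (accession text, "database" text, species text);
        insert into xref values
            ('toy_rat_modomics', 10090),
            ('toy_maize_modomics', 4577);
        insert into rnc_accessions values
            ('toy_rat_modomics', 'MODOMICS', 'Rattus norvegicus'),
            ('toy_maize_modomics', 'MODOMICS', 'Zea mays');
    """)
    before = find_mismatches(conn)
    # Same shape as the "Update rat" statement, applied to the toy accession.
    conn.execute(
        "update xref set taxid = 10116 "
        "where ac = 'toy_rat_modomics' and taxid = 10090"
    )
    after = find_mismatches(conn)
    conn.close()
    return before, after
```

Calling `demo()` returns the mismatching rows before and after the update; an empty second list means nothing is left to fix for the listed species.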
At least one species-specific id shows several species at once (example):
The problem occurs when the same sequence has the same modifications in multiple species: in that case the Accession model gets overwritten. Because of this, the Modomics data needs to be reimported from scratch.
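The overwrite can be illustrated with a small sketch. It assumes, purely for illustration, that the accession is an MD5 digest of the sequence and its modifications with a `_modomics` suffix; the real importer may build its identifiers differently. The function names and the toy entries are hypothetical.

```python
import hashlib
from collections import defaultdict


def accession_for(sequence, modifications):
    """Hypothetical accession scheme: species is NOT part of the hashed input."""
    digest = hashlib.md5(f"{sequence}|{modifications}".encode()).hexdigest()
    return f"{digest}_modomics"


def import_entries(entries):
    """Store one species per accession; a later entry silently replaces an earlier one."""
    accessions = {}
    for entry in entries:
        key = accession_for(entry["sequence"], entry["modifications"])
        accessions[key] = entry["species"]
    return accessions


def find_collisions(entries):
    """Return {accession: sorted species list} for accessions shared by several species."""
    species_by_accession = defaultdict(set)
    for entry in entries:
        key = accession_for(entry["sequence"], entry["modifications"])
        species_by_accession[key].add(entry["species"])
    return {
        key: sorted(species)
        for key, species in species_by_accession.items()
        if len(species) > 1
    }


# Toy data: the same sequence and modifications reported for two species.
TOY_ENTRIES = [
    {"sequence": "GCAUC", "modifications": "m1A@3", "species": "Rattus norvegicus"},
    {"sequence": "GCAUC", "modifications": "m1A@3", "species": "Mus musculus"},
]
```

With `TOY_ENTRIES`, `import_entries` keeps a single accession whose species is whichever entry was imported last, while `find_collisions` lists that accession with both species. A check of this kind could help spot affected entries before a reimport.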
The problem was originally reported by Sean.