geneontology / go-ontology

Source ontology files for the Gene Ontology
http://geneontology.org/page/download-ontology
Creative Commons Attribution 4.0 International

dbxrefs standardized #1909

Closed gocentral closed 9 years ago

gocentral commented 20 years ago

I sent this to the list.

Hi,

I've noticed that in the ontology file there are various different dbxrefs referring to the same people, and also some differences in the format of dbxrefs. I wondered if it would be okay to standardize these?

I have made a list of the dbxrefs with people's initials (bottom of e-mail), and I found some that I thought could be standardized or that are probably wrong. I listed them below with what I think they should be changed to. I'm not sure if there are two separate people in SGD with initials ct and clt.

Do people think it would be okay to standardize these and if so would you agree with the suggestions I've made?

Thanks,

Jen

Suggestions:

old => new.

John_Garavelli:jsgaravelli@earthlink.net => RESID:jsg

KirillDegtyarenko:Nov091131152001 => Should this be GO:kd?

GO:p.kellum => GO:pk (both of these are in the file just now. Are they two separate people or could they be standardized to GO:pk?)

GOC:mah => GO:mah
GOC:curators => GO:curators? (Should the GO editorial office people be GOC or GO?)

SGD:ct => SGD:clt? (Are SGD:ct and SGD:clt one person?)

GO:ec => GOA:ebc
GO:ed => GOA:ecd
GOA:ed => GOA:ecd
(Should the GOA curators become GOA:? Right now they are sometimes GO: and sometimes GOA:. Also, Emily is sometimes 'ed' and sometimes 'ecd'.)

This is the whole list of dbxrefs for people (unless I lost any when I was sorting them out):

CGD:mcc FB:bf FB:curators FB:fb FB:hb FB:ma GOA:ecd GOA:ed GO:ai GO:cb GOC:curators GOC:mah GO:curators GO:dph GO:ec GO:ecd GO:ED GO:jic GO:jl GO:ma GO:pk GO:p.kellum GR:pj GXD:dph John_Garavelli:jsgaravelli@earthlink.net KirillDegtyarenko:Nov091131152001 MGD:ap MGD:dph MGD:hjd MGI:ad MGI:ajp MGI:aledie MGI:curators MGI:dph MGI:dph MGI:hjd PSU:mb RGD:st Sanger:lmg Sanger:mb Sanger:vw SGD:as SGD:cb SGD:clt SGD:ct SGD:curators SGD:df SGD:elh SGD:jh SGD:kd SGD:krc SGD:mah SGD:mcc SGD:rc SGD:rn SGD:rnash SGD:se SP:ecd SP:jsg SP:kd SP:kd SP:njm SP:nn TAIR:curators TAIR:jy TAIR:jy TAIR:lm TAIR:lr TAIR:lr;yl TAIR:mg TAIR:pz TAIR:sm TAIR:syr TAIR:tb TIGR:cr TIGR:js TIGR:lh TIGR:mlg TIGR:sd WB:cab WB:ems WB:kmv WB:kmv WB:KVA ZFIN:dh ZFIN:sr
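A list like the one above can be screened mechanically for odd formats and repeated entries. The following is only a sketch: the pattern `DATABASE:initials` with lowercase initials is an assumption made for illustration, not a rule stated in this ticket, and `check_refs` is a hypothetical helper name.

```python
import re
from collections import Counter

# Assumed convention (not stated in the ticket): DATABASE:initials,
# where the initials are lowercase letters only.
CURATOR_REF = re.compile(r"[A-Za-z_]+:[a-z]+")

def check_refs(refs):
    """Return (refs that break the assumed pattern, refs listed more than once)."""
    odd = sorted({r for r in refs if not CURATOR_REF.fullmatch(r)})
    counts = Counter(refs)
    repeated = sorted(r for r, n in counts.items() if n > 1)
    return odd, repeated

# A few entries taken from the list above.
refs = "TAIR:jy TAIR:jy TAIR:lr TAIR:lr;yl WB:kmv WB:KVA GO:ED GO:p.kellum".split()
odd, repeated = check_refs(refs)
print("unexpected format:", odd)
print("listed more than once:", repeated)
```

A check like this only flags candidates. Whether two forms refer to the same person (for example SGD:ct and SGD:clt) still has to be confirmed by the groups concerned.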

Becky wrote back to say that this is a typo and she'll remove it:

FB:fb

Jen

Reported by: jenclark

Original Ticket: "geneontology/ontology-requests/1914":https://sourceforge.net/p/geneontology/ontology-requests/1914

gocentral commented 20 years ago

Logged In: YES user_id=735846

From Michael:

KirillDegtyarenko:Nov091131152001 => Should this be GO:kd? .. yes

GO:p.kellum => GO:pk (both of these are in the file just now. Are they two separate people or could they be standardized to GO:pk?) ... same person, GO:pk

Original comment by: jenclark

gocentral commented 20 years ago

Logged In: YES user_id=735846

from Karen Christie.

Hi Jen,

Sounds like a good idea to me.

I went through the SGD list to check all the abbreviations for us. I just have a few comments. I don't know if you need it, but since I made the list below, with the names that correspond to each abbreviation, while checking them, I have included it. The numbers are the counts I get if I grep the gene_ontology.obo file for a given dbxref.

  1. We agree with your suggestion to convert SGD:ct to SGD:clt, Chandra has used the clt form of her initials much more often than the ct form.

  2. In addition, SGD:rnash should become SGD:rn, since Rob only used the SGD:rnash form once and used the initial form much more often.

  3. SGD:rc is no one. We can't think of anyone at SGD with those initials and cannot find anyone in the Former GO people list from SGD who matches those initials. I think this should instead be SGD:rb for Rama.

thanks for doing this,

-Karen

SGD:as Anand Sethuraman
SGD:cb Cathy Ball
SGD:clt Chandra Theesfeld 60
SGD:ct " 7
SGD:curators all
SGD:df Dianna Fisk
SGD:elh Eurie Hong
SGD:jh Jodi Hirschman
SGD:kd Kara Dolinski
SGD:krc Karen Christie
SGD:mah Midori Harris
SGD:mcc Maria Costanzo
SGD:rc no one, (should be rb) 0
SGD:rn Rob Nash 20
SGD:rnash " 1
SGD:se Stacia Engel

not included on your list: SGD:rb Rama Balakrishnan
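The per-dbxref counts quoted above came from grepping gene_ontology.obo. A rough Python equivalent might look like the sketch below. It assumes each definition line ends with a bracketed, comma-separated dbxref list (as in the definitions quoted later in this ticket); `count_dbxrefs` is a hypothetical helper and the sample lines are made up.

```python
import re
from collections import Counter

# Trailing bracketed dbxref list, e.g. [ISBN:0198547684, GO:ma]
DBXREF_LIST = re.compile(r"\[([^\[\]]*)\]\s*$")

def count_dbxrefs(lines):
    """Count every dbxref that appears in a trailing [...] list."""
    counts = Counter()
    for line in lines:
        match = DBXREF_LIST.search(line)
        if match is None:
            continue  # no dbxref list on this line
        for ref in match.group(1).split(","):
            ref = ref.strip()
            if ref:
                counts[ref] += 1
    return counts

# Made-up sample lines, for illustration only.
sample = [
    'def: "Hypothetical definition A." [SGD:clt]',
    'def: "Hypothetical definition B." [SGD:ct, ISBN:0198547684]',
    'def: "Hypothetical definition C." [SGD:clt]',
    'name: a line without any dbxref list',
]
counts = count_dbxrefs(sample)
print(counts["SGD:clt"], counts["SGD:ct"])
```

Unlike a plain grep for `SGD:ct`, splitting the list and comparing whole entries avoids counting one dbxref as a substring of another.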

Original comment by: jenclark

gocentral commented 20 years ago

Logged In: YES user_id=735846

Further e-mail feedback:

Hi Jen

I'll have to wait on this until everyone gets back from meetings, etc.

From my side, only the MGI:xxx are added since I do the editing. However, David might have used MGD at times. I don't know how MGD:ap got in there (tony never had write access). Also, I've always used MGI:adiehl for alex; I think the MGD:ad came from another very former person.

Anyhow, I'll see what we can resolve.

Harold.

Original comment by: jenclark

gocentral commented 20 years ago

Logged In: YES user_id=735846

Hi Jen,

In TAIR dbxref section, we have 2 entries for TAIR:jy. It is a duplication. So, please delete one of them.

Otherwise it looks fine.

Cheers, Suparna

Original comment by: jenclark

gocentral commented 20 years ago

Logged In: YES user_id=735846

GeneDB_Spombe: is to replace Sanger: in cases like Sanger:vw.

Original comment by: jenclark

gocentral commented 20 years ago

Logged In: YES user_id=735846

Hi,

These are the changes that have been requested:

correct (wrong)

GeneDB_Spombe:lmg (Sanger:lmg)
GeneDB_Spombe:lvw (Sanger:vw)
GeneDB_Spombe:mb (Sanger:mb)
GO:curators (GOC:curators)
GO:kd (KirillDegtyarenko:Nov091131152001)
GO:mah (GOC:mah)
GO:pk (GO:p.kellum)
GOA:ebc (GO:ec, GOA:ec)
GOA:ecd (GO:ed, GOA:ed, GO:ED)
RESID:jsg (John_Garavelli:jsgaravelli@earthlink.net)
SGD:clt (SGD:ct)
SGD:rb (SGD:rc)
SGD:rn (SGD:rnash)
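The requested changes above amount to a lookup table from the wrong form to the correct one. As a sketch only (this is not how the edits were actually made in the ontology file), the table could be applied to the trailing dbxref list of a definition line as follows. `REPLACEMENTS` transcribes the list in this comment as it stands here, and `standardize_line` is a hypothetical helper.

```python
import re

# wrong form -> correct form, transcribed from the list in this comment
REPLACEMENTS = {
    "Sanger:lmg": "GeneDB_Spombe:lmg",
    "Sanger:vw": "GeneDB_Spombe:lvw",
    "Sanger:mb": "GeneDB_Spombe:mb",
    "GOC:curators": "GO:curators",
    "KirillDegtyarenko:Nov091131152001": "GO:kd",
    "GOC:mah": "GO:mah",
    "GO:p.kellum": "GO:pk",
    "GO:ec": "GOA:ebc",
    "GOA:ec": "GOA:ebc",
    "GO:ed": "GOA:ecd",
    "GOA:ed": "GOA:ecd",
    "GO:ED": "GOA:ecd",
    "John_Garavelli:jsgaravelli@earthlink.net": "RESID:jsg",
    "SGD:ct": "SGD:clt",
    "SGD:rc": "SGD:rb",
    "SGD:rnash": "SGD:rn",
}

# Trailing bracketed dbxref list, e.g. [ISBN:0198547684, GO:ED]
DBXREF_LIST = re.compile(r"\[([^\[\]]*)\]\s*$")

def standardize_line(line, replacements=REPLACEMENTS):
    """Rewrite the trailing dbxref list, dropping duplicates the mapping creates."""
    match = DBXREF_LIST.search(line)
    if match is None:
        return line  # nothing to rewrite
    kept = []
    for ref in match.group(1).split(","):
        ref = ref.strip()
        if not ref:
            continue
        ref = replacements.get(ref, ref)  # exact, case-sensitive lookup
        if ref not in kept:
            kept.append(ref)
    # Note: whitespace after the closing bracket is not preserved.
    return line[:match.start()] + "[" + ", ".join(kept) + "]"

print(standardize_line('def: "Hypothetical definition." [ISBN:0198547684, GO:ED]'))
```

The lookup is exact and case-sensitive on purpose, so GO:ED and GO:ed are separate keys and unrelated references such as ISBN numbers pass through unchanged.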

This is the full list of refs:

CGD:mcc
FB:bf FB:curators FB:hb FB:ma
GeneDB_Spombe:lmg (Sanger:lmg) GeneDB_Spombe:lvw (Sanger:vw) GeneDB_Spombe:mb (Sanger:mb)
GO:ai GO:cb GO:curators (GOC:curators) GO:dph GO:jic GO:jl GO:kd (KirillDegtyarenko:Nov091131152001) GO:ma GO:mah (GOC:mah) GO:pk (GO:p.kellum)
GOA:ebc (GO:ec, GOA:ec) GOA:ecd (GO:ed, GOA:ed, GO:ED) GOA:vl
GR:pj
GXD:dph
MGD: (MGI:)? MGD:ap MGD:dph MGD:hjd
MGI:ad MGI:ajp MGI:aledie MGI:curators MGI:dph MGI:dph MGI:hjd
PSU:mb
RESID:jsg (John_Garavelli:jsgaravelli@earthlink.net)
RGD:st
SGD:as SGD:cb SGD:clt (SGD:ct) SGD:curators SGD:df SGD:elh SGD:jh SGD:kd SGD:krc SGD:mah SGD:mcc SGD:rb (SGD:rc) SGD:rn (SGD:rnash) SGD:se
SP:ecd SP:jsg SP:kd SP:kd SP:njm SP:nn
TAIR:curators TAIR:jy TAIR:lm TAIR:lr TAIR:lr;yl TAIR:mg TAIR:pz TAIR:sm TAIR:syr TAIR:tb
TIGR:cr TIGR:js TIGR:lh TIGR:mlg TIGR:sd
WB:cab WB:ems WB:kmv WB:kmv WB:KVA
ZFIN:dh ZFIN:sr

(uploaded below)

Original comment by: jenclark

gocentral commented 20 years ago

Logged In: YES user_id=735846

This dbxref needs to be replaced with an EC number:

NC-IUBMB: ProposedChangestotheEnzymeListconcerningATPasesandGTPases

Original comment by: jenclark

gocentral commented 20 years ago

Logged In: YES user_id=735846

Also this one:

NC-IUBMB:Revisions\

Original comment by: jenclark

gocentral commented 20 years ago

Logged In: YES user_id=735846

this should be a pmid:

TrendsBiochemSci12:p146-150

Original comment by: jenclark

gocentral commented 20 years ago

Logged In: YES user_id=735846

Another improved version is uploaded below.

Jen

Original comment by: jenclark

gocentral commented 20 years ago

Logged In: YES user_id=473796

p.kellum refers to Paul Kellam, see http://www.ucl.ac.uk/windeyer-institute/Research/Kellam.htm

It should probably hence be UCL:pk

Similarly Ria_Holtzerland can be UCL:rh

GXD == MGD == MGI; also PSU:mb == Sanger:mb

Original comment by: girlwithglasses

gocentral commented 20 years ago

Logged In: YES user_id=735846

Thanks Amelia,

I've added those in and uploaded the file.

Jen

Original comment by: jenclark

gocentral commented 20 years ago

latest dbxref list

Original comment by: jenclark

gocentral commented 20 years ago

Logged In: YES user_id=735846

further updated version uploaded below.

Jen

Original comment by: jenclark

gocentral commented 20 years ago

Logged In: YES user_id=436423

> 3. SGD:rc is no one. We can't think of anyone at SGD with those initials and cannot find anyone in the Former GO people list from SGD who matches those initials. I think this should instead be SGD:rb for Rama.

my guess would've been a letter missing from SGD:krc ;)

> I don't know how MGD:ap got in there (tony never had write access).

It doesn't matter if he had write access or not: if he made up a def, you could use his initials (change to MGI:ap or MGI:ajp).

> this should be a pmid:
> TrendsBiochemSci12:p146-150

hmm, Trends Biochem Sci 24:146-150 is PMID:10322420; I can't find any Volume 12 in PubMed ...

> GO:ma

should be either GO:mah or FB:ma ... I'd have to look at the entries to see if I recognize them (and if not, use FB:ma). Do you have a list?

m

Original comment by: mah11

gocentral commented 20 years ago

Logged In: YES user_id=735846

Hi Midori,

Terms with these defs have the ref GO:ma. I think if you want the term names then DAG-Edit might be the quickest way.

"The addition of an alkyl group to a protein amino acid. An alkyl group is any group derived from an alkane by removal of one hydrogen atom." [GO:ma]
"The modification of peptidyl-aspartic acid." [GO:ma]
"The modification of peptidyl-histidine." [GO:ma]
"The covalent or noncovalent linking of a chromophore to a protein." [GO:ma]
"The deamination of valine to form isobutyrate." [GO:ma]
"Catalysis of the hydrolysis of various forms of polymeric ubiquitin-like sequences (e.g. APG8\, ISG15\, NEDD8\, SUMO). Will remove ubiquitin-like sequences from larger leaving groups." [GO:ma]
"The formation from simpler components of any amino acid that does not normally occur as a constituent residue of proteins." [GO:ma]
"The formation from simpler components of L-ascorbic acid; L-ascorbic acid ionizes to give L-ascorbate\, which is required as a cofactor in the oxidation of prolyl residues to hydroxyprolyl\, and other reactions." [ISBN:0198547684, GO:ma]
"Interacting selectively with an immunoglobulin." [GO:ma]
"The formation from simpler components of diaminopimelate\, both as an intermediate in lysine biosynthesis and as a component (as meso-diaminopimelate) of the peptidoglycan of Gram-negative bacterial cell walls." [ISBN:0198547684, GO:ma]
"The covalent linking of a chromophore to a protein via peptidyl-cysteines." [GO:ma]
"The modification of the aminoacyl group of a charged tRNA." [GO:ma]
"Interacting selectively with diacylglycerol\, a diester of glycerol and two fatty acids." [GO:ma]
"The processes that result in the fragmentation by proteolysis of antigens and the association of the resulting peptides with MHC molecules." [GO:ma]
"The processes of biogenesis and assembly of the ribosome and its subunits." [GO:ma]
"The assembly of the mature ribosome and of its subunits." [GO:ma]
"The assembly of the large and small ribosomal subunits into a functional ribosome." [GO:ma]
"The orientation of free radical substrates in such a way that only a particular stereoisomer is synthesized by an enzyme. Best characterized as a function during lignan biosynthesis." [GO:ma]
"The formation from simpler components of GDP-L-fucose from GDP-D-mannose via GDP-4-dehydro-6-deoxy-D-mannose\, requiring the functions of GDP-mannose 4\,6-dehydratase (EC:4.2.1.47) and GDP-4-dehydro-D-rhamnose reductase (EC:1.1.1.187)." [GO:ma]
"The formation of GDP-L-fucose from L-fucose\, without de novo synthesis. L-fucose is phosphorylated by fucokinase and then converted by fucose-1-phosphate guanylyltransferase (EC:2.7.7.30)." [GO:ma]
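A list of definitions citing one particular dbxref, like the one above, could also be pulled out with a short script. This is a sketch under the assumption that each line holds a quoted definition followed by a bracketed dbxref list. `defs_citing` is a hypothetical helper, and the last two sample lines are made up to show that GO:mah is not mistaken for GO:ma.

```python
import re

# Quoted definition text followed by a trailing [dbxref, ...] list.
DEF_LINE = re.compile(r'"(?P<text>.*)"\s*\[(?P<refs>[^\[\]]*)\]\s*$')

def defs_citing(lines, ref):
    """Return the definition texts whose dbxref list contains exactly `ref`."""
    hits = []
    for line in lines:
        match = DEF_LINE.search(line)
        if match is None:
            continue
        refs = [r.strip() for r in match.group("refs").split(",")]
        if ref in refs:  # whole-entry comparison, not a substring test
            hits.append(match.group("text"))
    return hits

sample = [
    '"The modification of peptidyl-histidine." [GO:ma]',
    '"Interacting selectively with an immunoglobulin." [GO:ma]',
    '"A made-up definition." [GO:mah]',
    '"Another made-up definition." [ISBN:0198547684, GO:ma]',
]
for text in defs_citing(sample, "GO:ma"):
    print(text)
```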

Jen

Original comment by: jenclark

gocentral commented 19 years ago

Logged In: YES user_id=735846

The curator dbxrefs have been standardized and the file with the list of standard dbxrefs is in CVS folder 'doc'.

Jen

Original comment by: jenclark

gocentral commented 19 years ago

Original comment by: jenclark