PROconsortium / PRoteinOntology


PR:Trembl records to allow in files #128

Closed nataled closed 5 years ago

nataled commented 7 years ago

This is mostly a list of ids that UniProt uses in their mouse GAF for isoforms. The ids are in PRO but are "dynamic", and are not in any of the files (working version or full).

I have attached the file along with a PMID to the reference that was used to make the annotation as proof of existence. Adding them to the obo files will allow us to convert the ids to PRO ids as we load the annotations.

A few of the ids have been used by our curators (old), but the id is a UniProt ID. Once the PRO equivalent is released to the obo files, we can have these in our GPI files and then convert them to PRO ids in our interfaces.
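The conversion step described above could be sketched roughly as follows. This is only an illustration, not MGI's actual loader: the function names `build_xref_map` and `convert_ids` are hypothetical, and the OBO text is trimmed to the `id` and `xref` tags of two stanzas quoted later in this thread. The idea is that an id can only be converted once a released `[Term]` stanza carries the matching `xref: UniProtKB:` line; anything else is left untouched.

```python
from typing import Dict, Iterable, List


def build_xref_map(obo_lines: Iterable[str]) -> Dict[str, str]:
    """Map each UniProtKB xref to the id of the [Term] stanza that carries it."""
    mapping: Dict[str, str] = {}
    in_term = False
    current_id = None
    for raw in obo_lines:
        line = raw.strip()
        if line.startswith("[") and line.endswith("]"):
            # A new stanza starts; only [Term] stanzas are of interest here.
            in_term = line == "[Term]"
            current_id = None
        elif in_term and line.startswith("id: "):
            current_id = line[len("id: "):].strip()
        elif in_term and current_id and line.startswith("xref: UniProtKB:"):
            xref = line[len("xref: "):].split()[0]
            mapping[xref] = current_id
    return mapping


def convert_ids(ids: Iterable[str], mapping: Dict[str, str]) -> List[str]:
    """Replace ids that have a released PRO term; keep the others unchanged."""
    return [mapping.get(identifier, identifier) for identifier in ids]


# Trimmed stanzas (id and xref only) taken from terms quoted in this thread.
OBO_TEXT = """\
[Term]
id: PR:M0QWP1
xref: UniProtKB:M0QWP1

[Term]
id: PR:A2ASQ1-1
xref: UniProtKB:A2ASQ1-1
"""

XREF_MAP = build_xref_map(OBO_TEXT.splitlines())
CONVERTED = convert_ids(
    ["UniProtKB:M0QWP1", "UniProtKB:A2ASQ1-1", "UniProtKB:Q9ERP2"], XREF_MAP
)
print(CONVERTED)
```

Looking ids up through the `xref` lines, rather than simply swapping the `UniProtKB:` prefix for `PR:`, keeps an id unconverted until its term is actually present in the file.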

Reported by: hdrabkin

nataled commented 7 years ago

Original comment by: nataled

nataled commented 7 years ago

Looking at a few of these, I note that they cannot be made automatically, and cannot be simply imported from the dynamic term maker.

Original comment by: nataled

nataled commented 7 years ago

OK, will look at this tomorrow.

Ceci


[term-requests:#117] PR:Trembl records to allow in files

Status: accepted
Group:
Created: Wed May 31, 2017 08:28 PM UTC by Harold J. Drabkin
Last Updated: Thu Jun 01, 2017 03:51 PM UTC
Owner: Cecilia Arighi
Attachments:


Cecilia Arighi, PhD
ORCID: 0000-0002-0803-4817
Research Associate Professor
Center of Bioinformatics and Computational Biology
Department of Computer and Information Sciences
University of Delaware

UniProt: http://www.uniprot.org
BioCreative: http://www.biocreative.org
Protein Ontology: http://proconsortium.org/
PIR: http://proteininformationresource.org/
International Society for Biocuration: http://biocuration.org

Original comment by: carighi

nataled commented 7 years ago

The first example (PR:Q9ERP2) in the file needs review by MGI. Q9ERP2 corresponds to the N terminus of the short isoform; it has a distinct N-terminal fragment and could correspond to isoform 1 or 2 (see Figure 1 attached). Also, the PMID provided (PMID:64996) is incorrect.

If the intent is to annotate the short isoform, a union term encompassing isoforms 1 and 2 should be created.

For the other isoform, the long one, this is the term:

[Term]
id: PR:M0QWP1
name: agrin isoform m4 (mouse)
def: "An agrin (mouse) that is a translation product of some mRNA giving rise to a protein with the amino acid sequence represented by UniProtKB:M0QWP1." [PRO:CNA, PMID:11018052]
comment: Category=organism-sequence.
synonym: "mAGRN/iso:m4" EXACT PRO-short-label [PRO:DNx]
synonym: "LN-agrin (mouse)" EXACT [PMID:11018052]
synonym: "agrin long NH2 terminus isoform (mouse)" EXACT []
xref: UniProtKB:M0QWP1
is_a: PR:A2ASQ1 ! agrin (mouse)
relationship: only_in_taxon NCBITaxon:10090 ! Mus musculus
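For anyone who wants to read a stanza like this programmatically, here is a minimal sketch. It is not a full OBO parser; the helper names `parse_stanza`, `strip_comment`, and `synonym_texts` are hypothetical, and the `def` and `comment` lines are left out of the sample for brevity. It collects tag/value pairs, drops the trailing `! ...` comments, and pulls the quoted synonym strings.

```python
import re
from collections import defaultdict
from typing import Dict, List

# The term from this comment, without the def and comment lines.
STANZA = """\
[Term]
id: PR:M0QWP1
name: agrin isoform m4 (mouse)
synonym: "mAGRN/iso:m4" EXACT PRO-short-label [PRO:DNx]
synonym: "LN-agrin (mouse)" EXACT [PMID:11018052]
synonym: "agrin long NH2 terminus isoform (mouse)" EXACT []
xref: UniProtKB:M0QWP1
is_a: PR:A2ASQ1 ! agrin (mouse)
relationship: only_in_taxon NCBITaxon:10090 ! Mus musculus
"""


def parse_stanza(text: str) -> Dict[str, List[str]]:
    """Collect every 'tag: value' line; a tag may occur more than once."""
    tags = defaultdict(list)
    for line in text.splitlines():
        if ": " not in line:
            continue  # skips the [Term] header and blank lines
        tag, value = line.split(": ", 1)
        tags[tag].append(value.strip())
    return dict(tags)


def strip_comment(value: str) -> str:
    """Drop the human-readable ' ! ...' part that follows an id."""
    return value.split(" ! ", 1)[0].strip()


SYNONYM_RE = re.compile(r'^"(?P<text>[^"]*)"\s+(?P<scope>\w+)')


def synonym_texts(tags: Dict[str, List[str]]) -> List[str]:
    """Return the quoted text of each synonym line."""
    texts = []
    for value in tags.get("synonym", []):
        match = SYNONYM_RE.match(value)
        if match:
            texts.append(match.group("text"))
    return texts


TAGS = parse_stanza(STANZA)
print(TAGS["id"][0], strip_comment(TAGS["is_a"][0]))
print(synonym_texts(TAGS))
```

Splitting on the first `": "` (colon plus space) keeps values such as `"mAGRN/iso:m4"` intact, since the colon inside the synonym is not followed by a space.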

From: "Darren Natale" darren_natale@users.sf.net
To: "[pro-obo:term-requests]" 117@term-requests.pro-obo.p.re.sf.net
Sent: Thursday, June 1, 2017 11:51:10 AM
Subject: [pro-obo:term-requests] #117 PR:Trembl records to allow in files

* status: open --> accepted
* assigned_to: Cecilia Arighi
* Group: -->

Attachments:

* DynamicPRO.txt (2.2 kB; text/plain)


Original comment by: carighi

nataled commented 7 years ago

Here is a look at the first few entries. The columns are as follows, and the last three columns are my edits:

MGI column data | name & PMID | PR assigned by PRO | PRO assessment | HIERARCHY term level

PR:Q9ERP2 | Agrn; dynamic PMID:64996 | | incorrect PMID, should be union of isoform 1 and isoform 2 |
PR:M0QWP1 | Agrn dynamic PMID:11018052 | PR:M0QWP1 | done | organism-sequence
UniProtKB:A0A0R4J0I9 | Lrp1 dynamic | PR:Q91ZX7 | PMID missing, sequence is identical to SP entry. | organism-gene
UniProtKB:A8R0T9 | Esp6 PR version dynamic PMID:17935991 | | WRONG ACCESSION, ESP6 is A8R0U0 (see suppl material seq S.1) |
UniProtKB:A8R0U0 | Esp6 PR version dynamic PMID:17935991 | PR:A8R0U0 | DONE | organism-gene
UniProtKB:B8QI33 | Ppfia1 dynamic; PMID:19013515 | PR:B8QI33 | DONE, seq fig 2 | organism-gene
UniProtKB:B8QI36 | Ppfia4 dynamic PMID:19013515 | PR:B8QI36 | DONE, seq fig 2 | organism-gene
UniProtKB:D3Z2B1 | Lrp11 PMID:25262641 | | why this Accession over the SP Q8CB67? |
UniProtKB:D3Z3A7 | Lrp11 PMID:25262641 | | why this Accession over the SP Q8CB67? |
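A listing like the one above can be triaged with a few lines of code. The sketch below is illustrative only: the pipe-delimited rows are transcribed from this comment, the column boundaries are my reading of the text, and the names `Row`, `parse_row`, `unresolved`, and `remapped` are hypothetical. It separates entries that still have no PRO id from entries that were assigned a PRO id under a different accession than the one submitted.

```python
from typing import List, NamedTuple, Tuple


class Row(NamedTuple):
    submitted: str    # id as submitted (MGI column data)
    name_pmid: str    # gene name and PMID as given
    pr_assigned: str  # PRO id assigned, empty if none
    assessment: str   # curator assessment
    level: str        # hierarchy term level, empty if none


# A subset of the rows above; empty cells mean nothing was recorded.
LINES = [
    "PR:Q9ERP2|Agrn; dynamic PMID:64996||incorrect PMID, should be union of isoform 1 and isoform 2|",
    "PR:M0QWP1|Agrn dynamic PMID:11018052|PR:M0QWP1|done|organism-sequence",
    "UniProtKB:A0A0R4J0I9|Lrp1 dynamic|PR:Q91ZX7|PMID missing, sequence is identical to SP entry.|organism-gene",
    "UniProtKB:A8R0T9|Esp6 PR version dynamic PMID:17935991||WRONG ACCESSION, ESP6 is A8R0U0 (see suppl material seq S.1)|",
]


def parse_row(line: str) -> Row:
    """Split one pipe-delimited line into exactly five stripped cells."""
    cells = [cell.strip() for cell in line.split("|")]
    cells += [""] * (5 - len(cells))  # pad short rows
    return Row(*cells[:5])


def accession(curie: str) -> str:
    """Return the part after the prefix, e.g. 'PR:M0QWP1' -> 'M0QWP1'."""
    return curie.split(":", 1)[1]


def unresolved(rows: List[Row]) -> List[str]:
    """Submitted ids that have no PRO id assigned yet."""
    return [row.submitted for row in rows if not row.pr_assigned]


def remapped(rows: List[Row]) -> List[Tuple[str, str]]:
    """Entries whose assigned PRO id uses a different accession."""
    return [
        (row.submitted, row.pr_assigned)
        for row in rows
        if row.pr_assigned and accession(row.pr_assigned) != accession(row.submitted)
    ]


ROWS = [parse_row(line) for line in LINES]
print(unresolved(ROWS))
print(remapped(ROWS))
```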


Original comment by: carighi

nataled commented 7 years ago

I should probably add my notes to this thread. I formatted an Excel spreadsheet. I find that at least one gene is likely incorrect, and there are some surprises because of a bug in the dynamic generator that fails to report when the term already exists in PRO but under a different accession. I would check all these by hand, including alignments and checks of UniParc.
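The kind of check that the dynamic generator misses could look roughly like the sketch below. It is not the generator's code: `find_existing_terms` and `normalize` are hypothetical names, and the sequences are short toy strings, not the real protein sequences. It only catches exact sequence identity, so it does not replace the alignments and UniParc checks mentioned above.

```python
from typing import Dict


def normalize(sequence: str) -> str:
    """Remove whitespace and upper-case so formatting differences do not matter."""
    return "".join(sequence.split()).upper()


def accession(curie: str) -> str:
    """Return the part after the prefix, e.g. 'PR:Q91ZX7' -> 'Q91ZX7'."""
    return curie.split(":", 1)[1]


def find_existing_terms(
    candidates: Dict[str, str], existing: Dict[str, str]
) -> Dict[str, str]:
    """Report candidates whose sequence already exists under another accession.

    candidates: requested id -> sequence
    existing:   PRO id already in the ontology -> sequence
    """
    by_sequence: Dict[str, str] = {}
    for pro_id, sequence in existing.items():
        by_sequence.setdefault(normalize(sequence), pro_id)

    hits: Dict[str, str] = {}
    for requested, sequence in candidates.items():
        pro_id = by_sequence.get(normalize(sequence))
        if pro_id is not None and accession(pro_id) != accession(requested):
            hits[requested] = pro_id
    return hits


# Toy sequences for illustration only (assumption, not real data).
EXISTING = {"PR:Q91ZX7": "MKTAYIAKQR", "PR:M0QWP1": "MSLLPV"}
CANDIDATES = {
    "UniProtKB:A0A0R4J0I9": "mktay iakqr",  # same toy sequence, other accession
    "UniProtKB:M0QWP1": "MSLLPV",           # same accession, so not reported
    "UniProtKB:B8QI33": "MMCEV",            # no match
}

HITS = find_existing_terms(CANDIDATES, EXISTING)
print(HITS)
```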

Original comment by: nataled

nataled commented 5 years ago

I still need an id for the SN form of Agrn from PMID:11018052.

Original comment by: hdrabkin

nataled commented 5 years ago

SN-agrin is just the canonical isoform. I added the reference and synonym.

[Term]
id: PR:A2ASQ1-1
name: agrin isoform m1 (mouse)
def: "An agrin (mouse) that is a translation product of some mRNA giving rise to a protein with the amino acid sequence represented by UniProtKB:A2ASQ1-1." [PRO:DNx, UniProtKB:A2ASQ1, PMID:11018052]
comment: Category=organism-sequence.
synonym: "mAGRN/iso:m1" EXACT PRO-short-label [PRO:DNx]
synonym: "agrin isoform TM-agrin (mouse)" EXACT [UniProtKB:A2ASQ1]
synonym: "SN-agrin (mouse)" EXACT [PRO:DAN, PMID:11018052]
synonym: "agrin isoform Transmembrane agrin (mouse)" EXACT [UniProtKB:A2ASQ1]
xref: UniProtKB:A2ASQ1-1
is_a: PR:A2ASQ1 ! agrin (mouse)
relationship: only_in_taxon NCBITaxon:10090 ! Mus musculus

Original comment by: nataled