MICommunity / psimi

Automatically exported from code.google.com/p/psimi
Creative Commons Attribution 4.0 International

Create PSI-MI TAB 2.6 implementation #2

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
The new PSI-MI TAB 2.6 should extend the current 2.5 with these 16
additional columns:

PSI-MI TAB 2.6

- expansion -> CV (spoke, matrix, none, bipartite)
- biological role A
- biological role B
- experimental role A
- experimental role B
- interactor type A
- interactor type B
- xrefs A
- xrefs B
- xrefs Interaction
- Annotations A
- Annotations B
- Host organism taxid
- parameters Interaction
- dataset
- Caution Interaction

Original issue reported on code.google.com by brunoaranda on 3 Jun 2009 at 12:52

GoogleCodeExporter commented 9 years ago

Original comment by brunoaranda on 3 Jun 2009 at 12:54

GoogleCodeExporter commented 9 years ago
This column was missing and will be added after Annotations A and B:

- Annotations Interaction

Original comment by brunoaranda on 3 Jun 2009 at 1:05

GoogleCodeExporter commented 9 years ago
Ok I have starred this.  Ian

Original comment by ian.o...@gmail.com on 3 Jun 2009 at 9:05

GoogleCodeExporter commented 9 years ago
Hi Bruno 

I have started on our documentation of the extended MITAB format.  See

http://irefindex.uio.no/wiki/README_iRefIndex_expanded_MITAB_proposal

I have a few questions (search for Bruno).

First, can you remind me what xrefs A and xrefs B are meant for and how they
differ from columns 3 and 4 (alt A and alt B)?

A note to MINT: your documentation for the MITAB file is outdated; it still
reflects 2.5.  See ftp://mint.bio.uniroma2.it/pub/release/MITAB/mitab-readme.txt

Ian

Original comment by ian.o...@gmail.com on 14 Jan 2010 at 6:35

GoogleCodeExporter commented 9 years ago
Hi,

The xrefs columns can be used for GO identifiers, InterPro identifiers, etc.
They define additional information about the molecules or the interaction.

The alternative ID columns can be used for identifiers of the molecules in
other databases. I am not too happy about this field, because we could just
use the ID columns for that, with no need to separate the identifiers
logically, but we have to maintain it for backward compatibility.

In addition, we still haven't implemented MITAB 2.6 (nor has MINT; we use the
same code). It has been in our backlog for some time and I hope we can tackle
this before the PSI spring meeting...

Cheers,

Bruno

Original comment by brunoaranda on 14 Jan 2010 at 10:23

GoogleCodeExporter commented 9 years ago
Ok.  If columns 23-24 (xrefs A and xrefs B) are for annotations on A and B (GO,
InterPro etc.), what are columns 26-27 (Annotations A and Annotations B) for?

It looks like there is also a column 28 that should be Annotations Interaction
(comment 2).  Right?  So the full list of new columns is

16- expansion -> CV (spoke, matrix, none, bipartite)
17- biological role A
18- biological role B
19- experimental role A
20- experimental role B
21- interactor type A
22- interactor type B
23- xrefs A
24- xrefs B
25- xrefs Interaction
26- Annotations A
27- Annotations B
28- Annotations Interaction
29- Host organism taxid
30- parameters Interaction
31- dataset
32- Caution Interaction
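
As a rough illustration of the numbering above, here is a minimal Python sketch that maps columns 16-32 of a tab-separated row to names. The identifiers (`MITAB26_EXTRA_COLUMNS`, `extra_columns`) are hypothetical and not part of any PSI-MI library, and the column set reflects only the list in this comment, not a final specification.

```python
# Hypothetical names for columns 16-32, copied from the list in this comment.
MITAB26_EXTRA_COLUMNS = [
    "expansion",                # 16
    "biological_role_a",        # 17
    "biological_role_b",        # 18
    "experimental_role_a",      # 19
    "experimental_role_b",      # 20
    "interactor_type_a",        # 21
    "interactor_type_b",        # 22
    "xrefs_a",                  # 23
    "xrefs_b",                  # 24
    "xrefs_interaction",        # 25
    "annotations_a",            # 26
    "annotations_b",            # 27
    "annotations_interaction",  # 28
    "host_organism_taxid",      # 29
    "parameters_interaction",   # 30
    "dataset",                  # 31
    "caution_interaction",      # 32
]

FIRST_EXTRA_COLUMN = 16  # columns 1-15 are the existing MITAB 2.5 columns


def extra_columns(line):
    """Return {column name: value} for columns 16-32 of one tab-separated row."""
    fields = line.rstrip("\r\n").split("\t")
    extras = fields[FIRST_EXTRA_COLUMN - 1:]
    if len(extras) != len(MITAB26_EXTRA_COLUMNS):
        raise ValueError(
            "expected %d extra columns, found %d"
            % (len(MITAB26_EXTRA_COLUMNS), len(extras))
        )
    return dict(zip(MITAB26_EXTRA_COLUMNS, extras))
```

A row with fewer or more than 32 columns raises `ValueError`, so a 2.5-style row with only 15 columns is rejected rather than silently padded.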

Original comment by ian.o...@gmail.com on 20 Jan 2010 at 9:32

GoogleCodeExporter commented 9 years ago
Hi,

We definitely need to add this information to
http://code.google.com/p/psimi/wiki/PsimiTabFormat at some point.

In theory, annotations would be free text comments or additional information
about A, B or, as you propose, the interaction too.

I see the names xref/annotations can be confusing. Maybe someone can think of
better names for this?

Xrefs: GO terms, interpro, etc...
Annotations: additional text information.

Original comment by brunoaranda on 22 Jan 2010 at 1:09

GoogleCodeExporter commented 9 years ago
Do you or Sandra have any examples for what might go into column 
30 (parameters Interaction)?  

You are free to use any of the text I have written to describe MITAB at
http://donaldson.uio.no/wiki/README_iRefIndex_expanded_MITAB_proposal

I am still working my way through questions I have on the format.  Would you
guys like to have a Skype call to discuss this a little faster?

Original comment by ian.o...@gmail.com on 22 Jan 2010 at 1:46

GoogleCodeExporter commented 9 years ago
I am adding the following text from an email from psimex-bounces@ebi.ac.uk
sent on May 6 2010.  This description captures suggested changes to PSI MITAB
to be implemented in version 2.6.  I believe these changes were discussed at
the HUPO-PSI Seoul meeting.
The new PSI-MITAB 2.6 format was then voted on by PSIMEx member institutes and
was accepted on May 19th, 2010.

Ian

#begin snippet

MITAB 2.6 Specification

This document specifies version 2.6 of MITAB.

The new version of the format adds the following columns to the standard set.

- expansion -> CV term (spoke, matrix, none, bipartite) -> e.g. psi-mi:"MI:01234"(spoke)
- (*) biological role (A,B)
- (*) experimental role (A,B)
- (*) interactor type (A,B)
- (*) xrefs (A,B,I)
- (*) Annotations (A,B,I) - dataset:*|dataset:*|data-processing:*
- (*) Host organism taxid
- (*) parameters (I)
- Creation date (yyyy/MM/dd)
- Update date (yyyy/MM/dd)
- Checksum (A,B,I)
- (*) Negative (true|false)

(*) Equivalent to its PSI-MI XML counterpart.
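
The field formats mentioned in the list above (the CV term example, the yyyy/MM/dd dates and the true|false flag) could be read along these lines. This is only a sketch in Python: the function names are hypothetical, and the regular expression assumes the `db:"id"(name)` shape of the single example given, which the text does not define formally.

```python
import re
from datetime import datetime

# Assumed shape, based only on the example psi-mi:"MI:01234"(spoke):
# database, colon, quoted identifier, term name in parentheses.
CV_TERM = re.compile(r'^(?P<db>[^:]+):"(?P<id>[^"]+)"\((?P<name>[^)]*)\)$')


def parse_cv_term(field):
    """Split a CV term field into (database, identifier, name)."""
    match = CV_TERM.match(field)
    if match is None:
        raise ValueError("not a CV term field: %r" % field)
    return match.group("db"), match.group("id"), match.group("name")


def parse_date(field):
    """Parse a creation or update date written as yyyy/MM/dd."""
    return datetime.strptime(field, "%Y/%m/%d").date()


def parse_negative(field):
    """Parse the negative flag, which is either 'true' or 'false'."""
    if field not in ("true", "false"):
        raise ValueError("expected 'true' or 'false', got %r" % field)
    return field == "true"
```

Each helper raises `ValueError` on input that does not match, which keeps malformed rows from passing through unnoticed.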

Checksum columns
For interactions involving only proteins, use a checksum such as
rigid:3ERiFkUFsm7ZUHIRJTx8ZlHILRA, where the checksum is calculated as
described in PMID 18823568.

This key is presently undefined for interactions involving a mixture of 
proteins and other interactor types (like small molecules).

For small molecules, use InChI strings, e.g.

inchi-string:"C(CC(C)(C)(C)(C))"

Metadata
In the header section of the file, metadata lines must be used (e.g.
to contain additional information of the file or the hge

Use '#' (the hash symbol) as the first character of these lines. The parser 
will ignore any line starting with #.
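
A reader that follows this rule might look like the Python sketch below. The function name `read_mitab` is hypothetical; instead of discarding the '#' lines outright, this version collects them separately so the data rows are unaffected while the metadata stays available.

```python
import io


def read_mitab(handle):
    """Read a MITAB stream, separating '#' metadata lines from data rows."""
    metadata, rows = [], []
    for raw in handle:
        line = raw.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        if line.startswith("#"):
            # Metadata line: never treated as an interaction row.
            metadata.append(line[1:].strip())
        else:
            rows.append(line.split("\t"))
    return metadata, rows


# Usage with an in-memory file; the content is made up for illustration.
sample = io.StringIO("# example metadata line\nidA\tidB\n")
metadata, rows = read_mitab(sample)
```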

Original comment by ian.o...@gmail.com on 10 Oct 2010 at 8:43