phenopackets / phenopacket-format


Define mapping from PomBase PAF #67

Open cmungall opened 8 years ago

cmungall commented 8 years ago

http://www.pombase.org/submit-data/phenotype-data-bulk-upload-format

This is quite flexible and could become an official TSV format for PXF (although the JSON/YAML will still be canonical).

cc @ValWood

ValWood commented 8 years ago

One problem with this format: we have been able to capture everything for single genes, but it doesn't handle multi-gene alleles (if we want to capture all of the contributing gene IDs).

So we will have another file for this soon: https://github.com/pombase/pombase-chado/issues/496

So far none of our users have requested to download the multi-gene phenotypes file. Most are interested in single-gene phenotypes, so there might be advantages to having them in separate files.

Any other suggestions welcome.

@mah11

cmungall commented 8 years ago

The way that would be most aligned with phenopackets is to make an entity identifier for the genotype object. @mbrush can give you some examples.

EDIT: OK, so that wouldn't quite work with PAF, as it already hardwires certain fields to be gene, allele, etc. It's kind of a PAF and a GtAF.
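To illustrate the distinction above, here is a minimal Python sketch. It is not the Phenopackets schema or the PomBase format: the class names, field names, and identifiers (`Genotype`, `PhenotypeAnnotation`, `genotype:1`, `PHENO:0000001`, etc.) are all hypothetical. It shows an annotation pointing at a genotype entity ID, which may cover several genes, and how a one-line-per-annotation single-gene export could still be derived from it.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Genotype:
    """Hypothetical genotype entity: one ID standing for one or more alleles."""
    id: str
    # Maps each contributing gene ID to an allele description (illustrative values).
    alleles: Dict[str, str] = field(default_factory=dict)


@dataclass
class PhenotypeAnnotation:
    """Hypothetical annotation that refers to an entity ID, not a fixed gene column."""
    entity_id: str
    phenotype_term: str


def single_gene_rows(genotypes: List[Genotype],
                     annotations: List[PhenotypeAnnotation]) -> List[str]:
    """Return tab-separated lines for annotations whose genotype involves exactly one gene."""
    by_id = {g.id: g for g in genotypes}
    lines = []
    for ann in annotations:
        genotype = by_id[ann.entity_id]
        if len(genotype.alleles) == 1:
            # Unpack the only (gene, allele) pair of a single-gene genotype.
            (gene_id, allele), = genotype.alleles.items()
            lines.append("\t".join([gene_id, allele, ann.phenotype_term]))
    return lines


genotypes = [
    Genotype("genotype:1", {"geneA": "geneA-delta"}),
    Genotype("genotype:2", {"geneA": "geneA-delta", "geneB": "geneB-ts"}),
]
annotations = [
    PhenotypeAnnotation("genotype:1", "PHENO:0000001"),
    PhenotypeAnnotation("genotype:2", "PHENO:0000002"),
]

for line in single_gene_rows(genotypes, annotations):
    print(line)
```

In this sketch the multi-gene genotype is simply left out of the flat export. It would need a separate file or a different row layout, which is the limitation of the hardwired gene and allele columns.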

ValWood commented 8 years ago

Can we have a call about this when Midori is back?

We can do multiple formats. We need to maintain a simple one-per-line GAF-style format for single-gene phenotypes for our biological end users who like Excel; otherwise we are flexible.

ValWood commented 8 years ago

File also for AmiGO loading :)

cmungall commented 8 years ago

Note WB are doing something similar, but with GAFs: https://github.com/WormBase/website/issues/1778

ValWood commented 8 years ago

Note that we are only trying to create a GAF for AmiGO loading (https://github.com/pombase/pombase-chado/issues/543), but this will not have all of the information, or there is no exact equivalent.

It would be useful if we could come up with a generic phenotype exchange format which would capture everything that is required (for biological end users, and for bulk submissions).

I had a quick look and don't fully understand all of the WormBase fields, and I guess any format needs to be able to cope with pre-composed phenotype ontology terms like ours and post-composed EQ stuff like WormBase.

Would it be useful to have a discussion with you, @khowe, @kyook, @mah11 and others interested, to see if we can come up with a common GAF equivalent? (We call ours PHAF: http://www.pombase.org/downloads/phenotype-annotations)

kyook commented 8 years ago

Hi All,

I've been looking at our phenotype_annotations GAF with the goal of updating it with attributions that are missing, such as anatomy, molecule, lifestage, etc. I've already asked Chris Mungall about the missing information and he suggests these values go into column 16, but there is a problem with missing gorelations information; we don't have those for phenotype.

So, it would be very useful to have a discussion about changes we should make to the file. Count me in!

Karen

