pantherdb / fullgo_paint_update

Update of Panther and PAINT DBs with monthly GO release data

Place PANTHER:PTN as first value in IBA GAF with column #13

Closed dustine32 closed 6 years ago

dustine32 commented 6 years ago

From issue https://github.com/geneontology/go-annotation/issues/2055.

This would make the PANTHER:PTN ID easier to parse. Example:

PomBase SPAC110.01 ppk1 GO:0006468 PMID:21873635 IBA FB:FBgn0025625|MGI:MGI:104754|MGI:MGI:106924|MGI:MGI:1336173|MGI:MGI:1347559|MGI:MGI:1923020|MGI:MGI:2445031|MGI:MGI:2685946|PANTHER:PTN000678822|PomBase:SPAC57A10.02|PomBase:SPAC644.06c|PomBase:SPBC19C2.05|PomBase:SPBC4F6.06|PomBase:SPCC1020.10|PomBase:SPCC297.03|

This can probably be done in createGAF.pl.
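As a rough illustration of the requested "PANTHER-first" ordering, here is a minimal Python sketch. It is not the code in createGAF.pl (which is Perl and is not shown in this thread). The function name `panther_first` is hypothetical, and the sketch assumes the with/from value is a pipe-separated string like the example above.

```python
def panther_first(with_from: str) -> str:
    """Move PANTHER:PTN entries to the front of a pipe-separated with/from value.

    Hypothetical helper for illustration only; not taken from createGAF.pl.
    """
    # Drop empty entries, such as the one produced by a trailing "|".
    ids = [entry for entry in with_from.split("|") if entry]
    # list.sort() is stable: PANTHER entries (key False) come first, and the
    # relative order of all other entries is left unchanged.
    ids.sort(key=lambda entry: not entry.startswith("PANTHER:PTN"))
    return "|".join(ids)


example = "FB:FBgn0025625|MGI:MGI:104754|PANTHER:PTN000678822|PomBase:SPAC57A10.02|"
print(panther_first(example))
```

Note that this sketch also drops the trailing `|`; whether that separator should be kept in the output is an open choice.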

dustine32 commented 6 years ago

The createGAF script apparently already does this, though not explicitly. The reordering seems to happen on the GO pipeline side, but I'll wait to close this until that's confirmed.

pgaudet commented 6 years ago

@kltm Is there another ticket we can link to?

kltm commented 6 years ago

@pgaudet I'm not sure what other ticket you'd want to link to. I'm also not sure why PANTHER IDs need to be ordered first here. I could see wanting something like alphabetical ordering for easy comparison with tools (which we may already be doing in the GO pipeline? @dougli1sqrd), but I'm not sure of the use case here.

dustine32 commented 6 years ago

Source ticket: https://github.com/geneontology/go-annotation/issues/2055

kltm commented 6 years ago

@dustine32 Thank you for the source ticket. Hm. Honestly, I'm not sure about special-casing PANTHER: it is one of many curation tools whose annotations are all mixed together in the final product, and any of those tools could make a similar argument.

dustine32 commented 6 years ago

@kltm I don't think you'd need to special case PANTHER since in our source IBA GAF files we already do the "PANTHER-first" sorting. It's just that, from the examples that I've seen, it looks like the with/from column values somehow get jumbled up out of order in the GO pipeline. See this comment.

kltm commented 6 years ago

@dustine32 That's what I mean: we specifically put them into alphabetical order so that we have an absolute ordering to the refs. This makes it possible to do easy large-scale diffing of GAFs for minor changes as typically there is no semantic meaning to the order. Or to put it another way, we specifically change the order at our end to make a guarantee that we can use for analyzing GAFs across all inputs and upstream resources.
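A minimal Python sketch of why a fixed ordering helps with diffing: once both values are normalized, a pure reordering upstream no longer shows up as a difference. The function name `normalize_with_from` is hypothetical, and the exact sort rule the GO pipeline applies is not stated in this thread; Python's default string ordering is used here as an assumption.

```python
def normalize_with_from(with_from: str) -> str:
    """Return the pipe-separated with/from value with its entries sorted.

    Hypothetical helper; the real GO pipeline sort rule may differ
    (for example in how it treats case).
    """
    # Skip empty entries caused by leading or trailing "|" separators.
    entries = sorted(entry for entry in with_from.split("|") if entry)
    return "|".join(entries)


# Two values with the same references in a different order...
a = "PANTHER:PTN000678822|FB:FBgn0025625|MGI:MGI:104754"
b = "MGI:MGI:104754|PANTHER:PTN000678822|FB:FBgn0025625"

# ...compare equal after normalization, so a line-based diff stays quiet.
assert normalize_with_from(a) == normalize_with_from(b)
```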

pgaudet commented 6 years ago

@valwood Is this OK for you? Can you parse out the information you need?

dustine32 commented 6 years ago

@kltm Ahhh... that's a practical excuse.

ValWood commented 6 years ago

Yes, we didn't really need the reordering. We can manage the content as it is. I only reported this as an issue because I didn't think anyone would really want to display the many evidences. We can easily locate the PANTHER ID and use that, so for us it really won't be a problem.
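Locating the PANTHER ID regardless of its position might look like the following Python sketch. The name `find_panther_ids` and the pattern (`PANTHER:PTN` followed by digits) are assumptions based on the example in this thread, not a documented format.

```python
import re

# Assumed pattern: "PANTHER:PTN" followed by one or more digits.
PTN_PATTERN = re.compile(r"PANTHER:PTN\d+")


def find_panther_ids(with_from: str) -> list[str]:
    """Return every PANTHER:PTN ID found in a with/from value, in order."""
    return PTN_PATTERN.findall(with_from)


value = "FB:FBgn0025625|MGI:MGI:104754|PANTHER:PTN000678822|PomBase:SPAC57A10.02|"
print(find_panther_ids(value))
```

Because the lookup does not depend on position, it works the same whether the value is PANTHER-first or alphabetically sorted.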

kltm commented 6 years ago

@ValWood Thank you for your understanding--we've actually been using the current ordering we apply quite a lot. Cheers!