huaiyumi closed this issue 3 years ago
Multiple qualifiers are supported in both GAF 2.1 and 2.2 and separated by |
characters. Just need to ensure the createGAF.pl script is emitting NOT|contributes_to
for these example IRDs.
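A minimal sketch of the joining rule described above, in Python rather than the Perl of createGAF.pl, whose internals are not shown here. The function name `join_qualifiers` is hypothetical, and putting NOT first is an assumption made only to reproduce the `NOT|contributes_to` form mentioned in this thread:

```python
def join_qualifiers(qualifiers):
    """Combine every qualifier on a node into one GAF qualifier field.

    Hypothetical helper: it illustrates the pipe-separated format only
    and is not the logic used by createGAF.pl.
    """
    unique = []
    for q in qualifiers:
        # Skip empty values and duplicates, keeping first-seen order.
        if q and q not in unique:
            unique.append(q)
    # Stable sort: NOT moves to the front, the rest keep their order.
    unique.sort(key=lambda q: q != "NOT")
    return "|".join(unique)


print(join_qualifiers(["contributes_to", "NOT"]))
```

The point of the sketch is that all qualifiers on a node are emitted together, instead of one being picked and the others dropped.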
@huaiyumi With this fix, the resulting IBD file and your examples in particular look much better:
PANTHER PTN000806051 PTN000806051 NOT|contributes_to GO:0008121 PMID:21873635 IRD PANTHER:PTN002228081 F protein taxon:10228 20191016 GO_Central
PANTHER PTN000806047 PTN000806047 NOT|contributes_to GO:0008121 PMID:21873635 IRD PANTHER:PTN002228081 F protein taxon:684364 20191016 GO_Central
I sent you the full IBD file for review.
Looks good.
I noticed a few annotations with the IRD evidence code but no NOT qualifier in the IBD file. After going through some of the trees, I think the problem is caused by nodes that carry multiple qualifiers: the IBD GAF keeps only one of them, apparently chosen at random. Here are two examples:
PANTHER PTN000806051 PTN000806051 contributes_to GO:0008121 PMID:21873635 IRD PANTHER:PTN002228081 F protein taxon:10228 20191016 GO_Central
PANTHER PTN000806047 PTN000806047 NOT GO:0008121 PMID:21873635 IRD PANTHER:PTN002228081 F protein taxon:684364 20191016 GO_Central
PTHR10134:AN294
Both nodes carry both NOT and contributes_to, but only one of the two, picked at random, was written to the IBD file for each node. Here are a few families that are missing the NOT qualifier: PTHR10134, PTHR10221, PTHR11361, PTHR12604.
There could be other families that are missing the contributes_to qualifier.
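One way to look for other affected rows is to scan the IBD file for IRD annotations whose qualifier field lacks NOT. The sketch below is a rough check. It assumes a tab-separated GAF file with the qualifier in column 4 and the evidence code in column 7; the function name `ird_rows_missing_not` is hypothetical:

```python
def ird_rows_missing_not(lines):
    """Return the GAF rows with evidence code IRD and no NOT qualifier.

    Assumes tab-separated GAF lines: qualifier in column 4 (index 3),
    evidence code in column 7 (index 6). Header lines start with '!'.
    """
    missing = []
    for line in lines:
        if not line.strip() or line.startswith("!"):
            continue  # skip blank lines and GAF header/comment lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 7:
            continue  # too short to be an annotation row
        qualifiers = cols[3].split("|")
        if cols[6] == "IRD" and "NOT" not in qualifiers:
            missing.append(cols)
    return missing


sample = [
    "!gaf-version: 2.1\n",
    "PANTHER\tPTN000806051\tPTN000806051\tcontributes_to\tGO:0008121\tPMID:21873635\tIRD\n",
    "PANTHER\tPTN000806047\tPTN000806047\tNOT|contributes_to\tGO:0008121\tPMID:21873635\tIRD\n",
]
for row in ird_rows_missing_not(sample):
    print(row[1], row[3])
```

This flags only the missing-NOT case. Finding rows where contributes_to was dropped would need the qualifiers from the source trees for comparison, which the GAF file alone does not contain.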