biolink / ontobio

python library for working with ontologies and ontology associations
https://ontobio.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

Converting GPAD to GAF can result in blank evidence codes if no ECO-to-GAF mapping exists #620

Closed dustine32 closed 2 years ago

dustine32 commented 2 years ago

Coming from https://github.com/geneontology/go-site/issues/1847 and potentially related to https://github.com/geneontology/pipeline/issues/283.

The GpadParser currently allows creation of GoAssociation objects whose ECO evidence codes do not map directly to a GAF evidence code (e.g. IDA, ISO). When these associations are written out to GAF, the evidence code column is left blank.

We should throw an error on any GPAD line that uses one of these unmapped ECO codes (e.g. ECO:0006003). A flag can be added to bypass this check, but the default will be to fail these lines.
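
A rough sketch of the proposed check, for illustration only. This is not ontobio's actual code: the `ECO_TO_GAF` table is a small assumed subset of the real mapping, and `gaf_evidence_code`, `allow_unmapped`, and the `report` list are hypothetical names standing in for the parser's own lookup, bypass flag, and report.

```python
from typing import Dict, List, Optional

# Assumed, illustrative subset of an ECO-to-GAF mapping (not ontobio's real table).
ECO_TO_GAF: Dict[str, str] = {
    "ECO:0000314": "IDA",
    "ECO:0000266": "ISO",
}


def gaf_evidence_code(
    eco_id: str, report: List[str], allow_unmapped: bool = False
) -> Optional[str]:
    """Return the GAF code for eco_id, or None if the GPAD line should be rejected.

    Hypothetical helper: by default an unmapped ECO code is recorded in the
    report and the line fails; allow_unmapped bypasses the check.
    """
    code = ECO_TO_GAF.get(eco_id)
    if code is not None:
        return code
    if allow_unmapped:
        # Bypass requested: keep the line and accept a blank evidence column.
        return ""
    # Default: fail the line loudly instead of silently writing a blank code.
    report.append(f"ERROR: no GAF evidence code mapping for {eco_id}")
    return None


report: List[str] = []
mapped = gaf_evidence_code("ECO:0000314", report)    # has a mapping in the table
unmapped = gaf_evidence_code("ECO:0006003", report)  # no mapping: line is rejected
print(mapped, unmapped, report)
```

Returning `None` for a rejected line, rather than an empty string, keeps "rejected" distinguishable from "bypassed with a blank code", so the caller can skip the line and still surface the error in the report.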

Tagging @kltm

kltm commented 2 years ago

For those reading along: this would also generate an error in the report, so the failure would not be silent (silent failure is part of the current problem in some cases).

dustine32 commented 2 years ago

Testing in GO pipeline shows that the GPAD parse step is now correctly filtering out and reporting ECO codes that cannot be mapped to a GAF code.