Regarding BacDive-to-media edges: a total of 32513 NCBITaxon -> medium edges are ingested from BacDive.
However, 1609 BacDive taxa have no NCBITaxon match. In addition, there are 257 bacdive:None media associations; this looks like an ingest artefact, perhaps a Python None case that needs to be caught. It is likely that the current ingest does not run NER against NCBITaxon and instead just uses the BacDive NCBITaxon field when available.
BacDive IDs that failed NCBITaxon NER = 1609. Example edge:
urn:uuid:77bd5911-8a5c-470e-9f09-d419ab11b6c2 bacdive:8540 biolink:occurs_in mediadive.medium:645 BAO:0002924 Graph bacdive:8540
bacdive:None = 257. Example edge:
urn:uuid:e12a5c72-ee3f-4bc6-9c79-ee4ff1b9f410 bacdive:None biolink:occurs_in mediadive.medium:C66 BAO:0002924 Graph bacdive:None
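One plausible source of the bacdive:None subjects is a missing BacDive ID being formatted straight into a CURIE string. The sketch below is only an illustration of that guess, not the actual kg-microbe ingest code: the function names `make_bacdive_curie` and `tally_subjects` are hypothetical, and the category labels are chosen here for readability. It shows a guard that refuses to build a CURIE from a missing ID, plus a small tally that reproduces the kind of counts reported above from a list of edge subjects.

```python
from collections import Counter
from typing import Iterable, Optional

BACDIVE_PREFIX = "bacdive:"  # prefix as it appears in the example edges


def make_bacdive_curie(bacdive_id: Optional[object]) -> Optional[str]:
    """Return a BacDive CURIE, or None when the ID is missing (hypothetical helper)."""
    # Without this guard, f"bacdive:{None}" silently yields "bacdive:None".
    if bacdive_id is None or str(bacdive_id).strip() in ("", "None"):
        return None
    return f"{BACDIVE_PREFIX}{str(bacdive_id).strip()}"


def tally_subjects(subjects: Iterable[str]) -> Counter:
    """Count edge subjects by kind (labels are illustrative, not from the ingest)."""
    counts: Counter = Counter()
    for subject in subjects:
        if subject == "bacdive:None":
            counts["bacdive:None"] += 1
        elif subject.startswith(BACDIVE_PREFIX):
            counts["bacdive (no NCBITaxon match)"] += 1
        elif subject.startswith("NCBITaxon:"):
            counts["NCBITaxon"] += 1
        else:
            counts["other"] += 1
    return counts
```

Callers would skip (or log) any record for which `make_bacdive_curie` returns None instead of emitting an edge, and `tally_subjects` can be run over the subject column of the edge file to check that the bacdive:None count drops to zero.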
The CHEBI manual annotation file for unmatched cases is here: https://github.com/Knowledge-Graph-Hub/kg-microbe/blob/feba/kg_microbe/transform_utils/traits/chebi_manual_annotation.tsv. All of these cases involved a missing synonym.
Example GO unmatched term files are attached.
go_ner_unmatched.txt