obophenotype / upheno-dev

Framework for the automated construction of uPheno 2.0
MIT License

Different sizes between HP-MP and HP-ZP mappings #30

Open liulizhi1996 opened 3 years ago

liulizhi1996 commented 3 years ago

I used AML to automatically generate ontology mappings between HPO and MPO, and between HPO and ZPO. However, the number of mappings differs greatly: there are over 2,000 HP-MP mappings but only 177 HP-ZP mappings. The number of mappings between HP and other phenotype ontologies, such as the C. elegans Phenotype Ontology and the Xenopus Phenotype Ontology, is even smaller. The same pattern appears in https://data.monarchinitiative.org/upheno2/current/upheno-release/all/upheno_mapping_all.csv. Why is this happening? Why can't the ontology alignment tool identify as many HP-ZP mappings as HP-MP mappings?

Abbr: HPO = Human Phenotype Ontology; MPO = Mammalian Phenotype Ontology; ZPO = Zebrafish Phenotype Ontology; AML = AgreementMakerLight (https://github.com/AgreementMakerLight/AML-Project)
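
For anyone who wants to reproduce the per-pair counts, here is a minimal sketch of the tally. It assumes you have already extracted the two term identifiers of each mapping as a pair of strings; the column layout of the mapping file is not shown in this thread, so the function names and the sample identifiers below are hypothetical placeholders, not values from the release.

```python
import re
from collections import Counter

# Accepts a CURIE ("HP:0000001") or an OBO PURL
# ("http://purl.obolibrary.org/obo/HP_0000001") and captures the prefix.
_PREFIX_RE = re.compile(r"(?:^|/)([A-Za-z]+)[:_]\d+$")


def ontology_prefix(term_id):
    """Return the upper-cased ontology prefix of a term ID, or 'UNKNOWN'."""
    match = _PREFIX_RE.search(term_id.strip())
    return match.group(1).upper() if match else "UNKNOWN"


def count_mappings_by_pair(pairs):
    """Count mappings per unordered ontology pair, e.g. ('HP', 'MP')."""
    counts = Counter()
    for left, right in pairs:
        key = tuple(sorted((ontology_prefix(left), ontology_prefix(right))))
        counts[key] += 1
    return counts


# Placeholder identifiers for illustration only (not real mappings).
sample_pairs = [
    ("HP:0000001", "MP:0000001"),
    ("http://purl.obolibrary.org/obo/HP_0000002",
     "http://purl.obolibrary.org/obo/MP_0000002"),
    ("ZP:0000001", "HP:0000003"),
]
pair_counts = count_mappings_by_pair(sample_pairs)
print(pair_counts)
```

The pair key is sorted so that an HP-ZP mapping and a ZP-HP mapping land in the same bucket, whichever column each identifier came from.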

matentzn commented 3 years ago

The very general answer is that most modern mapping tools are quite simplistic, and the labels in ZP are very different from those in MP and HP. We are exploring the possibility of using machine learning to create such mappings, but the truth is that it's not easy. This ticket is very useful, though, thank you! The real mapping (all the classes that should be mapped to each other) is much larger than the numbers you are currently getting, and with a bit of domain knowledge and an understanding of the ontology structure you can get better mappings. At the moment, though, I can only tell you that we are working on it; there is no fully satisfactory answer yet.
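
To illustrate the label problem, here is a rough sketch of a purely lexical matcher based on token overlap (Jaccard similarity). This is not AML's algorithm, only a stand-in for "simplistic" label matching. The labels are illustrative examples written in the style of each ontology and have not been checked against a release; the function names and the 0.8 threshold are assumptions.

```python
import re


def tokens(label):
    """Lower-case a label and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", label.lower()))


def jaccard(label_a, label_b):
    """Token-set Jaccard similarity between two labels (0.0 to 1.0)."""
    ta, tb = tokens(label_a), tokens(label_b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def lexical_matches(source_labels, target_labels, threshold=0.8):
    """Return all label pairs whose token overlap reaches the threshold."""
    return [
        (s, t)
        for s in source_labels
        for t in target_labels
        if jaccard(s, t) >= threshold
    ]


# Illustrative labels only (assumed style, not verified against a release).
hp_labels = ["Microphthalmia", "Abnormal heart morphology"]
mp_labels = ["microphthalmia", "abnormal heart morphology"]
zp_labels = ["eye decreased size, abnormal", "heart shape, abnormal"]

hp_mp = lexical_matches(hp_labels, mp_labels)
hp_zp = lexical_matches(hp_labels, zp_labels)
print(len(hp_mp), "HP-MP matches;", len(hp_zp), "HP-ZP matches")
```

With these sample labels, the HP and MP terms share nearly identical wording and match, while the entity-plus-quality phrasing of the ZP-style labels shares too few tokens with the HP labels to pass the threshold, even where the phenotypes are related.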

Maybe you can get something interesting from https://archive.monarchinitiative.org/latest/owlsim/ (the owlsim.cache file) in the meantime, but there will be quite a bit of noise.

I will come back to you when I know more!