monarch-initiative / pheval

A framework for empirical evaluation of phenotype matching and prioritisation
https://monarch-initiative.github.io/pheval/
Apache License 2.0

Add handling of chromosome naming in the phenopacket #307

Closed: yaseminbridges closed this issue 5 months ago

yaseminbridges commented 5 months ago

The new phenopackets in the phenopacket store define the chromosome of the variant as either "chr1" or "1". We should implement handling in the codebase that removes the "chr" prefix so that variants match the results properly.

matentzn commented 5 months ago

Should this not always be done the same way in phenopackets? Is this because the standard changed recently?

yaseminbridges commented 5 months ago

The phenopacket schema is pretty broad on this, I think. The documentation says that the chrom field can be either a chromosome or a contig identifier. The original LIRICAL phenopackets just had it as 1-22, X & Y, but the new ones in the phenopacket store use chr1-22, chrX & chrY. It will be pretty easy to handle this in the code.
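
A minimal sketch of what that handling could look like, assuming the chromosome arrives as a plain string. The function names `normalise_chromosome` and `same_chromosome` are hypothetical and only for illustration; this is not PhEval's actual implementation.

```python
def normalise_chromosome(chrom: str) -> str:
    """Return the chromosome name without a leading "chr" prefix.

    Hypothetical helper: "chr1" -> "1", "chrX" -> "X", "1" -> "1".
    Other contig identifiers (e.g. "NC_000001.11") are returned unchanged.
    """
    chrom = chrom.strip()
    # Match the prefix case-insensitively, and only strip it when something
    # follows, so a bare "chr" is left untouched.
    if chrom.lower().startswith("chr") and len(chrom) > 3:
        return chrom[3:]
    return chrom


def same_chromosome(phenopacket_chrom: str, result_chrom: str) -> bool:
    """Compare two chromosome names after normalising both sides."""
    return normalise_chromosome(phenopacket_chrom) == normalise_chromosome(result_chrom)
```

Normalising both the phenopacket value and the result value before comparing means it does not matter which side carries the prefix.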

matentzn commented 5 months ago

Sure, but if the variation is arbitrary, i.e. purely lexical (chr prefix or no prefix), this seems an unnecessary burden on phenopackets users. Maybe mention my concern to Jules. This has nothing to do with your work at all, so go ahead and do whatever you need to get PhEval to work! Thanks for listening!

yaseminbridges commented 5 months ago

I understand what you mean now, I will bring it up to Jules!