monarch-initiative / phenopacket-store

Collection of phenopackets
https://monarch-initiative.github.io/phenopacket-store/
BSD 3-Clause "New" or "Revised" License

Wip/fixarch+tsv #73

Closed pnrobinson closed 5 months ago

pnrobinson commented 5 months ago

@tudorgroza Just tried this and it works perfectly. I am going to merge it so we can make the next release. One small thing we should add when there is time is to output a warning if we cannot extract the PMID (right now, it is coming from the externalReference element, and the LIRICAL phenopackets do not have this because it is missing in the script). Also, if we cannot find the gene we should probably output the name of the cohort (e.g., Jacobsen syndrome is a chromosomal cohort).
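A minimal sketch of the two follow-ups described above, assuming each phenopacket is loaded as a plain dict from its JSON and that the PMID is stored as an `id` of the form `PMID:...` under `metaData.externalReferences`. The function names `extract_pmid` and `gene_or_cohort` are hypothetical and are not taken from the repository.

```python
import warnings


def extract_pmid(phenopacket: dict):
    """Return the first PMID-style external reference id, or None with a warning."""
    # Assumed layout: metaData.externalReferences is a list of {"id": "PMID:..."} dicts.
    refs = phenopacket.get("metaData", {}).get("externalReferences", [])
    for ref in refs:
        ref_id = ref.get("id", "")
        if ref_id.startswith("PMID:"):
            return ref_id
    # No usable reference: warn instead of failing silently.
    packet_id = phenopacket.get("id", "<unknown>")
    warnings.warn(f"Could not extract PMID for phenopacket {packet_id}")
    return None


def gene_or_cohort(gene_symbol, cohort_name: str) -> str:
    """Fall back to the cohort name when no gene is found (e.g. a chromosomal cohort)."""
    return gene_symbol if gene_symbol else cohort_name
```

With this shape, a phenopacket that lacks the external reference produces a warning and an empty PMID column rather than silently dropping the information.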

tudorgroza commented 5 months ago

@pnrobinson - That's great ... will add the 2 things you've mentioned. What should I put in the two allele columns of the TSV? (Currently they are empty.)

pnrobinson commented 5 months ago

Many of the phenopackets have one or two mutations. Probably the easiest thing would be to add it to the PPacket class where the genomic interpretation is being ingested. If we find no mutation then we would leave the field empty (e.g. allele 2 is empty for dominant disease).

tudorgroza commented 5 months ago

Thank you. My question was - do you want variantInterpretation.expressions.value for hgvs.c or hgvs.g in the allele columns?

pnrobinson commented 5 months ago

If present, hgvs.c. Second choice would be hgvs.g. Third choice would be the description field. After that, give up and leave the field blank.
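The fallback order above could be sketched roughly as follows. This assumes each variant is available as a dict with an `expressions` list of `{"syntax": ..., "value": ...}` entries and an optional `description` string; `allele_label` and `allele_columns` are hypothetical helper names, not the code that was added in the PR.

```python
def allele_label(variation_descriptor: dict) -> str:
    """Pick a display string: hgvs.c, then hgvs.g, then description, else blank."""
    expressions = variation_descriptor.get("expressions", [])
    # Index expression values by their syntax for a simple priority lookup.
    by_syntax = {e.get("syntax"): e.get("value") for e in expressions}
    for syntax in ("hgvs.c", "hgvs.g"):
        value = by_syntax.get(syntax)
        if value:
            return value
    # Third choice is the free-text description; otherwise give up and return blank.
    return variation_descriptor.get("description") or ""


def allele_columns(descriptors: list) -> tuple:
    """Return (allele_1, allele_2); missing alleles are left as empty strings."""
    labels = [allele_label(d) for d in descriptors[:2]]
    labels += [""] * (2 - len(labels))
    return labels[0], labels[1]
```

For a dominant disease with a single variant, `allele_columns` leaves the second column empty, which matches the behaviour described earlier in the thread.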

tudorgroza commented 5 months ago

Ok - Added the allele info in the same PR.