We need a strategy for handling phenopackets that lack the minimal data required for genotype-phenotype correlation analysis.
For instance, the phenopacket PMID_36446582_Novara2017_P2 does not include CNV coordinates and therefore ends up with no parsable variant.
Other fallible steps include looking up an HPO term that is unknown in the ontology version being used.
Requirements
errors are reported together with the phenopacket ID
ideally, all errors found in a phenopacket are reported, not just the first one
a phenopacket with errors is excluded from the analysis
Suggestion
define a new exception, e.g. GenoPhenoParseException(BaseException), which can be raised by PatientCreator. This makes it explicit in the parsing workflow that patient creation is fallible. The exception has fields for the sample ID and a list of specific issues.
revise PhenopacketPatientCreator such that the create_patient function keeps track of the issues it encounters, such as an unparsable variant, an invalid HPO term, ...
if any issues were recorded, pack them into the exception and raise it to indicate failure
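
A minimal sketch of the suggestion, to make the collect-then-raise idea concrete. It is not the actual implementation: the phenopacket is modelled as a plain dict with hypothetical keys ("id", "variants", "hpo_ids"), and create_cohort and known_hpo_ids are illustrative names. The sketch derives the exception from Exception rather than BaseException, which is an assumption on my part, since handlers written as `except Exception` would otherwise not catch it.

```python
from typing import List, Sequence, Set, Tuple


class GenoPhenoParseException(Exception):
    """Raised when a phenopacket cannot be turned into a patient."""

    def __init__(self, sample_id: str, issues: Sequence[str]):
        self.sample_id = sample_id
        self.issues = tuple(issues)
        super().__init__(f"{sample_id}: " + "; ".join(self.issues))


def create_patient(phenopacket: dict, known_hpo_ids: Set[str]) -> dict:
    """Hypothetical stand-in for PhenopacketPatientCreator.create_patient."""
    sample_id = phenopacket["id"]
    issues: List[str] = []

    # Collect every variant problem instead of stopping at the first one.
    variants = []
    for i, variant in enumerate(phenopacket.get("variants", [])):
        if variant.get("coordinates") is None:
            issues.append(f"variant #{i} has no coordinates")
        else:
            variants.append(variant)
    if not variants:
        issues.append("no parsable variant")

    # Same for HPO terms that are unknown in the ontology being used.
    phenotypes = []
    for hpo_id in phenopacket.get("hpo_ids", []):
        if hpo_id in known_hpo_ids:
            phenotypes.append(hpo_id)
        else:
            issues.append(f"unknown HPO term {hpo_id}")

    # Any recorded issue means failure; all issues travel with the exception.
    if issues:
        raise GenoPhenoParseException(sample_id, issues)
    return {"id": sample_id, "variants": variants, "phenotypes": phenotypes}


def create_cohort(
    phenopackets: Sequence[dict], known_hpo_ids: Set[str]
) -> Tuple[List[dict], List[GenoPhenoParseException]]:
    """Keep the patients that parsed; return the failures for reporting."""
    patients, failures = [], []
    for phenopacket in phenopackets:
        try:
            patients.append(create_patient(phenopacket, known_hpo_ids))
        except GenoPhenoParseException as e:
            failures.append(e)
    return patients, failures
```

With this shape, the caller can print each failure's sample_id and issues after the loop, and the failed phenopackets never reach the analysis.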