phenopackets / phenopacket-tools

An app and library for building, conversion, and validation of GA4GH Phenopackets.
http://phenopackets.org/phenopacket-tools/stable/
GNU General Public License v3.0

Improve format and element sniffing #171

Closed ielis closed 1 year ago

ielis commented 1 year ago

Improve format sniffing and implement element sniffing for JSON and YAML.

The PR fixes a bug in the simple format sniffing. The format sniffing algorithm first checks whether the input looks like a YAML or JSON file and falls back to Protobuf if it does not.
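For readers unfamiliar with the approach, here is a minimal sketch of the fallback order described above. It is not the code from this PR: the class name `FormatSniffSketch`, the `Format` enum and the specific heuristics (a leading `{` for JSON, a `---` marker or a leading `key:` line for YAML) are assumptions made for illustration only.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;

// Hypothetical sketch, not the phenopacket-tools implementation.
class FormatSniffSketch {

    enum Format { JSON, YAML, PROTOBUF }

    // Assumed heuristic: a YAML document start marker or a leading "key:" token.
    private static final Pattern YAML_START =
            Pattern.compile("^(---|[A-Za-z_][A-Za-z0-9_]*\\s*:)");

    /** Guess the format from the first bytes of the input. */
    static Format sniff(byte[] head) {
        String text = new String(head, StandardCharsets.UTF_8).stripLeading();
        if (text.startsWith("{")) {
            return Format.JSON;          // looks like a JSON object
        }
        if (YAML_START.matcher(text).find()) {
            return Format.YAML;          // looks like a YAML mapping
        }
        return Format.PROTOBUF;          // fallback when neither text format matches
    }

    public static void main(String[] args) {
        byte[] json = "{\"id\": \"example\"}".getBytes(StandardCharsets.UTF_8);
        byte[] yaml = "id: example\nsubject:\n  id: proband\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(sniff(json));
        System.out.println(sniff(yaml));
    }
}
```

Because Protobuf is only a fallback, any input that is neither JSON-like nor YAML-like is reported as Protobuf, including input that is not a phenopacket at all. A real implementation would still need to handle parse failures downstream.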

The element sniffing works for JSON and YAML formats. In JSON, the algorithm determines the element using discriminatory top-level fields, that is, field names that are unique to Phenopacket (e.g. subject), Family (e.g. pedigree), or Cohort (e.g. members). An exception is thrown if the available content does not contain a discriminatory field or if it contains fields from different top-level elements.
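The decision rule above can be sketched as follows. This is an illustration under stated assumptions, not the code from this PR: `ElementSniffSketch`, `Element` and `SniffException` are hypothetical names, the top-level field names are assumed to have been extracted already by a JSON or YAML parser, and only the three example fields named above are used as discriminators (the real set is presumably larger).

```java
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not the phenopacket-tools implementation.
class ElementSniffSketch {

    enum Element { PHENOPACKET, FAMILY, COHORT }

    /** Hypothetical exception type for input that cannot be classified. */
    static class SniffException extends RuntimeException {
        SniffException(String message) {
            super(message);
        }
    }

    // Only the example fields from the description; an assumed, incomplete table.
    private static final Map<String, Element> DISCRIMINATORS = Map.of(
            "subject", Element.PHENOPACKET,
            "pedigree", Element.FAMILY,
            "members", Element.COHORT);

    /** Decide the element from the names of the top-level fields. */
    static Element sniff(Set<String> topLevelFields) {
        EnumSet<Element> candidates = EnumSet.noneOf(Element.class);
        for (String field : topLevelFields) {
            Element element = DISCRIMINATORS.get(field);
            if (element != null) {
                candidates.add(element);
            }
        }
        if (candidates.isEmpty()) {
            throw new SniffException("No discriminatory field in " + topLevelFields);
        }
        if (candidates.size() > 1) {
            throw new SniffException("Fields from different elements: " + candidates);
        }
        return candidates.iterator().next();
    }

    public static void main(String[] args) {
        System.out.println(sniff(Set.of("id", "subject", "metaData")));
    }
}
```

Collecting all matching candidates before deciding, rather than returning on the first hit, is what lets the sketch reject input that mixes fields from different elements.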

Element sniffing does not work for Protobuf at the moment.