varfish-org / mehari

VEP-like tool for sequence ontology and HGVS annotation of VCF files
MIT License

Genome builds have no unified format #408

Open gromdimon opened 3 months ago

gromdimon commented 3 months ago

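Describe the bug: Genome builds currently have different names/labels in different APIs. For example, the first request below passes genomeBuild=GENOME_BUILD_GRCH37, whereas the second passes genome_release=grch37: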

https://reev.cubi.bihealth.org/internal/proxy/mehari/genes/txs?hgncId=HGNC:4806&genomeBuild=GENOME_BUILD_GRCH37&pageSize=1000

https://reev.cubi.bihealth.org/internal/proxy/mehari/seqvars/csq?genome_release=grch37&chromosome=6&position=24302274&reference=T&alternative=C

To Reproduce: N/A

Expected behavior: There should be a single standard for naming genome builds, both the parameter name and its values, across the mehari APIs.

Additional context: N/A
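
To illustrate what one standard could look like on the Rust side, here is a minimal sketch of a single genome build type that accepts both spellings seen in the URLs above. The names GenomeBuild and UnknownGenomeBuild are hypothetical and not taken from the mehari code base. The GRCh38 variant and the choice of the lower-case form as the canonical output are also assumptions.

```rust
use std::fmt;
use std::str::FromStr;

/// Hypothetical unified genome build type (not taken from the mehari code base).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GenomeBuild {
    Grch37,
    /// Assumed second variant; only GRCh37 appears in the example URLs.
    Grch38,
}

/// Error returned when a label matches no known genome build.
#[derive(Debug, PartialEq, Eq)]
pub struct UnknownGenomeBuild(pub String);

impl fmt::Display for UnknownGenomeBuild {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "unknown genome build: {}", self.0)
    }
}

impl std::error::Error for UnknownGenomeBuild {}

impl FromStr for GenomeBuild {
    type Err = UnknownGenomeBuild;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Compare case-insensitively and drop the protobuf-style
        // enum prefix ("GENOME_BUILD_") if it is present.
        let upper = s.trim().to_ascii_uppercase();
        let label = upper
            .strip_prefix("GENOME_BUILD_")
            .unwrap_or(upper.as_str());
        match label {
            "GRCH37" => Ok(GenomeBuild::Grch37),
            "GRCH38" => Ok(GenomeBuild::Grch38),
            _ => Err(UnknownGenomeBuild(s.to_string())),
        }
    }
}

impl fmt::Display for GenomeBuild {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Assumed canonical spelling: lower case, as in the csq request.
        let label = match self {
            GenomeBuild::Grch37 => "grch37",
            GenomeBuild::Grch38 => "grch38",
        };
        f.write_str(label)
    }
}
```

With this sketch, both "GENOME_BUILD_GRCH37" and "grch37" parse to the same value via str::parse, and formatting a value always yields one spelling. A tolerant parser like this would only ease a transition. It does not replace agreeing on one parameter name and one set of values.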

xiamaz commented 3 months ago

The issue is that the type definitions for the seqvar/structvar endpoints are currently defined in Rust, whereas the tx endpoint is defined directly in protobuf.

All exposed API endpoints should be defined consistently. Putting the query definitions in protobuf might have the nice effect of allowing non-HTTP APIs, but might be less easy to use in Rust.

xiamaz commented 3 months ago

Currently, prost is used to interface with protobuf-defined structures, whereas everything else uses serde. Unfortunately, the two are not compatible.

As a general design, moving everything I/O-facing into protobuf might make sense.

xiamaz commented 3 months ago

Action plan