Open sverhoeven opened 7 years ago
The variants are repeated across haplotypes; only the genotypes field differs.
For example
{ "haplotypes": [{ "accessions": [], "haplotype_id": "xxx", "sequence": "XXX", "variants": [{ "chrom": "chrX", "position": 1234, "genotypes": [{ "accession": "acc1", "genotype": "[1, 1]" }] }] }], "hierarchy": {} }
The variant is repeated for each haplotype, with each haplotype having a different genotypes value. We should pull the variant object into a variants map in the root of the response. We could change this to:
{ "haplotypes": [{ "accessions": [], "haplotype_id": "xxx", "sequence": "XXX", "variants": [{ "variant_id": "varXXX", "genotypes": [{ "accession": "acc1", "genotype": "[1, 1]" }] }] }], "hierarchy": {}, "variants": { "varXXX": { "chrom": "chrX", "position": 1234 } } }
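A minimal sketch of how the current response could be converted to the proposed schema. The function name `pack_variants` and the `chrom:position` ID scheme are assumptions for illustration only; the issue does not say how `variant_id` values such as `varXXX` would be generated.

```python
import json


def pack_variants(response):
    """Return a copy of the response with shared variant fields moved to a root-level map."""
    variants = {}
    packed_haplotypes = []
    for haplotype in response["haplotypes"]:
        packed_variants = []
        for variant in haplotype["variants"]:
            # Hypothetical ID scheme (assumption): chrom and position identify a variant.
            variant_id = "{0}:{1}".format(variant["chrom"], variant["position"])
            # Everything except the genotypes is shared between haplotypes.
            shared = {key: value for key, value in variant.items() if key != "genotypes"}
            variants.setdefault(variant_id, shared)
            packed_variants.append({
                "variant_id": variant_id,
                "genotypes": variant["genotypes"],
            })
        # Shallow-copy the haplotype so the input is left untouched.
        packed_haplotype = dict(haplotype)
        packed_haplotype["variants"] = packed_variants
        packed_haplotypes.append(packed_haplotype)
    packed_response = dict(response)
    packed_response["haplotypes"] = packed_haplotypes
    packed_response["variants"] = variants
    return packed_response


# Made-up example data in the shape of the current response.
original = {
    "haplotypes": [
        {
            "accessions": [],
            "haplotype_id": "h1",
            "sequence": "XXX",
            "variants": [{
                "chrom": "chrX",
                "position": 1234,
                "genotypes": [{"accession": "acc1", "genotype": "[1, 1]"}],
            }],
        },
        {
            "accessions": [],
            "haplotype_id": "h2",
            "sequence": "XXX",
            "variants": [{
                "chrom": "chrX",
                "position": 1234,
                "genotypes": [{"accession": "acc2", "genotype": "[0, 1]"}],
            }],
        },
    ],
    "hierarchy": {},
}

packed = pack_variants(original)
print(json.dumps(packed, indent=2))
```

The sketch keeps the per-haplotype genotypes in place and stores each variant's shared fields once, so the saving grows with the number of haplotypes that share a variant.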
This should reduce the JSON response in size and make the server side Python 2 JSON conversion faster.
A quick file-size test, converting the current JSON to the new schema:
./haplotypes.orig.json 1647926
./haplotypes.packed.json 724572
The smaller payload should also make the JSON encoding quicker.