biothings / mygeneset.info


Always return arrays for fields that can have multiple results #29

Closed vincerubinetti closed 3 years ago

vincerubinetti commented 3 years ago

Queries for human, for example, return results like:

{
  genes: [
    { ensembl: 1231, uniprot: 1231 },
    { ensembl: 1232, uniprot: 1232 },
    { ensembl: 1233, uniprot: 1233 }
  ]
}

But some other species, like pig, return results like this:

{
  genes: {
    ensembl: [ 1231, 1232, 1233 ],
    uniprot: [ 1231, 1232, 1233 ],
  }
}

If the mygeneset.info domain redirect were working, I'd point you to the current live app, which now just shows the raw JSON string of the genes (instead of a nice comma-separated list) since the data isn't in a predictable format.

ravila4 commented 3 years ago

@vincerubinetti Could it be that the geneset you found contains a single gene? Can you point me to some sample queries?

vincerubinetti commented 3 years ago

Here is the top result for Sus scrofa:

{"ensemblgene":["ENSSSCG00055021842","ENSSSCG00070015035","ENSSSCG00005019733","ENSSSCG00035056558","ENSSSCG00045022344","ENSSSCG00030059954","ENSSSCG00040012811","ENSSSCG00000039780"],"mygene_id":"100511223","name":"reticulon 4 receptor like 1","ncbigene":"100511223","symbol":["RTN4RL1"]}

I was thinking that these were separate genes because there are separate Ensembl IDs, but I guess it is one gene.

So the API seems to follow a pattern of returning a single result as a primitive and multiple results as an array, rather than always returning an array. I think it would be much better to consistently return the same type.
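Until the API returns a consistent type, a client can normalize each field itself. The following is only a minimal sketch of that idea: `as_list` is a hypothetical helper (not part of any BioThings client), and the record is an abbreviated copy of the pig result above.

```python
from typing import Any


def as_list(value: Any) -> list:
    """Hypothetical helper: wrap a scalar in a list, pass lists through,
    and map a missing value (None) to an empty list."""
    if value is None:
        return []
    if isinstance(value, (list, tuple)):
        return list(value)
    return [value]


# Abbreviated record modeled on the pig result quoted in this thread.
record = {
    "ensemblgene": ["ENSSSCG00055021842", "ENSSSCG00070015035"],
    "ncbigene": "100511223",  # scalar: only one value was returned
    "symbol": ["RTN4RL1"],
}

ensembl_ids = as_list(record.get("ensemblgene"))  # already a list
ncbi_ids = as_list(record.get("ncbigene"))        # scalar becomes a one-item list
uniprot_ids = as_list(record.get("uniprot"))      # absent field becomes []

# Every field can now be rendered the same way, e.g. as a comma-separated list.
print(", ".join(ensembl_ids))
```

With this in place, display code never has to branch on whether a field came back as a primitive or an array.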

ravila4 commented 3 years ago

That's right, some genes can have multiple Ensembl IDs, but they are in the minority. This issue has been discussed before in MyGene. See: https://github.com/biothings/mygene.info/issues/42.

For now, the solution if you want consistent data types for a particular value is to pass it to the "always_list" and/or "allow_null" parameters. For example: GET http://mygeneset.info/v1/query?species=pig&always_list=genes,genes.ensemblgene&allow_null=genes.ensemblgene
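As a rough illustration, the request above could be assembled in Python as follows. The endpoint and the `always_list` / `allow_null` parameter names come from the example URL; `build_query_url` is a hypothetical helper name, and the sketch only builds the URL without sending a request.

```python
from urllib.parse import urlencode

# Endpoint taken from the example request in this thread.
BASE_URL = "http://mygeneset.info/v1/query"


def build_query_url(species: str, always_list: list[str], allow_null: list[str]) -> str:
    """Hypothetical helper: build a query URL that asks the API to always
    return the given fields as lists and to allow null for the given fields."""
    params = {
        "species": species,
        "always_list": ",".join(always_list),
        "allow_null": ",".join(allow_null),
    }
    # Keep commas unescaped so the URL matches the form shown in the thread.
    return f"{BASE_URL}?{urlencode(params, safe=',')}"


url = build_query_url(
    species="pig",
    always_list=["genes", "genes.ensemblgene"],
    allow_null=["genes.ensemblgene"],
)
print(url)
```

The resulting URL can be passed to any HTTP client; the field paths listed in `always_list` are the ones a client would otherwise have to normalize by hand.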

There's also a better solution in the works at: https://github.com/biothings/mygene.info/issues/52

vincerubinetti commented 3 years ago

Very good, I'll consider this tracked by https://github.com/biothings/mygene.info/issues/52