biothings / biothings.species

This implements the species API as part of the BioThings API
Apache License 2.0

Behavior of `has_gene` parameter #4

Closed zcqian closed 3 years ago

zcqian commented 3 years ago

Current implementation: https://github.com/biothings/biothings.species/blob/01c7d89c002a9adfb392c351478500a1a2700a29/src/web/pipeline.py#L32-L33

Given the implementation, my particular question is what should happen when has_gene is set to False. Judging by the line beneath it (Line 33), I think it's meant to query documents with has_gene==False, but right now the parameter is simply ignored in that case.

Which is the expected behavior, @newgene ?

newgene commented 3 years ago

The has_gene field has only two states, True or missing, so False is effectively the same as "unspecified".

zcqian commented 3 years ago

Which field do you mean, in the document or in the request?

The document can definitely have has_gene set to false, as shown here: https://github.com/biothings/biothings.species/blob/acf2fca655427996380446e57350605c54a9fae8/src/hub/databuild/mapper.py#L23-L29

When sending a request you can set has_gene to 0 using GET, or use JSON in a POST request and explicitly set it to false.
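To make the two request styles concrete, here is a minimal sketch that only builds the GET URL and the POST body without sending anything. The base URL, the `q` value, and the helper names are placeholders for illustration, not the real endpoint or any project code.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint, used only for illustration.
BASE_URL = "https://example.org/v1/query"


def build_get_url(q, has_gene):
    """Build a GET URL where the boolean is sent as 0 or 1."""
    params = {"q": q, "has_gene": int(has_gene)}
    return f"{BASE_URL}?{urlencode(params)}"


def build_post_body(q, has_gene):
    """Build a JSON body where the boolean is sent as true or false."""
    return json.dumps({"q": q, "has_gene": has_gene})


if __name__ == "__main__":
    print(build_get_url("lytic", False))
    print(build_post_body("lytic", False))
```

In both cases the server receives an explicit false value rather than a missing parameter, which is the case this issue is asking about.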

newgene commented 3 years ago

I see, I did not know we still set has_gene to False in the document. That's fine too. In that case, let's add the possibility to query has_gene==False (or 0 on GET) so that it matches only documents with the "false" value.
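The behavior proposed here is effectively tri-state. A rough sketch follows, assuming an Elasticsearch-style term filter expressed as a plain dict. The function name `has_gene_filter` is hypothetical and this is not the code in `pipeline.py`.

```python
from typing import Optional


def has_gene_filter(has_gene: Optional[bool]) -> Optional[dict]:
    """Return a term filter for has_gene, or None when no filter applies.

    None  -> parameter not given, do not filter
    True  -> match only documents with has_gene == true
    False -> match only documents with has_gene == false
    """
    if has_gene is None:
        return None
    return {"term": {"has_gene": has_gene}}
```

This only works if the calling code can still tell "not given" (None) apart from an explicit False when the filter is built.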

zcqian commented 3 years ago

After some research I realized that this field is always set by the time the request is processed in the query pipeline, so an explicit false cannot be told apart from an omitted parameter. Therefore we cannot filter for has_gene==false.

The related tests have been removed and a comment has been added in https://github.com/biothings/biothings.species/commit/77733e178e52277b3d2394e80ed5c133f2929c19
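For anyone reading later, here is a sketch of why an always-set parameter rules out filtering on false. It assumes the parameter is coerced to a boolean with a default of False before the pipeline sees it. The function `parse_has_gene` and its coercion rules are assumptions for illustration, not the actual BioThings argument handling.

```python
def parse_has_gene(raw_params: dict) -> bool:
    """Coerce has_gene to a bool, defaulting to False when absent (assumed)."""
    value = raw_params.get("has_gene", False)
    if isinstance(value, str):
        # GET parameters arrive as strings such as "0", "1", "true".
        return value.lower() in ("1", "true")
    return bool(value)


# Three different requests, one indistinguishable result.
omitted = parse_has_gene({})                      # parameter not sent
get_zero = parse_has_gene({"has_gene": "0"})      # GET ?has_gene=0
json_false = parse_has_gene({"has_gene": False})  # POST {"has_gene": false}
```

All three values end up as False, so by the time the filter is built there is no way to know whether the caller asked for has_gene==false or said nothing at all.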