Open lhannest opened 6 years ago
Hmm, definitely undesirable. Can you try running the service locally and report what the Solr calls are (in the log)?
This reminds me that @tudorgroza emailed the same issue and I meant to turn it into a ticket, from Tudor:
" the search is for some reason 'stateful'. If I search for 'disease', all subsequent calls will return only disease, even if the category is not specified. If I change the category to 'gene', then again, all subsequent calls will return only genes. "
http://localhost:5000/api/search/entity/diabetes?rows=1&start=1
2018-02-27 15:49:48,791 - root - INFO - Using pre-loaded object: <ontobio.config.Config object at 0x7f98d6bac2e8>
2018-02-27 15:49:48,791 - root - INFO - PARAMS={'qt': 'standard', 'rows': 1, 'hl.simple.pre': '<em class="hilite">', 'facet.field': ['category', 'taxon_label'], 'hl': 'on', 'fq': [], 'hl.snippets': '1000', 'start': 1, 'facet.mincount': 1, 'facet.limit': 25, 'qf': ['iri_std^3', 'iri_kw^3', 'iri_eng^3', 'synonym_std^2', 'synonym_kw^2', 'synonym_eng^2', 'label_std^2', 'label_kw^2', 'label_eng^2', 'id_std^3', 'id_kw^3', 'id_eng^3', 'definition_std^2', 'definition_kw^2', 'definition_eng^2'], 'facet': 'on', 'fl': '*,score', 'defType': 'edismax', 'q': 'diabetes'}
2018-02-27 15:49:49,054 - root - INFO - Docs found: 290
2018-02-27 15:49:49,055 - werkzeug - INFO - 127.0.0.1 - - [27/Feb/2018 15:49:49] "GET /api/search/entity/diabetes?rows=1&start=1 HTTP/1.1" 200 -
The entity returned is HP:0005978 with category Phenotype.
http://localhost:5000/api/search/entity/diabetes?rows=1&start=1&category=gene
2018-02-27 16:07:37,428 - root - INFO - Using pre-loaded object: <ontobio.config.Config object at 0x7f98d6bac2e8>
2018-02-27 16:07:37,428 - root - INFO - PARAMS={'qt': 'standard', 'rows': 1, 'hl.simple.pre': '<em class="hilite">', 'facet.field': ['category', 'taxon_label'], 'hl': 'on', 'fq': ['category:"gene"'], 'hl.snippets': '1000', 'start': 1, 'facet.mincount': 1, 'facet.limit': 25, 'qf': ['iri_std^3', 'iri_kw^3', 'iri_eng^3', 'synonym_std^2', 'synonym_kw^2', 'synonym_eng^2', 'label_std^2', 'label_kw^2', 'label_eng^2', 'id_std^3', 'id_kw^3', 'id_eng^3', 'definition_std^2', 'definition_kw^2', 'definition_eng^2'], 'facet': 'on', 'fl': '*,score', 'defType': 'edismax', 'q': 'diabetes'}
2018-02-27 16:07:37,662 - root - INFO - Docs found: 44
2018-02-27 16:07:37,663 - werkzeug - INFO - 127.0.0.1 - - [27/Feb/2018 16:07:37] "GET /api/search/entity/diabetes?rows=1&start=1&category=gene HTTP/1.1" 200 -
The entity returned is MGI:99415 with category gene.
http://localhost:5000/api/search/entity/diabetes?rows=1&start=1
2018-02-27 16:09:05,565 - root - INFO - Using pre-loaded object: <ontobio.config.Config object at 0x7f98d6bac2e8>
2018-02-27 16:09:05,566 - root - INFO - PARAMS={'qt': 'standard', 'rows': 1, 'hl.simple.pre': '<em class="hilite">', 'facet.field': ['category', 'taxon_label'], 'hl': 'on', 'fq': ['category:"gene"'], 'hl.snippets': '1000', 'start': 1, 'facet.mincount': 1, 'facet.limit': 25, 'qf': ['iri_std^3', 'iri_kw^3', 'iri_eng^3', 'synonym_std^2', 'synonym_kw^2', 'synonym_eng^2', 'label_std^2', 'label_kw^2', 'label_eng^2', 'id_std^3', 'id_kw^3', 'id_eng^3', 'definition_std^2', 'definition_kw^2', 'definition_eng^2'], 'facet': 'on', 'fl': '*,score', 'defType': 'edismax', 'q': 'diabetes'}
2018-02-27 16:09:05,851 - root - INFO - Docs found: 44
2018-02-27 16:09:05,852 - werkzeug - INFO - 127.0.0.1 - - [27/Feb/2018 16:09:05] "GET /api/search/entity/diabetes?rows=1&start=1 HTTP/1.1" 200 -
The entity returned is MGI:99415 with category gene.
http://localhost:5000/api/search/entity/diabetes?rows=1&start=1&category=disease
2018-02-27 16:10:42,990 - root - INFO - Using pre-loaded object: <ontobio.config.Config object at 0x7f98d6bac2e8>
2018-02-27 16:10:42,991 - root - INFO - PARAMS={'qt': 'standard', 'rows': 1, 'hl.simple.pre': '<em class="hilite">', 'facet.field': ['category', 'taxon_label'], 'hl': 'on', 'fq': ['category:"disease"'], 'hl.snippets': '1000', 'start': 1, 'facet.mincount': 1, 'facet.limit': 25, 'qf': ['iri_std^3', 'iri_kw^3', 'iri_eng^3', 'synonym_std^2', 'synonym_kw^2', 'synonym_eng^2', 'label_std^2', 'label_kw^2', 'label_eng^2', 'id_std^3', 'id_kw^3', 'id_eng^3', 'definition_std^2', 'definition_kw^2', 'definition_eng^2'], 'facet': 'on', 'fl': '*,score', 'defType': 'edismax', 'q': 'diabetes'}
2018-02-27 16:10:43,400 - root - INFO - Docs found: 217
2018-02-27 16:10:43,406 - werkzeug - INFO - 127.0.0.1 - - [27/Feb/2018 16:10:43] "GET /api/search/entity/diabetes?rows=1&start=1&category=disease HTTP/1.1" 200 -
The entity returned is MONDO:0005148 with category disease.
http://localhost:5000/api/search/entity/diabetes?rows=1&start=1
2018-02-27 16:12:07,989 - root - INFO - Using pre-loaded object: <ontobio.config.Config object at 0x7f98d6bac2e8>
2018-02-27 16:12:07,989 - root - INFO - PARAMS={'qt': 'standard', 'rows': 1, 'hl.simple.pre': '<em class="hilite">', 'facet.field': ['category', 'taxon_label'], 'hl': 'on', 'fq': ['category:"disease"'], 'hl.snippets': '1000', 'start': 1, 'facet.mincount': 1, 'facet.limit': 25, 'qf': ['iri_std^3', 'iri_kw^3', 'iri_eng^3', 'synonym_std^2', 'synonym_kw^2', 'synonym_eng^2', 'label_std^2', 'label_kw^2', 'label_eng^2', 'id_std^3', 'id_kw^3', 'id_eng^3', 'definition_std^2', 'definition_kw^2', 'definition_eng^2'], 'facet': 'on', 'fl': '*,score', 'defType': 'edismax', 'q': 'diabetes'}
2018-02-27 16:12:13,232 - root - INFO - Docs found: 217
2018-02-27 16:12:13,235 - werkzeug - INFO - 127.0.0.1 - - [27/Feb/2018 16:12:13] "GET /api/search/entity/diabetes?rows=1&start=1 HTTP/1.1" 200 -
The entity returned is MONDO:0005148 with category disease.
Yeah, it looks like the filter query is persisting somehow.
This is odd, I'm stepping through the code in entitysearch.py:
@ns.route('/entity/<term>')
@api.doc(params={'term': 'search string, e.g. shh, parkinson, femur'})
class SearchEntities(Resource):

    @api.expect(simple_parser)
    #@api.marshal_list_with(search_result)
    def get(self, term):
        """
        Returns list of matching concepts or entities using lexical search
        """
        import pudb; pudb.set_trace()
        args = simple_parser.parse_args()
        q = GolrSearchQuery(term,
                            **args)
        results = q.exec()
        return results
PuDB output:
>>> args
{'rows': 1, 'category': None, 'start': 1}
>>> term
'diabetes'
But when stepping into the constructor, PuDB output:
>>> fq
{'category': ['disease']}
ok, this is at the ontobio level
>>> p = {'category':'disease'}
>>> q = GolrSearchQuery('diabetes', **p)
>>> results = q.exec()
>>> q.fq
{'category': 'disease'}
>>> p = {}
>>> q = GolrSearchQuery('diabetes', **p)
>>> results = q.exec()
>>> q.fq
{'category': 'disease'}
class GolrSearchQuery(GolrAbstractQuery):
    """
    Queries over a search document
    """
    def __init__(self,
                 term=None,
                 [snip]
                 fq={},
I assumed this makes a fresh empty dict each time but apparently not?
"Python’s default arguments are evaluated once when the function is defined, not each time the function is called (like it is in say, Ruby). This means that if you use a mutable default argument and mutate it, you will and have mutated that object for all future calls to the function as well."
http://docs.python-guide.org/en/latest/writing/gotchas/
I wouldn't have expected that!
😱
Today I Learned! 🐍
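For anyone landing here later, a minimal sketch of the gotcha and the usual fix. The two classes below are hypothetical stand-ins, not the real GolrSearchQuery, and I'm assuming the constructor writes the category into fq, which is what the PuDB output suggests but isn't shown in the snippet above.

```python
class LeakyQuery:
    """Hypothetical stand-in: fq={} is evaluated once, at definition time."""

    def __init__(self, term, category=None, fq={}):
        if category is not None:
            # Assumed behaviour: this mutates the single shared default dict.
            fq["category"] = category
        self.term = term
        self.fq = fq


class FixedQuery:
    """Same sketch using a None sentinel instead of a mutable default."""

    def __init__(self, term, category=None, fq=None):
        # Build a fresh dict per call; copy any caller-supplied dict too.
        fq = dict(fq) if fq is not None else {}
        if category is not None:
            fq["category"] = category
        self.term = term
        self.fq = fq


# A filtered query followed by an unfiltered one, as in the log above.
LeakyQuery("diabetes", category="disease")
leaky_fq = LeakyQuery("diabetes").fq   # still carries the earlier category

FixedQuery("diabetes", category="disease")
fixed_fq = FixedQuery("diabetes").fq   # starts out empty again
```

The copy in FixedQuery is a design choice on my part: it also keeps a dict passed in by the caller from being modified behind their back.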
The category filter is sticky, and I assume others are as well. Maybe data is being re-ordered each time a query is run? For example if you do these queries in this order:
https://api.monarchinitiative.org/api/search/entity/diabetes?rows=1&start=1
This returns a disease.

https://api.monarchinitiative.org/api/search/entity/diabetes?rows=1&start=1&category=gene
This appropriately returns a gene.

https://api.monarchinitiative.org/api/search/entity/diabetes?rows=1&start=1
This now returns a gene.

https://api.monarchinitiative.org/api/search/entity/diabetes?rows=1&start=1&category=disease
This returns a disease.

https://api.monarchinitiative.org/api/search/entity/diabetes?rows=1&start=1
This now returns a disease.

This seems like a bug to me. The same query parameters should return the same data.