icgc-argo / song-search

Song Search - GQL microservice for searching maestro generated song indexes
GNU General Public License v3.0

🐛 Elasticsearch is complaining about queries trying to sort on `updated_at` even though analysis_centric mapping has this field #65

Closed jaserud closed 2 years ago

jaserud commented 2 years ago

Describe the bug

Steps To Reproduce

When donor-submission-aggregator tries to scrape multiple analyses with its complex queries, Elasticsearch is flooded with errors like this one:

{"type": "server", "timestamp": "2021-12-09T15:53:22,386Z", "level": "DEBUG", "component": "o.e.a.s.TransportSearchAction", "cluster.name": "workflow", "node.name": "workflow-es-data-0", "message": "[graphlog_error_warning][3], node[AghrzU6WQDO-rZTrDgqkWQ], [R], s[STARTED], a[id=KvwTXjrqTuuU6n3M68OEIg]: Failed to execute [SearchRequest{searchType=QUERY_THEN_FETCH, indices=[analysis_centric, analysis_centric_1.1, analysis_centric_1.2, file_centric, file_centric_1.1, file_centric_1.2, graphlog_error_warning, graphlog_info_debug, task, task-20200923, workflow, workflow-20200923], indicesOptions=IndicesOptions[ignore_unavailable=false, allow_no_indices=true, expand_wildcards_open=true, expand_wildcards_closed=false, allow_aliases_to_multiple_indices=true, forbid_closed_indices=true, ignore_aliases=false, ignore_throttled=true], types=[], routing='null', preference='null', requestCache=null, scroll=null, maxConcurrentShardRequests=0, batchedReduceSize=512, preFilterShardSize=42, allowPartialSearchResults=true, localClusterAlias=null, getOrCreateAbsoluteStartMillis=-1, ccsMinimizeRoundtrips=true, source={\"query\":{\"bool\":{\"must\":[{\"term\":{\"analysis_id\":{\"value\":\"39c3f9ca-7aca-429d-83f9-ca7aca429d5b\",\"boost\":1.0}}}],\"adjust_pure_negative\":true,\"boost\":1.0}},\"sort\":[{\"updated_at\":{\"order\":\"asc\"}}],\"track_total_hits\":2147483647}}]", "cluster.uuid": "6nSVxZP7SiG5swH9g0AFjA", "node.id": "pn08RDXIQn6NGrdImk-mkw" ,

Look closely and we can see that the query is sent to all of the indices, including graphlog_error_warning, the index whose shard reports the failure above:

SearchRequest{searchType=QUERY_THEN_FETCH, indices=[analysis_centric, analysis_centric_1.1, analysis_centric_1.2, file_centric, file_centric_1.1, file_centric_1.2, graphlog_error_warning, graphlog_info_debug, task, task-20200923, workflow, workflow-20200923]...

The bug here is that the multiple search queries aren't specifying the index name: https://github.com/icgc-argo/song-search/blob/93df481e42f505e5fcf4a8a2bf6e0671d2dce156/src/main/java/bio/overture/songsearch/repository/AnalysisRepository.java#L198

This is how it works for single search queries: https://github.com/icgc-argo/song-search/blob/93df481e42f505e5fcf4a8a2bf6e0671d2dce156/src/main/java/bio/overture/songsearch/repository/AnalysisRepository.java#L190
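
To illustrate the difference at the wire level, here is a minimal sketch (not the song-search code) of a multi-search body in which every search names its index. It relies on the Elasticsearch `_msearch` format, where each search is a header line followed by a query line; a header without `index`, sent to `/_msearch` with no target in the path, is not restricted to any index. The class `MsearchBodySketch` and its methods are hypothetical names made up for this example. With the Java high-level REST client, the equivalent would presumably be passing the index name to each `SearchRequest` that is added to the `MultiSearchRequest`.

```java
import java.util.Arrays;
import java.util.List;

/**
 * Hypothetical helper, not part of song-search: builds the NDJSON body for
 * Elasticsearch's _msearch endpoint with an explicit index on every header line.
 */
final class MsearchBodySketch {

    // Escapes backslashes and double quotes so the value can sit inside a JSON string.
    static String jsonString(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // One search is two lines: a header naming the index, then the query itself.
    static String searchLines(String index, String analysisId) {
        String header = "{\"index\":" + jsonString(index) + "}";
        String query = "{\"query\":{\"term\":{\"analysis_id\":{\"value\":"
                + jsonString(analysisId)
                + "}}},\"sort\":[{\"updated_at\":{\"order\":\"asc\"}}]}";
        return header + "\n" + query + "\n";
    }

    // Refuses to build a body without an index, so no search can fan out to every index.
    static String buildBody(String index, List<String> analysisIds) {
        if (index == null || index.trim().isEmpty()) {
            throw new IllegalArgumentException("index must be set for every search");
        }
        StringBuilder ndjson = new StringBuilder();
        for (String analysisId : analysisIds) {
            ndjson.append(searchLines(index, analysisId));
        }
        return ndjson.toString();
    }

    public static void main(String[] args) {
        // The analysis ID is the one from the log above.
        System.out.print(buildBody(
                "analysis_centric",
                Arrays.asList("39c3f9ca-7aca-429d-83f9-ca7aca429d5b")));
    }
}
```

Rejecting a blank index up front is a design choice for this sketch: a missing index then fails at the caller instead of surfacing as shard errors on unrelated indices.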

Expected behaviour

This should not be happening, because song-search should only be querying analysis_centric or file_centric.

jaserud commented 2 years ago

This is fixed in 2.7.0, which has already been released to prod.