CERT-Polska / mquery

YARA malware query accelerator (web frontend)
GNU Affero General Public License v3.0

Better handling of the `dataset not found` error #204

Open msm-code opened 4 years ago

msm-code commented 4 years ago

Environment information

Reproduction Steps

Start database compacting. Run a query at just the right moment (ideally a long-running query).

Expected behaviour

The query completes, possibly without testing all datasets if some of them were compacted in the meantime (this should be counted as an error somewhere).

Actual behaviour (the bug)

Query ends with a failed status, without returning any results.

```
[28/05/2020 18:02:49][ERROR] Failed to execute task.
Traceback (most recent call last):
  File "/app/daemon.py", line 311, in __process_task
    self.__search_task(job)
  File "/app/daemon.py", line 99, in __search_task
    raise RuntimeError(result["error"])
RuntimeError: ursadb failed: Invalid dataset specified in query
```
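
For context, the failing path in daemon.py presumably looks roughly like this hedged reconstruction from the traceback (only `__search_task` and `result["error"]` appear in the traceback; everything else is an illustrative guess, not the actual mquery source):

```python
# Hedged reconstruction of the failing path in daemon.py, inferred from
# the traceback above. Only __search_task and result["error"] come from
# the traceback; the rest is an illustrative guess.
def __search_task(self, job):
    for dataset in job.datasets:  # hypothetical iteration over datasets
        result = self.ursa.query(dataset, job.query)
        if "error" in result:
            # Any ursadb error is fatal: the whole job aborts, even when
            # the error only means one dataset was compacted away.
            raise RuntimeError(result["error"])
```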
msm-code commented 1 year ago

Moving to v1.4.0 since it's low priority (not easy to trigger with normal usage).

msm-code commented 1 year ago

Still thinking about it (maybe it's not as easy as I thought).

There are a few options:

- It's easy to cancel the whole processing (as we're doing now).
- It's easy to ignore this error.
- But we should probably continue processing the query and, at the same time, let the user know that some of the processed files are no longer available?

The last option does not seem easy, because we don't support "non-critical errors". The closest mechanism would be to increment the number of failed files, but we don't know how many files were in the dataset that just vanished (see the sketch below).
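
For illustration, a minimal sketch of such a "non-critical error" path, assuming a hypothetical loop over datasets (all names here are illustrative, none come from the actual mquery codebase):

```python
# Hypothetical sketch: treat a vanished dataset as a non-critical error
# and keep processing the remaining datasets. All names are
# illustrative; none come from the actual mquery codebase.
def search_all_datasets(ursa, datasets, query):
    matches = []
    warnings = []
    for dataset in datasets:
        response = ursa.query(dataset, query)
        if response.get("error") == "Invalid dataset specified in query":
            # The dataset was compacted away mid-query: record a warning
            # for the user and move on instead of failing the whole job.
            warnings.append(f"dataset {dataset} disappeared during the query")
            continue
        if "error" in response:
            # Other ursadb errors remain fatal.
            raise RuntimeError("ursadb failed: " + response["error"])
        matches.extend(response["files"])
    return matches, warnings
```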

My current idea: pass the dataset size to `query_ursadb`, and increase the number of failed files when the dataset is missing.
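
A minimal sketch of that idea, assuming a hypothetical `stats` object with a failed-files counter (names are illustrative, not the actual mquery API):

```python
# Hypothetical sketch of the idea above: query_ursadb receives the
# dataset's file count up front, and a vanished dataset is accounted
# for by bumping the failed-files counter by that count.
def query_ursadb(ursa, dataset, dataset_size, query, stats):
    response = ursa.query(dataset, query)
    if response.get("error") == "Invalid dataset specified in query":
        # We can't know how many of the dataset's files would have
        # matched, so conservatively count all of them as failed.
        stats.failed_files += dataset_size
        return []
    if "error" in response:
        raise RuntimeError("ursadb failed: " + response["error"])
    return response["files"]
```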