CERT-Polska / mquery

YARA malware query accelerator (web frontend)
GNU Affero General Public License v3.0

Better handling of the `dataset not found` error #204

Open msm-code opened 4 years ago

msm-code commented 4 years ago

Environment information

Reproduction Steps

Start database compacting. Run a query at just the right moment (ideally a long-running query).

Expected behaviour

The query completes, possibly without testing all datasets if some of them were compacted in the meantime (this should be counted as an error somewhere).

Actual behaviour (the bug)

Query ends with a failed status, without returning any results.

```
[28/05/2020 18:02:49][ERROR] Failed to execute task.
Traceback (most recent call last):
  File "/app/daemon.py", line 311, in __process_task
    self.__search_task(job)
  File "/app/daemon.py", line 99, in __search_task
    raise RuntimeError(result["error"])
RuntimeError: ursadb failed: Invalid dataset specified in query
```
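
For context, the failing path in daemon.py presumably looks roughly like this hedged reconstruction from the traceback (only `__search_task` and `result["error"]` appear in the traceback; everything else is an illustrative guess, not the actual mquery source):

```python
# Hedged reconstruction of the failing path in daemon.py, inferred from
# the traceback above. Only __search_task and result["error"] come from
# the traceback; the rest is an illustrative guess.
def __search_task(self, job):
    for dataset in job.datasets:  # hypothetical iteration over datasets
        result = self.ursa.query(dataset, job.query)
        if "error" in result:
            # Any ursadb error is fatal: the whole job aborts, even when
            # the error only means one dataset was compacted away.
            raise RuntimeError(result["error"])
```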
msm-code commented 1 year ago

Moving to v1.4.0 since it's low priority (not easy to trigger with normal usage).

msm-code commented 1 year ago

Still thinking about it (maybe it's not as easy as I thought).

There are a few options:

- It's easy to cancel the whole processing (as we're doing now).
- It's easy to ignore this error.
- But we should probably continue processing the query and, at the same time, let the user know that some of the processed files are no longer available?

The last option does not seem easy, because we don't support "non-critical errors". The closest mechanism would be to increment the number of failed files, but we don't know how many files were in the dataset that just vanished (see the sketch below).
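
For illustration, a minimal sketch of such a "non-critical error" path, assuming a hypothetical loop over datasets (all names here are illustrative, none come from the actual mquery codebase):

```python
# Hypothetical sketch: treat a vanished dataset as a non-critical error
# and keep processing the remaining datasets. All names are
# illustrative; none come from the actual mquery codebase.
def search_all_datasets(ursa, datasets, query):
    matches = []
    warnings = []
    for dataset in datasets:
        response = ursa.query(dataset, query)
        if response.get("error") == "Invalid dataset specified in query":
            # The dataset was compacted away mid-query: record a warning
            # for the user and move on instead of failing the whole job.
            warnings.append(f"dataset {dataset} disappeared during the query")
            continue
        if "error" in response:
            # Other ursadb errors remain fatal.
            raise RuntimeError("ursadb failed: " + response["error"])
        matches.extend(response["files"])
    return matches, warnings
```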

My current idea: pass the dataset size to `query_ursadb`, and increase the number of failed files when the dataset is missing.
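
A minimal sketch of that idea, assuming a hypothetical `stats` object with a failed-files counter (names are illustrative, not the actual mquery API):

```python
# Hypothetical sketch of the idea above: query_ursadb receives the
# dataset's file count up front, and a vanished dataset is accounted
# for by bumping the failed-files counter by that count.
def query_ursadb(ursa, dataset, dataset_size, query, stats):
    response = ursa.query(dataset, query)
    if response.get("error") == "Invalid dataset specified in query":
        # We can't know how many of the dataset's files would have
        # matched, so conservatively count all of them as failed.
        stats.failed_files += dataset_size
        return []
    if "error" in response:
        raise RuntimeError("ursadb failed: " + response["error"])
    return response["files"]
```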