els0r / goProbe

High-performance IP packet metadata aggregation and efficient storage and querying of flows
GNU General Public License v2.0

goQuery does not (reliably) respond to Ctrl-C #273

Closed by fako1024 8 months ago

fako1024 commented 8 months ago

It seems that goQuery doesn't honor (or at least doesn't always honor) a Ctrl-C (or a SIGTERM, for that matter). Only a SIGKILL seems to ensure its demise. It would be better (and expected) if any running query could be interrupted cleanly at any point in time.

fako1024 commented 8 months ago

It seems that the signal is seen / processed (added some debug output):

    ./goQuery -d /tmp/db_out2_old  -n 10 -f -10000d -i any sip,dip,dport,proto
    ^Cgot cancel in workload
    got cancel in workload
    got cancel in workload
    got cancel in workload
    Error running query: failed to print query result: query cancelled before fully filled. 0/10 rows processed

I think I know what's going on: the context cancellation is only checked at the top of each worker's loop, so a worker has to finish an individual directory (a.k.a. a day) before the cancellation is observed. Usually this is enough, but in cases where there's a lot of data (which are unfortunately also the cases where you might have to / want to cancel the query) it might just be too slow... :sweat:

fako1024 commented 8 months ago

Ah, now I get it: this is an unwanted side effect of the bulking we / I added recently:

    // WorkBulkSize denotes the per-worker bulk size (number of GPDirs processed before
    // transmitting the resulting map for further reduction / aggregation)
    WorkBulkSize = 32

So the problem isn't per se that cancellation is only evaluated at the top level; it's that there is now an additional layer of looping in between, amplifying the time it takes to cancel by a factor of up to 32 (because each worker might process up to 32 directories before it observes the cancellation). That's an easy fix...