Closed philbudne closed 1 year ago
@philbudne I think you mentioned in a meeting that this is not actually a problem, so we don't need to change this constant. Did I understand correctly, and if so, can we close this?
Yes, I think it's fine as-is for the moment.
Only time will tell how best to balance things, especially if the queues are backed up, or we start to backfill historical data.
As previously posted (on slack):
It looks like the (single) parser worker process on tarbell is using (at times) up to 1855% CPU, with the load average (unsurprisingly) between 16 and 19. [tarbell has 32 cores]
capture from "top" displaying threads:
which shows 25 threads, presumably working on a single story?
I suspect this is trafilatura calling py3langid, which uses numpy (the dot operator?), which in turn uses OpenBLAS (an open implementation of the Basic Linear Algebra Subprograms), which is multi-threaded.

If this is the case, setting

OPENBLAS_NUM_THREADS=n

in the parser worker's environment would control how many threads are launched. As long as there are enough CPU cores available (for the parser and anything else running on the same server), this isn't necessarily a problem. Limiting the number of threads should make each "parse" take longer, and, if lowered to 1, would almost certainly leave cores idle.

Our CPUs may have hardware thread support (SMT/Hyper-Threading), which may be disabled by default, since it enables several of the MANY known exploits that can leak data between processes. It might be worth investigating whether enabling SMT/Hyper-Threading, and disabling other kernel mitigations for data-leak exploits, might yield any benefit (since we're unlikely to be worried about data leakage exploits).
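One wrinkle worth noting (a sketch, assuming the worker is a plain Python process that imports numpy): OpenBLAS reads OPENBLAS_NUM_THREADS when it is first loaded, so if we set it from inside the worker rather than in its environment, it has to happen before the first numpy import anywhere in the process:

```python
import os

# Must be set before OpenBLAS is first loaded (i.e. before the first
# "import numpy" anywhere in this process); setting it later has no effect.
# n=1 here is purely illustrative.
os.environ.setdefault("OPENBLAS_NUM_THREADS", "1")

import numpy  # OpenBLAS thread pool is now capped at one thread
```

Setting it in the worker's environment (e.g. in the service/supervisor config) avoids the ordering question entirely.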
Looking at ramos: it seems to have two CPU packages, each with 16 cores, each core with two threads, AND it has 64 "processor" entries in /proc/cpuinfo, so SMT may be enabled.