alex-ong / NESTrisOCR

OCR for statistics in NESTris

Fix multiprocessing for MacOS #12

Closed alex-ong closed 4 years ago

alex-ong commented 4 years ago

On macOS 10.14 and later, using multiprocessing can cause the program to crash, sometimes silently.

As a workaround, you can set MULTI_THREAD = 1, though this has obvious performance implications.

Two things I've attempted are: 1) multiprocessing.set_start_method() and 2) the environment variable OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES.
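For reference, the first attempt could look roughly like the sketch below. It is not the project's actual code: `pick_start_method` and `make_pool` are hypothetical helpers, and whether forcing `spawn` resolves the crash is not confirmed in this thread.

```python
import multiprocessing
import sys


def pick_start_method(platform: str = sys.platform) -> str:
    """Return the start method to request for the given platform string."""
    # Assumption: only macOS ("darwin") needs an explicit override here.
    if platform == "darwin":
        return "spawn"
    # Otherwise keep whatever this interpreter lists first, which the
    # multiprocessing docs describe as the default.
    return multiprocessing.get_all_start_methods()[0]


def make_pool(processes: int, platform: str = sys.platform):
    """Create a worker pool from a context bound to the chosen start method."""
    # get_context() scopes the choice to this pool instead of changing the
    # process-wide default the way multiprocessing.set_start_method() does.
    ctx = multiprocessing.get_context(pick_start_method(platform))
    return ctx.Pool(processes=processes)
```

Any code that calls `make_pool` should run under an `if __name__ == "__main__":` guard, since `spawn` re-imports the main module in each worker.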

Apparently using numpy with multiprocessing can also cause issues.

alex-ong commented 4 years ago

As a band-aid, the calibrator, which uses 000000/000/00 for Score/Lines/Level, could be run in a single thread by checking os != win32.
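A minimal sketch of that band-aid, assuming the check is done against `sys.platform`. The function name `calibrator_thread_count` is made up for illustration and does not exist in the repository.

```python
import sys


def calibrator_thread_count(requested: int, platform: str = sys.platform) -> int:
    """Return the worker count the calibrator should use."""
    # Mirrors the "os != win32" check: anything that is not Windows
    # falls back to a single thread.
    if platform != "win32":
        return 1
    # On Windows, honour the requested count but never go below one.
    return max(1, requested)
```

Passing the platform as a parameter keeps the check testable on any machine; in real use the default `sys.platform` would apply.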

alex-ong commented 4 years ago

There might be a solution in sight. As of Python 3.8, on macOS the spawn start method is used by default in multiprocessing.
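One way to see which default a given interpreter uses is sketched below. It relies on `multiprocessing.get_all_start_methods()`, which the Python documentation describes as listing the default method first. The helper names are hypothetical.

```python
import multiprocessing
import sys


def default_start_method() -> str:
    """Return the start method this interpreter uses when none is set."""
    # Per the multiprocessing docs, the first entry is the default.
    return multiprocessing.get_all_start_methods()[0]


def describe() -> str:
    """Build a one-line summary suitable for a bug report."""
    version = f"{sys.version_info.major}.{sys.version_info.minor}"
    return (
        f"Python {version} on {sys.platform}: "
        f"default start method is {default_start_method()!r}"
    )
```

Printing `describe()` on the affected Mac would show whether the interpreter in use already defaults to `spawn`.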

alex-ong commented 4 years ago

We could potentially remove multithreading now.

Performance numbers (Ryzen 2700X). Scan: numbers, line, field, preview, field_stat

DIRECT_CAPTURE
Single thread: 7-12ms
4 threads: 2-5ms

WINDOW_N_SLICE
Single thread: 4-8ms
4 threads: 16-21ms

Note that WINDOW_N_SLICE brings the benefit of no race conditions, so single-thread WINDOW_N_SLICE seems like a reasonable compromise. We should still test on some weak CPUs though.

Multithreading and WINDOW_N_SLICE are incompatible because pickling images is far too expensive.
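The cost comes from multiprocessing serializing every argument sent to a worker, so each image is pickled, copied, and unpickled. A rough way to gauge that overhead is sketched below. It uses a zero-filled `bytes` buffer as a stand-in for a captured frame, and the 640x480 size in the usage note is an arbitrary assumption, not a figure from the project.

```python
import pickle
import time


def pickle_round_trip(width: int, height: int, channels: int = 3):
    """Pickle and unpickle a dummy frame; return (payload_bytes, seconds)."""
    # Stand-in for a captured image: raw zeroed pixel data.
    frame = bytes(width * height * channels)
    start = time.perf_counter()
    payload = pickle.dumps(frame, protocol=pickle.HIGHEST_PROTOCOL)
    restored = pickle.loads(payload)
    elapsed = time.perf_counter() - start
    # Sanity check that the round trip preserved the data.
    assert restored == frame
    return len(payload), elapsed
```

Calling `pickle_round_trip(640, 480)` and comparing the elapsed time with the single-thread scan times above gives a feel for how much of the frame budget serialization alone would consume per worker hand-off.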

For now I will ensure that if WINDOW_N_SLICE is picked, MULTI_THREAD is forced to 1.
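That guard could be as small as the sketch below. `resolve_thread_count` and its `capture_method` parameter are hypothetical names, and representing the capture method as a string is an assumption about how the config is stored.

```python
def resolve_thread_count(capture_method: str, multi_thread: int) -> int:
    """Return the thread count to use for the given capture method."""
    # WINDOW_N_SLICE would require pickling images to workers, so it is
    # always run single-threaded regardless of the configured value.
    if capture_method == "WINDOW_N_SLICE":
        return 1
    # Other methods keep the configured value, with a floor of one.
    return max(1, multi_thread)
```

Resolving the value once at startup means the rest of the code can read a single effective thread count instead of re-checking the capture method.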

alex-ong commented 4 years ago

Multithreading will be deprecated soon, since single-thread performance is more than sufficient with the current optimizations.