Open nathan-guo opened 5 years ago
This operation is also very slow. How can I enable multiprocessing?
Thanks
It is slow, and not much can currently be done to make it faster. To be really fast, we would need a Tesseract with GPU support. The current CPU-based training could be made faster by using float instead of double, but I am afraid that @noahmetzger does not have the time to implement that.
The crash is most probably a known problem. Run ulimit -c unlimited before running the training. Then Linux will create a core dump for the segmentation fault, and you can examine that to get more information.
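As a rough sketch of that suggestion, the commands below raise the core dump limit for the current shell and, if a core file exists afterwards, print a backtrace from it. The core file name `core`, the use of gdb, and the choice of the tesseract binary are assumptions for illustration. On Ubuntu the kernel's core_pattern may hand core dumps to a crash handler instead of writing them to the working directory, so the file may end up somewhere else.

```shell
#!/usr/bin/env bash
# Allow core dumps in this shell. If "unlimited" is refused (hard limit is
# lower), fall back to the highest value the hard limit permits.
ulimit -c unlimited 2>/dev/null || ulimit -c "$(ulimit -H -c)"
echo "core size limit: $(ulimit -c)"

# Show where the kernel writes core files (Linux only).
if [ -r /proc/sys/kernel/core_pattern ]; then
  echo "core pattern: $(cat /proc/sys/kernel/core_pattern)"
fi

# Re-run the failing training command from this same shell here.

# Assumption: the core file is named "core" and sits in the working directory,
# and the crashing program is the tesseract binary found on PATH.
core_file="core"
bin="$(command -v tesseract || true)"
if [ -n "$bin" ] && [ -f "$core_file" ] && command -v gdb >/dev/null 2>&1; then
  # Print a backtrace of the crashed process without entering interactive mode.
  gdb -batch -ex bt "$bin" "$core_file"
fi
```

The limit only applies to the shell in which it was set and to its child processes, so the training has to be started from that same shell.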
Thank you so much.
@nathan-guo Is the problem solved?
The crash problem is still not solved, so it can occur.
We removed the training scripts from this repo.
What was the source of the crash? A bug in the bash script itself or in the Tesseract C++ code? If it was the latter, do we have another open issue for the same bug?
Ubuntu 18.04
tesseract 4.1.0
 leptonica-1.75.3
  libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 1.5.2) : libpng 1.6.34 : libtiff 4.0.9 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.3.0
 Found AVX512BW
 Found AVX512F
 Found AVX2
 Found AVX
 Found SSE
==================================================
Loaded 93944/93944 lines (1-93944) of document /tmp/chi_tra-2019-08-28.CSP/chi_tra.Microsoft_JhengHei.exp0.lstmf
src/training/tesstrain_utils.sh: line 72: 23573 Segmentation fault "${cmd}" "$@" 2>&1
     23574 Done | tee -a ${LOG_FILE}
ERROR: Program tesseract failed. Abort.
Thank you so much.