tesseract-ocr / tesstrain

Train Tesseract LSTM with make
Apache License 2.0
604 stars 178 forks source link

Disable OpenMP #259

Open bertsky opened 3 years ago

bertsky commented 3 years ago

By default tesstrain builds vanilla tesseract / lstmtraining, which IINM links against OpenMP.

I know @stweil argued repeatedly for disabling OpenMP for prediction in the mass production / batch scenario, e.g. here.

However, the case for training seems to be a different one: normally we want to get a single model on a lot of data as fast as possible.

In this comment @theraysmith presented measurements showing 3.5x speedup with 4 threads.

However, I cannot reproduce that. On the contrary: On a VM with (4 cores of) Intel Xeon Gold 5218 CPU @ 2.30GHz and a finetuning job with 500 lines, I see an 8x increase, despite using more than 300% CPU instead of just 100% when single-threaded. (Yes, I do get similar results in both cases.) That's a 24x worse utilization of operational resources!

(I have repeated that experiment on a finetuning job with 1200 lines, where multithreaded takes 2x as long.)

Also, there seems to be a significant difference between lstmtraining built with --disable-openmp on the one side and lstmtraining built with OpenMP, but run with OMP_THREAD_LIMIT=1: the latter is even worse than OpenMP running unconstrained. Also, it still takes more than one core (alternating between 100% CPU utilization mostly and 200% intermittently).

Can anyone confirm this? Has something changed significantly in Tesseract's threaded code base since Ray's time, or is this simply due to my virtualization environment?

Moreover, @stweil has already pointed out OpenMP prevents reproducible models. For all the effort that has already gone into the latter (replacing shuf etc, sorting files), why is OpenMP still in then?

Why don't we build with --disable-openmp by default, or at least set OMP_THREAD_LIMIT=1 in the lstmtraining recipes?

stweil commented 3 years ago

Can anyone confirm this?

Yes, I get similar results: OpenMP uses a lot of resources and increases the time required for training.

Note that training currently requires up to 4 threads even when Tesseract was built without OpenMP:

stweil commented 3 years ago

Why don't we build with --disable-openmp by default?

Because nobody changed that? I wonder whether we should remove all build instructions from the Makefile.

zdenop commented 3 years ago

cmake build disabled openmp by default for a long time...

bertsky commented 3 years ago

Note that training currently requires up to 4 threads even when Tesseract was built without OpenMP:

  • training (running continuously)
  • asynchronous loading of lstmf files for training (one thread per file, only during first epoch until all files are in memory)
  • evaluation (running temporarily)
  • asynchronous loading of lstmf files for evaluation (one thread per file, only until all files are in memory)

Interesting! So that's why I get occasional 200% CPU utilization.

Because nobody changed that? I wonder whether we should remove all build instructions from the Makefile.

I see. Yes, probably better to have all that in Tesseract's makefile only.

I'm not autotools afluent, so how do we change this?

https://github.com/tesseract-ocr/tesseract/blob/7a308edcb1fc7455008b531bc2a49de583d7b171/configure.ac#L227

cmake build disabled openmp by default for a long time...

Indeed, git annotate says this was from 2yrs ago, so I guess the respective statement in the release notes Jul 07 2019 - V4.1.0, stating OpenMP has been disabled by default, only applied to that build variant. Let's have autoconf catch up then!

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

amitdo commented 2 years ago

so I guess the respective statement in the release notes Jul 07 2019 - V4.1.0, stating OpenMP has been disabled by default, only applied to that build variant.

I updated the 4.1.0 release notes.

Disable OpenMP support by default. This was done in the the CMake build, but not in the Autotools build, where the OpenMP is still enabled by default

stweil commented 2 years ago

The Autotools build in release 5.0.0 still enabled OpenMP by default. Should that be changed? Is this a "bug fix" which can be done in release 5.0.1 (and maybe also in release 4.1.4)?