veraPDF / veraPDF-library

Industry supported, open source PDF/A validation library
http://verapdf.org/software
GNU General Public License v3.0

JavaScriptEvaluator.getTestEvalResult blocks threads during concurrent validation #1408

Closed: SalomScala closed this issue 5 months ago

SalomScala commented 8 months ago

In our scenario we parallelize validations across 20 threads in a single JVM. We noticed a major bottleneck in the JavaScriptEvaluator.getTestEvalResult method: it is synchronized, so only one thread at a time can evaluate a result.
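The contention pattern can be reduced to a minimal sketch. This is not the veraPDF code: the class name, the `evaluate` method, and its placeholder body are hypothetical stand-ins for a synchronized evaluation method. The sketch only shows that, however many threads call such a method, at most one is ever inside it.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class SynchronizedBottleneckSketch {
    // Number of threads currently inside evaluate(), and the highest value seen.
    private static final AtomicInteger inside = new AtomicInteger();
    private static final AtomicInteger maxInside = new AtomicInteger();

    // Hypothetical stand-in for a synchronized evaluation method.
    // The class-level lock lets only one thread run the body at a time.
    static synchronized boolean evaluate(int value) {
        int now = inside.incrementAndGet();
        maxInside.accumulateAndGet(now, Math::max);
        boolean result = value % 2 == 0; // placeholder for the real evaluation work
        inside.decrementAndGet();
        return result;
    }

    // Calls evaluate() from several threads and reports the peak concurrency observed.
    static int maxConcurrentEvaluations(int threads, int callsPerThread)
            throws InterruptedException {
        inside.set(0);
        maxInside.set(0);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                for (int i = 0; i < callsPerThread; i++) {
                    evaluate(i);
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return maxInside.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // With 20 worker threads, the peak concurrency inside evaluate() is still 1.
        System.out.println("max concurrent evaluations: " + maxConcurrentEvaluations(20, 1000));
    }
}
```

The other 19 threads spend their time waiting for the lock, so adding threads does not add evaluation throughput.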

To show the problem I have attached

I have tried to fix it locally by adjusting JavaScriptEvaluator to use ThreadLocals. I got the following results:

Test scenario:

Result before optimization:

Result after optimization:

This is a validation speed improvement of 325% in the concurrent case with 20 threads.

As far as I understand the JavaScriptEvaluator code, the change should be OK. All test cases pass, and the Rhino JavaScript engine can execute scripts in parallel.
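The general idea behind the ThreadLocal approach can be sketched as follows. This is an illustration under stated assumptions, not the code from the pull request: the class name, the per-thread cache, and the `evaluate` placeholder are all hypothetical, and no JavaScript engine is involved. It only shows that giving each thread its own mutable state removes the need for a shared lock.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ThreadLocalEvaluatorSketch {
    // Each thread lazily gets its own HashMap, so the non-thread-safe map
    // is never shared and evaluate() needs no synchronization.
    private static final ThreadLocal<Map<String, Integer>> CACHE =
            ThreadLocal.withInitial(HashMap::new);

    // Hypothetical stand-in for evaluating a test expression:
    // returns true when the expression has an even number of characters.
    static boolean evaluate(String expression) {
        int length = CACHE.get().computeIfAbsent(expression, String::length);
        return length % 2 == 0;
    }

    // Runs one task per thread over the same expressions and sums the true results.
    static int countTrueResults(int threads, List<String> expressions) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                Callable<Integer> task = () -> {
                    int hits = 0;
                    for (String e : expressions) {
                        if (evaluate(e)) {
                            hits++;
                        }
                    }
                    return hits;
                };
                futures.add(pool.submit(task));
            }
            int total = 0;
            for (Future<Integer> f : futures) {
                total += f.get();
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> expressions = Arrays.asList("ab", "abc", "abcd");
        // Two of the three expressions have even length, so 20 tasks yield 40.
        System.out.println(countTrueResults(20, expressions));
    }
}
```

The trade-off is memory: every worker thread holds its own copy of the state, and in a long-lived thread pool that state lives as long as the thread does unless it is removed explicitly.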

But maybe you have further insight into why the synchronization was used? It was introduced in https://github.com/veraPDF/veraPDF-library/pull/960 because multithreading was not possible before. Now I would like to improve multithreaded performance.

I have created pull request #1409

I welcome any feedback. Thanks.

bdoubrov commented 8 months ago

Thanks a lot, @SalomScala! Looks like a great optimization indeed! We enabled thread safety in veraPDF only recently, and there are still places that need improvement. We'll review and merge your PR if all goes well.

MaximPlusov commented 5 months ago

Included in release 1.26