DanBloomberg / leptonica

Leptonica is an open source library containing software that is broadly useful for image processing and image analysis applications. The official github repository for Leptonica is: danbloomberg/leptonica. See leptonica.org for more documentation.
Other
1.76k stars 387 forks source link

Use atomic operations for reference counters #615

Closed stweil closed 2 years ago

stweil commented 2 years ago

Signed-off-by: Stefan Weil sw@weilnetz.de

DanBloomberg commented 2 years ago

I see you're no longer using atomic_fetch_add and atomic_fetch_sub. Were they not necessary for atomic operations on the ref counts?

stweil commented 2 years ago

Yes, it is sufficient to use an atomic data type for refcount. The following code snippet shows that both variants create the same assembler code which uses lock to be thread-safe:

#include <stdatomic.h>

atomic_int acnt;

int g_add(void)
{
    return acnt++;
}

int g_fetch(void)
{
    return atomic_fetch_add(&acnt, 1);
}

0000000000001260 <g_add>:
    1260:       b8 01 00 00 00          mov    $0x1,%eax
    1265:       f0 0f c1 05 e3 2d 00    lock xadd %eax,0x2de3(%rip)        # 4050 <acnt>
    126c:       00 
    126d:       c3                      retq   
    126e:       66 90                   xchg   %ax,%ax

0000000000001270 <g_fetch>:
    1270:       b8 01 00 00 00          mov    $0x1,%eax
    1275:       f0 0f c1 05 d3 2d 00    lock xadd %eax,0x2dd3(%rip)        # 4050 <acnt>
    127c:       00 
    127d:       c3                      retq   
    127e:       66 90                   xchg   %ax,%ax
DanBloomberg commented 2 years ago

Thank you for doing this, and also for patiently explaining/showing the details. It's a clever implementation in stdatomic.h that the simple use of the atomic_int type causes locking.

And your conditional typedef'ing for non-C11 compliant compilers (Visual Studio) is impressive.

DanBloomberg commented 2 years ago

Recompiled and ran alltests_reg against previous output. All 145 tests gave identical results.

stweil commented 2 years ago

Here is test code which shows that multithreaded code can fail with the previous code. This test would give a different result:

stweil@ocr-02:~$ cat atomic_test.c 
#include <stdio.h>
#include <threads.h>
#include <allheaders.h>

#define N 10000
#define THREADS 16

int f(void* thr_data)
{
    PIX *pix = thr_data;
    for(int n = 0; n < (N / THREADS); ++n) {
      PIX *pclone = pixClone(pix);
      pixDestroy(&pclone);
    }
    return 0;
}

int main(void)
{
    PIX *pix = pixCreate(100, 100, 8);
    thrd_t thr[THREADS];
    for(int n = 0; n < THREADS; ++n)
        thrd_create(&thr[n], f, pix);
    for(int n = 0; n < THREADS; ++n)
        thrd_join(thr[n], NULL);
    printf("refcout is %u\n", pix->refcount);
    pixDestroy(&pix);
    return 0;
}
stweil@ocr-02:~$ gcc -O2 -I/usr/include/leptonica atomic_test.c -llept  -lpthread
stweil@ocr-02:~$ ./a.out 
refcout is 5 (should be 1)
stweil@ocr-02:~$ ./a.out 
refcout is 2 (should be 1)
stweil@ocr-02:~$ ./a.out 
double free or corruption (!prev)
Aborted
stweil@ocr-02:~$ ./a.out 
refcout is 1 (should be 1)
stweil@ocr-02:~$ ./a.out 
refcout is 1 (should be 1)
stweil@ocr-02:~$ ./a.out 
refcout is 2 (should be 1)
stweil@ocr-02:~$ ./a.out 
refcout is 4 (should be 1)
DanBloomberg commented 2 years ago

This is a nice test thread-safe code. The bad results you show are for the previous version, without locking, and the new code doesn't fail, right?

stweil commented 2 years ago

That's correct. In many cases that test returns a correct refcount of 1 also with the old code (especially when only a few threads are used), but there always remains a certain risk. The code can even abort with a double free error.

The new code runs fine without those errors.

amitdo commented 1 year ago

Stefan, Dan

To support this feature with MSVC in C mode you can copy this file and use it in Leptonica:

https://github.com/videolan/dav1d/blob/8a4932ff035e2fd2b139caa73f6593c2a4754f67/include/compat/msvc/stdatomic.h

It has the same license as Leptonica (BSD 2-Clause).

stweil commented 1 year ago

Are there a lot of users of Leptonica builds with MSVC? And do they run it in applications with many threads? If not, I think we can simply ignore the missing atomic support for that special case and hope that Microsoft will implement it later.

amitdo commented 1 year ago

https://abseil.io/docs/cpp/atomic_danger

They're talking about C++ but maybe it's also relevant to C.

amitdo commented 1 year ago

and hope that Microsoft will implement it later.

https://developercommunity.visualstudio.com/t/Add-stdatomich-support-for-C11/1204057

amitdo commented 1 year ago

https://devblogs.microsoft.com/cppblog/c11-atomics-in-visual-studio-2022-version-17-5-preview-2/

DanBloomberg commented 1 year ago

Microsoft documentation says that Microsoft has more limitations on atomics with C++ than with C.

amitdo commented 1 year ago

They support C11/C17, so people can compile using the C compiler instead of the C++ compiler.

amitdo commented 1 year ago

BTW, I don't see that CMake or SW are activating /std:c11 or /std:c17.

https://learn.microsoft.com/en-us/cpp/build/reference/std-specify-language-standard-version?view=msvc-170

Also, why is CXX is added to LANGUAGES in project() in CMake?

@zdenop, @egorpugin

zdenop commented 1 year ago

It is there from beginning. AFAIR once I remove it locally and it caused some problems/errors - I do not remember details (=> needs to be tested, not just removed).

egorpugin commented 1 year ago

Also, why is CXX is added to LANGUAGES in project() in CMake?

For some reason leptonica was not able to build in C mode in VS.

BTW, I don't see that CMake or SW are activating /std:c11 or /std:c17.

I did not implement proper C modes for VS yet.

amitdo commented 1 year ago

@zdenop,

Please try to add /std:c17, remove CXX, build with MSVC and see if it works.

stweil commented 1 year ago

Ideally there would be a GitHub action which builds with MSVC.

zdenop commented 1 year ago

See https://github.com/DanBloomberg/leptonica/pull/697. GA Workflow added