Open ryandesign opened 7 months ago
The problem with PCRE2 on compilers that don't support C11 threads happens only in multi-threaded programs that use the LG library from different threads simultaneously. I think most users never do that, and since PCRE2 is faster and better than the alternative libraries, it should remain the default.
If desired, configure can warn that multi-threaded use of the library is not supported with PCRE2 on such systems.
Hi @ryandesign -- if you will allow me, I'd like to provide some deep history and general, opinionated twitter-thread commentary ...
pthreads, aka POSIX Threads, dates back to about 1991. When it appeared, it was an important step forward, and was a technical masterpiece. Not without controversy: there were many opinions and arguments, with "green threads" being the most virulent and longest lasting.
The API was full, complete, well-designed and worked well. Most or "almost all" or literally "all"(?) other threading packages are built as a layer on top of POSIX threads. That's because it's fast, optimized, and does everything. There's no mis-designed bulging crapola in there.
However, there are some common patterns of "things people typically do", and those are not in pthreads. They were common and predictable and regular enough that they became a part of the C and C++ language standards.
call_once is typical: make sure you call some function once and only once, no matter what. In C11 it's a one-liner. The pthreads variant would be 10-20 lines, with a good chance that the programmer introduces a hard-to-find, subtle bug. The pseudocode for it would be something like:
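#include <pthread.h>
#include <stdbool.h>

static bool did_we_run_yet = false;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

void func_to_call_once(void) {
    pthread_mutex_lock(&lock);
    if (did_we_run_yet) {
        /* Release the lock before the early return, or later callers deadlock. */
        pthread_mutex_unlock(&lock);
        return;
    }
    did_we_run_yet = true;
    pthread_mutex_unlock(&lock);
    /* rest of subr; note that other callers can return before this part finishes */
}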
The above could be added, with an #ifdef __APPLE__ around it. But damned if I'll add it: I've written this code snippet maybe 10 or 20 times in my life and I really really have no desire to reinvent it, again.
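For reference, POSIX does have a once-only primitive, pthread_once, so a fallback for platforms without <threads.h> can stay short. What follows is only a minimal sketch, not code from link-grammar: it assumes a compiler that supports __has_include, and the names HAVE_C11_THREADS, RUN_ONCE, init_tables, ensure_initialized and run_workers are all made up for illustration.

```c
#include <assert.h>
#include <pthread.h>

/* Detect <threads.h>. HAVE_C11_THREADS is a hypothetical macro name. */
#if defined(__has_include)
#  if __has_include(<threads.h>)
#    define HAVE_C11_THREADS 1
#  endif
#endif

#ifdef HAVE_C11_THREADS
#  include <threads.h>
static once_flag init_flag = ONCE_FLAG_INIT;
#  define RUN_ONCE(fn) call_once(&init_flag, (fn))
#else
/* Fallback (e.g. macOS): pthread_once gives the same guarantee. */
static pthread_once_t init_flag = PTHREAD_ONCE_INIT;
#  define RUN_ONCE(fn) ((void)pthread_once(&init_flag, (fn)))
#endif

#define MAX_WORKERS 16

static int init_count = 0;  /* how many times the initializer actually ran */

/* Stand-in for whatever one-time setup is needed. */
static void init_tables(void)
{
    init_count++;
}

/* Safe to call from any thread, any number of times. */
void ensure_initialized(void)
{
    RUN_ONCE(init_tables);
}

static void *worker(void *arg)
{
    (void)arg;
    ensure_initialized();
    return NULL;
}

/* Start n threads that all race to initialize; return the init count. */
int run_workers(int n)
{
    pthread_t tids[MAX_WORKERS];
    int started = 0;

    assert(n >= 0 && n <= MAX_WORKERS);
    for (int i = 0; i < n; i++) {
        if (pthread_create(&tids[started], NULL, worker, NULL) == 0)
            started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    return init_count;
}
```

Unlike the hand-rolled mutex version, both call_once and pthread_once make every caller wait until the initializer has finished, so no thread can see half-initialized state. On older glibc versions this needs to be linked with -pthread.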
The C11 standard is freakin 13 years old. That's a long time. Apple is a big company. It should get off its fat lazy bum and implement it. Why hasn't it done so yet? Well, typical "I'm the final boss" behavior commonly seen in larger companies. Microsoft did this in spades, and became widely hated for it. They screwed up HTML, on purpose. Kerberos, on purpose. SSL, on purpose. Both the C and the C++ standards, on purpose. I stopped counting there. Eventually, they came around: but it took ten years. Microsoft eventually shipped standard versions of these systems, but only after much torture and trade-press abuse and developer complaints.
I've worked at large companies, and understand exactly how this dynamic happens. I could write a long long blog post on exactly how and why executives and managers make these decisions, and why the executives/managers think they're doing the right thing, and what it was that they failed to anticipate or understand about how the world works. Heh. Actually small companies do crap like this too, usually because they miscalculate how expensive it will be. Pisses off their customers, which is why small companies struggle and go bankrupt... but I digress.
The right thing to do is to write to Apple and complain about C11 support. Pester them. Get the developer support team to pay attention. Stuff does bubble up, eventually.
If you want to learn multi-threaded programming, the above is a good starter project.
With the latest code in master, make check fails on macOS if link-grammar is built with pcre2 support. The multi-dict and multi-thread tests crash with a segmentation fault, which doesn't happen if I disable pcre2 with the --with-regexlib=c configure argument. From the crash log, multi-dict crashed here:
while multi-thread crashed here:
@ampli said in https://github.com/opencog/link-grammar/pull/1505#issuecomment-2073703909 that this is because:
My brief searching suggests that C11 threads (threads.h) are not well supported and pthreads is suggested as the recommended alternative. pthreads are already used elsewhere in the code:
https://github.com/opencog/link-grammar/blob/69c026f6eca72f4a276de69929be6f5a43059ad2/link-grammar/tokenize/spellcheck-hun.c#L22
Maybe using a single threading library for the entire code base would be a good idea. I can't help with that, however, as I haven't written any multithreaded code before.