utwente-fmt / ltsmin

The LTSmin model checking toolset
http://ltsmin.utwente.nl
BSD 3-Clause "New" or "Revised" License
51 stars 30 forks source link

dlopen is not MT-safe => protect calls to RTdlopen #181

Closed yanntm closed 4 years ago

yanntm commented 4 years ago

Hi,

(sorry for rapid fire series of issues) So I've diagnosed a crash that happens sometimes on my test bench, it is due to lt_dlopen being called concurrently which is illegal https://www.gnu.org/software/libtool/manual/html_node/Thread-Safety-in-libltdl.html#Thread-Safety-in-libltdl

so this manual says explicitly you should lock around the call

According to this SO, https://stackoverflow.com/questions/8744608/dlopen-in-multithreaded-application-exit-with-trace-bpt-trap it seems to be a bad idea to dlopen several times anyway, they recommend to dlopen once (e.g. in main thread) then pass the handle around.

An error trace shows : pins2lts-mc( 1/ 4), 0.000, ** error **: file not found, Library "./gal.so" is not reachable

Conclusion : This is one or more of the threads not being able to access the file (that exists) because of this contentious behavior (there are static elements in libltdl)

so we need to protect calls to https://github.com/utwente-fmt/ltsmin/blob/031d920ca8c7703f4cfb94373b82d15960afe0b1/src/hre/hre_runtime.c#L29 or add a HRE protection in there.

Note that this behavior is obtained with a build of LTSmin that uses

#define SPEC_MT_SAFE 1

A production log from CI with this bug visible : https://travis-ci.org/yanntm/ITS-Tools-pnmcc/jobs/612492047#L3217-L3393

I'll try to submit a patch with an added lock, as that is not too intrusive, but I'd like Alfons @alaarman opinion on how best to patch, I'm unsure why each thread is loading the model separately in the first place in this scenario.

(phew, this was pretty hard to diagnose, ended using some export LD_DEBUG=file as suggested on some other SO)

yann