FairRootGroup / FairRoot

C++ simulation, reconstruction and analysis framework for particle physics experiments
http://fairroot.gsi.de

Problem with libFairMQDDSConfigPlugin ? #481

Closed aphecetche closed 7 years ago

aphecetche commented 7 years ago

Hi,

When I try to use the DDS control and config plugins (as in the 3-dds example), my device crashes:

terminate called without an active exception
./mch-digit-fake-generator.sh: line 6:  1924 Aborted                 (core dumped) alienv -w /home/dds/alicesw/run3/sw setenv O2/latest -c mch-digit-fake-generator --id mch-digit-fake-generator --control libFairMQDDSControlPlugin.so --config libFairMQDDSConfigPlugin.so --mq-config /o2control/mch-digit.json

I've narrowed it down to the initialisation of the config plugin in tools/runSimpleMQStateMachine.h, as far as I can tell. Looks like the fairmqControlPluginptr is null, but the dlerror is not.

Is there already a unit test or something similar that checks just the loading/init of the plugin(s)? If not, how should one go about contributing such a test (yes, I'm willing to give it a try ;-) )? Are there docs on how to contribute tests to FairMQ (or FairRoot in general)?

I'm working on CentOS7 (docker container).

Thanks,

rbx commented 7 years ago

Hi,

Looking at the code, I don't yet see what goes wrong. dlerror should cover both dlopen and dlsym. The control plugin is loaded after the config one; can you see whether the config plugin succeeded? Was FairRoot built with DDS (that is what builds the plugins)? Are both plugins reachable in your path? I'll add a null check there, although I expect dlerror to have handled that, and the error message that replaces the crash won't give much more detail. I hope I can reproduce this somehow.
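The dlopen/dlsym/dlerror checks mentioned here can be sketched roughly as follows. This is only an illustration, not the code in tools/runSimpleMQStateMachine.h: `LoadSymbol` is a hypothetical helper, and treating a null symbol address as a failure is an assumption that suits function lookups such as a plugin entry point.

```cpp
#include <dlfcn.h>
#include <stdexcept>
#include <string>

// Hypothetical helper: resolve `symbol` from the shared library `lib`.
// Throws std::runtime_error if either dlopen or dlsym fails.
// The handle is left open on success so the returned address stays valid.
void* LoadSymbol(const std::string& lib, const std::string& symbol)
{
    void* handle = dlopen(lib.c_str(), RTLD_NOW);
    if (handle == nullptr) {
        // dlerror() may itself return null, so never use it unchecked.
        const char* err = dlerror();
        throw std::runtime_error("dlopen(" + lib + ") failed: " + (err ? err : "unknown error"));
    }

    dlerror();  // clear any stale error state before calling dlsym
    void* address = dlsym(handle, symbol.c_str());
    const char* err = dlerror();
    if (err != nullptr || address == nullptr) {
        // Copy the message before dlclose, which may overwrite the error buffer.
        const std::string msg = err ? err : "symbol resolved to a null address";
        dlclose(handle);
        throw std::runtime_error("dlsym(" + symbol + ") failed: " + msg);
    }
    return address;
}
```

On a glibc system, `LoadSymbol("libm.so.6", "cos")` should return a non-null address, while an unknown symbol name should throw with the dlerror text. Older glibc versions (such as the one shipped with CentOS 7) need `-ldl` at link time.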

Regarding the test, we don't have one for plugins yet, since DDS is the only one that we have right now. We are working on extending the plugin system, so that it is the same for all configuration sources. It is likely that a lot will change there soon, so if you write a test it will get out of date quickly.

In general, for an example of our tests, you can take a look at e.g. https://github.com/FairRootGroup/FairRoot/blob/master/examples/simulation/Tutorial1/macros/CMakeLists.txt. Perhaps Florian (@fuhlig1) can provide more info if necessary.

aphecetche commented 7 years ago

First, thanks for the quick answer.

Next, regarding your questions:

[O2/latest] /tmp/flp $> echo $LD_LIBRARY_PATH | tr ":" "\n"
/home/dds/alicesw/run3/sw/slc7_x86-64/O2/dev-1/lib
/home/dds/alicesw/run3/sw/slc7_x86-64/FairRoot/dev-3/lib
/home/dds/alicesw/run3/sw/slc7_x86-64/DDS/master-1.4-1/lib
/home/dds/alicesw/run3/sw/slc7_x86-64/ZeroMQ/v4.1.5-1/lib
/home/dds/alicesw/run3/sw/slc7_x86-64/sodium/v1.0.8-1/lib
/home/dds/alicesw/run3/sw/slc7_x86-64/boost/v1.59.0_O2-1/lib
/home/dds/alicesw/run3/sw/slc7_x86-64/protobuf/v3.0.2-1/lib
/home/dds/alicesw/run3/sw/slc7_x86-64/GEANT4_VMC/v3-2-p1_O2-1/lib
/home/dds/alicesw/run3/sw/slc7_x86-64/vgm/4.3_O2-1/lib
/home/dds/alicesw/run3/sw/slc7_x86-64/GEANT4/v4.10.01.p03_O2-1/lib
/home/dds/alicesw/run3/sw/slc7_x86-64/GEANT3/v2-2_O2-1/lib64
/home/dds/alicesw/run3/sw/slc7_x86-64/ROOT/v6-08-02_O2-1/lib
/home/dds/alicesw/run3/sw/slc7_x86-64/Python-modules/1.0-1/lib
/home/dds/alicesw/run3/sw/slc7_x86-64/Python-modules/1.0-1/lib64

and the plugins are there:

[O2/latest] /tmp/flp $> ls -al /home/dds/alicesw/run3/sw/slc7_x86-64/FairRoot/dev-3/lib/*Plugin*
lrwxrwxrwx 1 dds dds      30 Feb  5 23:39 /home/dds/alicesw/run3/sw/slc7_x86-64/FairRoot/dev-3/lib/libFairMQDDSConfigPlugin.so -> libFairMQDDSConfigPlugin.so.16
lrwxrwxrwx 1 dds dds      36 Feb  5 23:39 /home/dds/alicesw/run3/sw/slc7_x86-64/FairRoot/dev-3/lib/libFairMQDDSConfigPlugin.so.16 -> libFairMQDDSConfigPlugin.so.16.06.00
-rwxr-xr-x 1 dds dds 1552688 Feb  5 22:07 /home/dds/alicesw/run3/sw/slc7_x86-64/FairRoot/dev-3/lib/libFairMQDDSConfigPlugin.so.16.06.00
lrwxrwxrwx 1 dds dds      31 Feb  5 23:39 /home/dds/alicesw/run3/sw/slc7_x86-64/FairRoot/dev-3/lib/libFairMQDDSControlPlugin.so -> libFairMQDDSControlPlugin.so.16
lrwxrwxrwx 1 dds dds      37 Feb  5 23:39 /home/dds/alicesw/run3/sw/slc7_x86-64/FairRoot/dev-3/lib/libFairMQDDSControlPlugin.so.16 -> libFairMQDDSControlPlugin.so.16.06.00
-rwxr-xr-x 1 dds dds 1173808 Feb  5 22:07 /home/dds/alicesw/run3/sw/slc7_x86-64/FairRoot/dev-3/lib/libFairMQDDSControlPlugin.so.16.06.00

OK for the tests. I've thus tried a very basic one:

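#include <dlfcn.h>
#include <iostream>

int main(int argc, char** argv)
{
  // Load the library named on the command line, resolving symbols lazily.
  void* handle = dlopen(argv[1], RTLD_LAZY);
  std::cout << "handle=" << handle << std::endl;
  // Note: dlerror() returns a null pointer when no error occurred, and streaming
  // a null char* is undefined behaviour; guard it, e.g. (err ? err : "(none)").
  std::cout << "error=" << dlerror() << std::endl;
  return 0;
}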

and this one is actually able to load the plugin(s) just fine, so I'm a bit puzzled... (btw, is this shared memory segment message normal?)

[O2/latest] /tmp/flp $> /o2control/mch-dds-plugin-test libFairMQDDSControlPlugin.so
[11:04:50][INFO] Created/Opened shared memory segment of 2,000,000,000 bytes. Available are 1999999776 bytes.
handle=0x2065070
error=[11:04:50][INFO] Successfully removed shared memory after the device has stopped.

aphecetche commented 7 years ago

hum, wait... just dlopening is not the whole story of course...

Hold on please, my first diagnosis is incorrect (I was looking at the control pointer, which is obviously null at the point in the code where you are dealing with config...).

The issue seems to be in the initConfig stage instead. Let me try to clarify this...
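One way a load-only check can be made slightly stricter is to open the library with RTLD_NOW instead of RTLD_LAZY: RTLD_LAZY defers resolution of function symbols until first use, whereas RTLD_NOW resolves all undefined symbols before dlopen returns, so unresolved dependencies show up at load time. The sketch below assumes nothing about the plugin itself; `LoadResult` and `TryLoad` are hypothetical names, and a check like this still does not exercise any initialisation code such as the initConfig stage.

```cpp
#include <dlfcn.h>
#include <string>

// Outcome of a load attempt: `ok`, plus the dlerror() text when loading failed.
struct LoadResult {
    bool ok;
    std::string error;
};

// Hypothetical smoke test: try to load `lib` with every symbol resolved up front.
LoadResult TryLoad(const std::string& lib)
{
    dlerror();  // discard any stale error state
    void* handle = dlopen(lib.c_str(), RTLD_NOW | RTLD_LOCAL);
    if (handle == nullptr) {
        const char* err = dlerror();  // may be null, so guard before use
        return {false, err ? err : "unknown dlopen error"};
    }
    dlclose(handle);
    return {true, ""};
}
```

A caller could print `TryLoad(argv[1]).error` when `ok` is false, which gives a readable reason instead of an unguarded dlerror() call.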

rbx commented 7 years ago

Regarding the shared memory message: shared memory is currently always loaded, whether you use it or not. I fixed this in my working branch, and will include the fix in the next patches.

rbx commented 7 years ago

@aphecetche are you still experiencing this problem?

aphecetche commented 7 years ago

Hi Alexey,

My dev setup is not in a position to test this right now. If you want to close the issue, please do so, and I'll re-open if need be.

Regards,