Open cottsay opened 5 years ago
I've looked into this, but I'm afraid I'll need a little more context to determine what's going on. Can you provide a stack trace or a description of how to reproduce the issue? Did you build OpenSplice yourself or download a prebuilt installer (32- or 64-bit)? Is there any notable difference between nodes that terminate with this issue and those that don't? Can you also please check your user limits for a sufficient number of processes/threads and stack size (ulimit -a -H)?
The minimum stack size on Linux is 16KiB (PTHREAD_STACK_MIN), at least on the regular 'desktop' distros I've seen, including Fedora (I imagine an embedded distro or a non-glibc pthreads implementation may have other defaults). Either way, the default for OpenSplice threads is set to 64KiB, which should be fine, but if it's not, EINVAL should be returned by pthread_attr_setstacksize, not pthread_create. If it gets to pthread_create, the issue might not be related to stack size but to one of the other attributes. However, if you set OSPL_ENV_PURIFY, the default stack size is raised to 10MiB, so if that works for you, it must clearly be related to stack size in some way I don't yet comprehend ;-).
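If it helps to narrow this down outside of OpenSplice, here is a minimal sketch of a standalone check. It is not taken from the OpenSplice sources; the names try_stack_size and noop are made up for illustration, and it only sets the stack size, so none of the other thread attributes OpenSplice may configure are exercised. It reports which pthread call rejects a given stack size.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Trivial thread body; we only care whether the thread can be created. */
static void *noop(void *arg)
{
    return arg;
}

/*
 * Hypothetical helper: try to create and join one thread with the
 * requested stack size.
 *
 * Returns 0 on success, otherwise the error number of the failing call.
 * On return, *stage names the last pthread call that was attempted, so a
 * non-zero result tells you which call rejected the request.
 */
static int try_stack_size(size_t size, const char **stage)
{
    pthread_attr_t attr;
    pthread_t tid;
    int rc;

    *stage = "pthread_attr_init";
    rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;

    /* Sizes below PTHREAD_STACK_MIN are expected to fail here with EINVAL. */
    *stage = "pthread_attr_setstacksize";
    rc = pthread_attr_setstacksize(&attr, size);
    if (rc == 0) {
        /* If the size was accepted above, any failure now comes from
         * pthread_create itself. */
        *stage = "pthread_create";
        rc = pthread_create(&tid, &attr, noop, NULL);
        if (rc == 0)
            pthread_join(tid, NULL);
    }

    pthread_attr_destroy(&attr);
    return rc;
}
```

Calling try_stack_size(64 * 1024, &stage) from a small main on an affected Fedora machine and printing stage together with strerror(rc) should show whether a bare 64KiB request already fails in pthread_create. Trying a few larger sizes, such as 1MiB and 10MiB, would then show roughly where it starts to succeed.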
When using OpenSplice on Fedora Linux with ROS, some nodes terminate citing EINVAL from pthread_create in posix/code/os_thread.c.
I reproduced this issue with Fedora 28, 29, and 30, in combination with ROS Crystal and Dashing.
I noticed that the requested stack size was unusually small (64k) when the failure occurred.
linux/code/os_thread_attr.c states that the OS should be increasing the stack as necessary, but this doesn't appear to be happening. Setting -DOSPL_ENV_PURIFY worked around the issue for me.
Do you have any guidance for investigating this further? I'd prefer not to set an extra flag that I don't fully understand.