Closed pkoper closed 5 years ago
I am not able to reproduce your variable output (instead I get a SIG-11 and a core file always with a debug build and the correct output always with a production build). But I noticed the call-in table for test1 specifies an input parameter but no such parameter is supplied in the actual ydb_ci call in prog.c.
callin.tab
-----------
test1: void test1^test(I:gtm_char_t*)
prog.c
-------
ydb_ci("test1");
Once I fix ydb_ci("test1")
to say ydb_ci("test1","abcd")
, I do not see the SIG-11 any more.
Can you try this out in your environment to see if your variable output also goes away with this change?
It would be still better to avoid the SIG-11 in the debug build in case the actual # of parameters specified differ from those specified in the call-in table but I will create a separate issue (on gitlab which is where we maintain our repository now) for that once you confirm the above.
In thinking about it some more, I realize I cannot create a separate issue if the # of actual parameters specified in ydb_ci() call differ from that specified in the call-in table. There is no way to know the actual # of parameters specified without changing the ydb_ci() interface. Additionally, there is already documentation that indicates the parameters and types should match (at https://docs.yottadb.com/ProgrammersGuide/extrout.html#call-in-interface).
Optional list of arguments to be passed to the M routine’s formallist: the number of arguments and the type of each argument must match the number of parameters, and parameter types specified in the corresponding Call-In table entry. All pointer arguments must be pre-allocated. YottaDB assumes that any pointer, which is passed for O/IO-parameter points to valid write-able memory.
We will document that ydb_ci() can cause SIG-11s or weird-symptoms/non-deterministic-output (like what you are seeing) if the correct # of parameters are not specified.
I have the same problem for ydb_ci("test1")
, ydb_ci("test1","abcd")
and void test1^test()
+ ydb_ci("test1")
. I'll try to replicate this on some virtual machine image. This problem is not connected with wrong number of arguments.
I think SIGSEGV on wrong number of arguments (or wrong types of arguments) for ydb_ci(p) is acceptable. Foreign Function Interfaces in other solutions work the same way. You can try to detect wrong number of arguments (or types) if you replace ydb_ci() with a macro with a sentinel:
#define SENTINEL 0x7077ADB
#define ydb_ci(name, ...) ydb_ci_sentinel(name, __VA_ARGS_, SENTINEL)
You know from callin.tab where to expect SENTINEL in ydb_ci_sentinel() arguments. If it isn't there, the wrong number of arguments was passed to ydb_ci(). (you should keep old ydb_ci() function for ABI backward compatibility)
Maybe it's a compiler issue - I have gcc version 6.3.0 20170516 (Debian 6.3.0-18+deb9u1)
on Debian 9.5 (stretch).
I have reproduced the problem on Debian 9 image from osboxes.org (SourceForge download link) with YottaDB r1.22 Linux x86_64
.
Thanks. While downloading from the Sourceforge link, I get the below error. Any ideas how to work around this?
503 No Backend Servers Available No backend servers were available to answer your request. Please notify the site admins.
Maybe choosing another mirror will help you - this option is hidden behind "Problems downloading?" button.
The download started fine now. Maybe the site was down for a maintenance window previously. It is a 1GB file so it will take a few minutes to finish. Will keep you posted.
I am now logged into the VM. Is How do I run your test case steps? Do I install r122 and your test programs to verify this or are they pre-installed on this VM?
You have to install YottaDB and test programs on your own.
The exact instructions to reproduce the problem (with and without additional argument for ydb_ci("test1")).
# login as osboxes/osboxes.org
su - # osboxes.org
vi /etc/apt/sources.list # add: deb http://deb.debian.org/debian/ stretch main contrib
apt-get update
apt-get install make gcc
wget https://gitlab.com/YottaDB/DB/YDB/raw/master/sr_unix/ydbinstall.sh
chmod +x ydbinstall.sh
./ydbinstall --installdir /opt/ydb --utf8 default --force-install
logout # switch back to osboxes user
wget https://github.com/YottaDB/YDB/files/2728529/haisenbug.zip
unzip haisenbug.zip
. /opt/ydb/gtmprofile
make run
vi prog.c # change ydbci("test1"); to ydbci("test1","abcd");
make run
Thanks. Interestingly, I was able to get a SIG-11 today with the additional argument for ydb_ci("test1") with a debug build. I still am not able to get the UNDEF error in a pro build like you do but suspect the two are related so will first investigate this.
I have also reproduced the problem using YottaDB Docker container:
unzip haisenbug.tgz
vi prog.c # replace ydb_ci("test1"); with ydb_ci("test1","abcd");
docker pull yottadb/yottadb
docker run -it -v `pwd`:/data yottadb/yottadb
zsy "bash"
make run
My host system is Debian 9.5 (stretch)
Thanks. Will confirm the fix on the docker container once I get there. It is an issue with creating new variable names using the Simple API while inside an external call. I am still investigating but know enough to suggest a workaround. In your case, the "result" variable which is set inside the ydb_set_s() call is created for the very first time inside the ydb_set_s(). If you create the variable by say doing a set result="abcd"
before doing the external call (as shown below), I think you will avoid the UNDEF error. Can you confirm this.
call()
set result="abcd"
s rc=$&func()
If I remove ydb_ci("echo")
from prog.c
the problem disappears. If I create a single letter variable x
, the problem still exists:
call()
w "-- zwr shows nothing --",!
zwr
s x="abcd"
w "-- zwr shows x=""abcd"" --",!
zwr
s rc=$&func()
w "-- zwr shows rc=0 --",!
zwr
w "-- $g(rc) --",!
w "rc=",$g(rc),!
w "-- accessing rc --",!
w "rc=",rc,!
q
outputs:
-- zwr shows nothing --
-- zwr shows x="abcd" --
x="abcd"
-- zwr shows rc=0 --
rc=0
result="ok"
x="abcd"
-- $g(rc) --
rc=
-- accessing rc --
rc=Error occurred: 150373850,call+12^test,%YDB-E-LVUNDEF, Undefined local variable: rc
But if I replace set x="abcd"
with set y="abcd"
the problem disappears. You can create a variable with any longer name and the problem disappears. If you replace YDB_BUFFER_STR("result")
with YDB_BUFFER_STR("x")
in plugin.c
and use set x="abcd"
instead of set result="abcd"
the problem still exists. Therefore I don't think it would be a good workaround.
Not surprisingly, I am not able to reproduce the UNDEF with set x="abcd"
either.
That aside, there are two UNDEF errors happening in the first two ydb_ci() calls ("v" in test1 and "c" in echo). Those look like application errors to me.
test1(a)
s x=v
q
echo(a)
w "zyre=",$zyre,!
s b=c
q
Can you avoid those errors (or ensure they are handled by setting a $etrap to clear $ecode to "" etc.) and see if the problem still exists.
Thank you for the suggestion on exception handling. The haisenbug.tgz sample is based on a larger C+M code base, where the UNDEF problem is frequent and no M exceptions are raised (I believe). Please, take haisenbug.tgz only as a carefully prepared sample to reproduce the problem. Call-In+External Call scenario in my code base does not match the sample.
Sure. Will let you know when I have an update on the issue raised by your sample program.
I have a potential fix for your issue. Would you be able to build from source to verify it works?
Sure.
Great. Go to https://gitlab.com/YottaDB/DB/YDB/merge_requests/499. You can clik on the "Check out branch" link in that url and that should download the git repo for you. Let me know if you have any questions. Note that this fixes the SIG-11 I am seeing. I don't yet know if there are more issues since I don't yet see the UNDEF that you see.
Also can you try running your test with r1.22 after setting the ydb__dbglvl
(or gtmdbglvl
) environment variable to the value 64
(this turns on an extra memory verification step). I expect you to see a SIG-11 just like I do in the debug build. The UNDEF, in my understanding, is a symptom of how uninitialized memory (which was not correctly initialized in case of a hashtable expansion when a new M local variable name gets added while inside an external call) acts up differently on different systems.
I have compiled nars1/YDB-sig11 branch:
$ git branch
master
* nars1/YDB-sig11
$
But I still get UNDEF:
$ make run
gcc -Wall -o prog prog.c -fPIC -L/opt/ydb-351 -lyottadb -Wl,-rpath,/opt/ydb-351 -I/opt/ydb-351
gcc -Wall -shared -o plugin.so plugin.c -fPIC -L/opt/ydb-351 -lyottadb -I/opt/ydb-351
./prog
zyre=YottaDB r1.23 Linux x86_64
Error occurred: 150373850,echo+2^test,%YDB-E-LVUNDEF, Undefined local variable: c
Error occurred: 150373850,test1+1^test,%YDB-E-LVUNDEF, Undefined local variable: v
-- zwr shows rc=0 --
rc=0
result="ok"
-- $g(rc) --
rc=
-- accessing rc --
rc=Error occurred: 150373850,call+7^test,%YDB-E-LVUNDEF, Undefined local variable: rc
$
Other YottaDB installations in my Debian 9 system are r1.22 (not r1.23 as in output above), so I think the issue is still not resolved.
Thanks. I will try to reproduce the issue in the Docker container like you had mentioned before.
With gtmdbglvl=64 on r1.22 (not r1.23) I have core dump:
Program terminated with signal SIGQUIT, Quit.
#0 0x00007f9f25844227 in kill () at ../sysdeps/unix/syscall-template.S:78
78 ../sysdeps/unix/syscall-template.S: No such file or directory.
(gdb) bt
#0 0x00007f9f25844227 in kill () at ../sysdeps/unix/syscall-template.S:78
#1 0x00007f9f25a25e68 in gtm_dump_core () from /opt/ydb/libyottadb.so
#2 0x00007f9f25a27a8d in gtm_fork_n_core () from /opt/ydb/libyottadb.so
#3 0x00007f9f25a26562 in generic_signal_handler () from /opt/ydb/libyottadb.so
#4 <signal handler called>
#5 0x00007f9f25b94146 in op_fnget () from /opt/ydb/libyottadb.so
#6 0x00007f9f259dc5e8 in ?? ()
#7 0x000000000000000c in ?? ()
#8 0x00007f9f260df060 in ?? () from /lib64/ld-linux-x86-64.so.2
#9 0x00007f9f24dedda0 in ?? () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#10 0x00007ffe47ea68e0 in ?? ()
#11 0x00007f9f260e0170 in audit_list_string ()
#12 0x00007f9f25940b9d in __GI___dl_iterate_phdr (callback=0x559ffb76ab60, data=0x559ffb76ab90) at dl-iteratephdr.c:82
#13 0x0000000000000000 in ?? ()
(gdb)
With gtmdbglvl=64 on r1.23 (nars1/YDB-sig11) I have core dump:
Program terminated with signal SIGQUIT, Quit.
#0 __pthread_kill (threadid=<optimized out>, signo=3) at ../sysdeps/unix/sysv/linux/pthread_kill.c:57
57 ../sysdeps/unix/sysv/linux/pthread_kill.c: No such file or directory.
(gdb) bt
#0 __pthread_kill (threadid=<optimized out>, signo=3) at ../sysdeps/unix/sysv/linux/pthread_kill.c:57
#1 0x00007f610c58c468 in gtm_dump_core () at /root/ydb/sig11/YDB/sr_unix/gtm_dump_core.c:72
#2 0x00007f610c58c780 in gtm_fork_n_core () at /root/ydb/sig11/YDB/sr_unix/gtm_fork_n_core.c:148
#3 0x00007f610c58f72f in generic_signal_handler (sig=<optimized out>, info=<optimized out>, context=<optimized out>)
at /root/ydb/sig11/YDB/sr_unix/generic_signal_handler.c:353
#4 <signal handler called>
#5 op_fnget () at /root/ydb/sig11/YDB/sr_x86_64/op_fnget.s:41
#6 0x00007f610c27b5e8 in ?? ()
#7 0x00007f610c7429d0 in ?? ()
#8 0x00007ffe4dfbb370 in ?? ()
#9 0x000000000000000e in ?? ()
#10 0x00007f610c76c060 in ?? () from /lib64/ld-linux-x86-64.so.2
#11 0x00007f610b852da0 in ?? () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#12 0x00007ffe4dfbb400 in ?? ()
#13 0x00007f610c76d170 in audit_list_string ()
#14 0x00007f610c1dfb9d in __GI___dl_iterate_phdr (callback=0x56284ba3eb80, data=0x56284ba3ebb0) at dl-iteratephdr.c:82
#15 0x0000000000000000 in ?? ()
(gdb)
You would have gotten a KILLBYSIGSINFO1 message in the output. Can you paste the full output of that for both r1.22 and r1.23.
r1.22:
%YDB-F-KILLBYSIGSINFO1, YottaDB process 24448 has been killed by a signal 11 at address 0x00007F8BCDD40146 (vaddr 0x0000000000000000)
r1.23 (nars1/YDB-sig11):
%YDB-F-KILLBYSIGSINFO1, YottaDB process 24514 has been killed by a signal 11 at address 0x00007FD5697E66C6 (vaddr 0x0000000000000000)
Progress. I am able to reproduce the UNDEF issue with the docker container.
But I am not able to reproduce the UNDEF with the fixed r1.23 version. What is the output of the following in your r1.23 environment.
ldd prog
env | grep LD_LIBRARY_PATH
ls -l $ydb_dist/libyottadb.so
Updated Makefile
:
export GTMCI=$(CURDIR)/callin.tab
export GTMXC=$(CURDIR)/extcall.tab
export plugin=$(CURDIR)/plugin.so
export gtmroutines=
export ydb_routines=$(CURDIR)($(CURDIR))
export gtmdbglvl=64
all: prog plugin.so
prog: prog.c
gcc -Wall -o prog prog.c -fPIC -L$(ydb_dist) -lyottadb -Wl,-rpath,$(ydb_dist) -I$(ydb_dist)
plugin.so: plugin.c
gcc -Wall -shared -o plugin.so plugin.c -fPIC -L$(ydb_dist) -lyottadb -I$(ydb_dist)
run: all
set | grep dbglvl
ldd prog
strings /opt/ydb-351/libyottadb.so | grep 'YottaDB r1'
set | grep LD_LIBRARY_PATH
./prog
clean:
rm -f prog plugin.so test.o
Output from make run
:
set | grep dbglvl
gtmdbglvl='64'
ldd prog
linux-vdso.so.1 (0x00007fffe1568000)
libyottadb.so => /opt/ydb-351/libyottadb.so (0x00007f489f28f000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f489f0c0000)
libelf.so.1 => /usr/lib/x86_64-linux-gnu/libelf.so.1 (0x00007f489eea8000)
libncurses.so.6 => /lib/x86_64-linux-gnu/libncurses.so.6 (0x00007f489ee7f000)
libtinfo.so.6 => /lib/x86_64-linux-gnu/libtinfo.so.6 (0x00007f489ee51000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f489ecbd000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f489ecb6000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f489ec95000)
librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f489ec8b000)
/lib64/ld-linux-x86-64.so.2 (0x00007f489f959000)
libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f489ea71000)
strings /opt/ydb-351/libyottadb.so | grep 'YottaDB r1'
YottaDB r1.23 Linux x86_64
set | grep LD_LIBRARY_PATH
LD_LIBRARY_PATH='/opt/ydb-351:'
./prog
zyre=YottaDB r1.23 Linux x86_64
Error occurred: 150373850,echo+2^test,%YDB-E-LVUNDEF, Undefined local variable: c
Error occurred: 150373850,test1+1^test,%YDB-E-LVUNDEF, Undefined local variable: v
-- zwr shows rc=0 --
rc=0
result="ok"
-- $g(rc) --
rc=
%YDB-F-KILLBYSIGSINFO1, YottaDB process 25172 has been killed by a signal 11 at address 0x00007FF26DB6C6C6 (vaddr 0x0000000000000000)
Thanks. This keeps being elusive. I tried various combinations in the docker container. And was able to see the UNDEF or SIG-11 reliably without the fix and do not see either with the fix. Are you able to reproduce the SIG-11 in a docker container with r1.23?
I have the binaries with the fix (just to eliminate the rare possibility that the build you are using as r1.23 does not contain the fixes). Can you please download this and try this out for your experiment. Once you install this, I expect you to see the following output. That will confirm you are using the right binaries.
YDB>write $zreldate
20190106 16:24
Great! The UNDEF problem has disappeared and I'm not able to reproduce the issue with the sample, thank you.
I see that in ydb_subscript_next_s() started to return YDB_ERR_NODEEND instead of YDB_OK when there are no more subscripts, so I cannot check my code base right now (without refactoring).
I have checked my code base, everything is ok now :) Is your build suitable for production use?
Thanks for reporting the issue confirming the fix works. https://gitlab.com/YottaDB/DB/YDB/issues/401 has been created on gitlab (our official issue tracker) for this. Regarding use in production, I cannot guarantee that as this fix has not gone through our in-house testing. Given you are planning to use this in production, I would like to encourage you to consider buying YottaDB support (you can send email to support@yottadb.com).
Sometimes
s rc=$&func()
setsrc
to a value that is shown in ZWR as0
but is undefined.Code:
sometimes generates:
instead of:
I've tried to isolate the problem as far as possible. The smallest possible code to reproduce it is here: haisenbug.zip
This is kind of heisenbug - e.g. when you change
s b=c
tos a=c
intest.m
it disappears.My environment is:
YottaDB r1.22 Linux x86_64