att / ast

AST - AT&T Software Technology
Eclipse Public License 1.0

"typeset -p .sh.type" dumps core after a defined type is instantiated #1297

Open sneyx123 opened 5 years ago

sneyx123 commented 5 years ago

Description of problem: Lost of awareness control.

Ksh version:

# print -v .sh.version
Version AJMP 93u+ 2012-08-01

How reproducible: Only interactively. If the steps are scripted, the last "typeset -p .sh.type" must still be entered interactively.

Steps to reproduce:

1. # ksh93
2. # typeset -p .sh.type
   namespace sh.type
   {
        :
   }
3. # typeset -Ttyp1 typ1=(

        function get
        {
                .sh.value="'Sample'";
        }

   )

4. # typeset -p .sh.type
   namespace sh.type
   {
        typeset -r typ1='Sample'
   }

5. # typ1
6. # typ1 var11
   ksh93: typ1: var11: only simple variables can be exported

7. # typ1
    typ1 var11='Sample'

7a. # print $var11
    'Sample'

7b. # print -v var11
    'Sample'

8. # typeset -p .sh.type
   Memory fault(coredump)

9. #

Actual results: Unusable.

Expected results: The output of step 8 should be the same as in step 4.

Additional info: On request.

# typ1 --help
Usage: typ1 [ options ] [name[=value]...]
OPTIONS
  -r              Enables readonly. Once enabled, the value cannot be changed or unset.
  -a[type]        Indexed array. Each name will converted to an index array of type typ1. If a variable already exists, the current
                  value will become index 0. If [type] is specified, each subscript is interpreted as a value of enumeration type
                  type. The option value may be omitted.
  -A              Associative array. Each name will converted to an associate array of type typ1. If a variable already exists, the
                  current value will become subscript 0.
  -h string       Used within a type definition to provide a help string for variable name. Otherwise, it is ignored.
  -S              Used with a type definition to indicate that the variable is shared by each instance of the type. When used inside
                  a function defined with the function reserved word, the specified variables will have function static scope.
                  Otherwise, the variable is unset prior to processing the assignment list.

krader1961 commented 5 years ago

I can reproduce this on the current master branch by copy/pasting the following script into an interactive shell, but not when executing it non-interactively:

typeset -p .sh.type
typeset -Ttyp1 typ1=(
        function get
        {
                .sh.value="'Sample'";
        }

   )

typeset -p .sh.type
typ1
typ1 var11
typ1
print $var11
print -v var11
typeset -p .sh.type

Also the output of the first typeset -p .sh.type from a ksh93u+ shell looks like this:

namespace sh.type
{
        :
}

But from ksh built from head it looks like this:

namespace sh.type
{
        typeset -r -s -u -i _Bool=false
}

That probably isn't significant since a SIGSEGV occurs with both ksh versions but it is interesting.

@sneyx123 What caused you to notice this problem? Is this causing you a problem with a production script or were you simply playing around and noticed this bug?

ormaaj commented 5 years ago

There are a gazillion ways to segfault via typeset -p. On the occasion it doesn't crash I often couldn't tell whether .sh.type was producing a wrong representation or I was just being baffled by its design.

krader1961 commented 5 years ago

@ormaaj LOL. Yeah, I don't doubt what you say. I've been chipping away at fixing bugs in the ksh name/value code for over two years. The appalling quality of that code makes me wonder how it manages to work as well as it does. For example, vars that handle nvflag bit fields are defined as int, unsigned, uint32_t, short, unsigned short, and probably a few other variants. Worse is the reuse of variables for unrelated purposes. For example, there are places that do int n; then use n to hold nvflag bit fields and also the return value of functions like strlen(), as loop index vars, etc. Apparently just to avoid a few more words in the current stack frame. Which is another problem since int n = strlen(sp); converts an unsigned 64 bit int on most platforms to a signed 32 bit int.

The question is whether this is something that is likely to break a production script, something that people notice on a frequent basis, or just something that infrequently causes people to ask "WTF?".
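
As an aside on the `int n = strlen(sp);` narrowing mentioned above, here is a minimal sketch of keeping the length in a dedicated `size_t` variable and converting it only after a range check. The helper names `fits_in_int` and `checked_len` are made up for illustration and are not taken from the ksh sources.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: 1 if a size_t value survives conversion to int. */
static int fits_in_int(size_t len) {
    return len <= (size_t)INT_MAX;
}

/* Hypothetical helper: length of sp as an int, or -1 if it does not fit.
 * The length gets its own size_t variable instead of a reused "int n". */
static int checked_len(const char *sp) {
    assert(sp != NULL);
    size_t len = strlen(sp); /* strlen() returns size_t, not int */
    return fits_in_int(len) ? (int)len : -1;
}
```

The point is only that the conversion becomes explicit and checkable; whether the real code should return an error or widen its variables is a separate design decision.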

sneyx123 commented 5 years ago

To be honest, it came out of drilling down into a namespace navigator problem. Namespace is some kind of relativistic model depending on your eigenvector for variables in ksh. The problem lies at the intersection of (1) "self defined types", (2) discipline functions and (3) namespaces. If you rename "get" to "getx" it does not dump core. I cannot build ksh from git head because of the switch to the ninja & meson build environment. With the nice and slow "nmake" everything was fine. Is there a howto for "configure and run the environment"?

A few weeks ago a major player decided to add ninja to its distribution:

# pkg info -r developer/build/ninja
             Name: developer/build/ninja
          Summary: Ninja is a small build system with a focus on speed.
      Description: Ninja is a small build system with a focus on speed.  It
                   differs from other build systems in two major respects: it is
                   designed to have its input files generated by a higher-level
                   build system, and it is designed to run builds as fast as
                   possible.
         Category: Development/Distribution Tools
            State: Not installed
        Publisher: solaris
          Version: 1.8.2
    Build Release: 5.11
           Branch: 11.4.7.0.1.2.0
   Packaging Date: February 19, 2019 06:08:54 PM
             Size: 4.68 MB
             FMRI: pkg://solaris/developer/build/ninja@1.8.2,5.11-11.4.7.0.1.2.0:20190219T180854Z

siteshwar commented 5 years ago

@sneyx123 Meson depends on python-3.5 and Solaris does not support it, so you cannot build from master on Solaris.

sneyx123 commented 5 years ago

# pkg search $(which python)
INDEX      ACTION VALUE          PACKAGE
path       link   usr/bin/python pkg:/runtime/python-27@2.7.14-0.175.3.32.0.3.0
path       link   usr/bin/python pkg:/runtime/python-34@3.4.8-0.175.3.35.0.1.0
path       link   usr/bin/python pkg:/runtime/python-35@3.5.3-11.4.0.0.1.14.0
path       link   usr/bin/python pkg:/runtime/python-35@3.5.6-11.4.7.0.1.2.0

So maybe the last time I tried to get meson+ninja working was before it was added:

# pkg info -r pkg:/runtime/python-35@3.5.3-11.4.0.0.1.14.0
             Name: runtime/python-35
          Summary: The Python interpreter, libraries and utilities
         Category: Development/Python
            State: Not installed
        Publisher: solaris
          Version: 3.5.3
    Build Release: 5.11
           Branch: 11.4.0.0.1.14.0
   Packaging Date: August 14, 2018 05:16:39 PM
             Size: 37.84 MB
             FMRI: pkg://solaris/runtime/python-35@3.5.3,5.11-11.4.0.0.1.14.0:20180814T171639Z

siteshwar commented 5 years ago

AFAIK Meson still does not work on Solaris, but there is an attempt to make it work: https://github.com/mesonbuild/meson/pull/5008

sneyx123 commented 5 years ago

Looks like it requires root permissions for setup ... so if it modifies existing files it may not be straightforward to package.

...
Processing meson-0.50.999-py3.5.egg
creating /usr/lib/python3.5/site-packages/meson-0.50.999-py3.5.egg
Extracting meson-0.50.999-py3.5.egg to /usr/lib/python3.5/site-packages
Adding meson 0.50.999 to easy-install.pth file
Installing meson script to /usr/bin

Installed /usr/lib/python3.5/site-packages/meson-0.50.999-py3.5.egg
Processing dependencies for meson==0.50.999
Finished processing dependencies for meson==0.50.999

How to try a build?

sneyx123 commented 5 years ago

The head version of meson didn't support the solarisstudio compiler, but it accepted the gcc from the vendor. Now it looks like a ksh source problem:

# ninja -C build
ninja: Entering directory `build'
[17/520] Compiling C object 'src/lib/libast/e416224@@ast@sta/conftab.c.o'.
FAILED: src/lib/libast/e416224@@ast@sta/conftab.c.o
gcc -Isrc/lib/libast/e416224@@ast@sta -Isrc/lib/libast -I../src/lib/libast -I. -I../ -I../src/lib/libast/include -Isrc/lib/libast/aso -I../src/lib/libast/aso -Isrc/lib/libast/cdt -I../src/lib/libast/cdt -Isrc/lib/libast/comp -I../src/lib/libast/comp -Isrc/lib/libast/sfio -I../src/lib/libast/sfio -Isrc/lib/libast/path -I../src/lib/libast/path -Isrc/lib/libast/port -I../src/lib/libast/port -Isrc/lib/libast/string -I../src/lib/libast/string -Isrc/lib/libast/misc -I../src/lib/libast/misc -Isrc/lib/libast/tm -I../src/lib/libast/tm -fdiagnostics-color=always -pipe -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -g -std=gnu99 -D_GNU_SOURCE -fno-strict-aliasing -Wextra -Wno-cast-function-type -Wno-attributes -Wno-char-subscripts -Wno-sign-compare -fno-omit-frame-pointer -Werror=implicit -fno-inline -fPIC '-DUSAGE_LICENSE=""' -DUSE_SPAWN=0 -DGCC_4_1PLUS_64_BIT_MEMORY_ATOMIC_OPERATIONS_MODEL=1 -MD -MQ 'src/lib/libast/e416224@@ast@sta/conftab.c.o' -MF 'src/lib/libast/e416224@@ast@sta/conftab.c.o.d' -o 'src/lib/libast/e416224@@ast@sta/conftab.c.o' -c /export/home/simon/ASTOPEN/GIT/ast/build/src/lib/libast/comp/conftab.c
/export/home/simon/ASTOPEN/GIT/ast/build/src/lib/libast/comp/conftab.c:44:82: error: ‘_SC_ABI_AIO_XFER_MAX’ undeclared here (not in a function); did you mean ‘_SC_BC_SCALE_MAX’?
 { "ABI_AIO_XFER_MAX", { 0U, 0 }, { 0U, 0 }, CONF_LIMIT, CONF_C, 1, CONF_sysconf, _SC_ABI_AIO_XFER_MAX },
                                                                                  ^~~~~~~~~~~~~~~~~~~~
                                                                                  _SC_BC_SCALE_MAX
/export/home/simon/ASTOPEN/GIT/ast/build/src/lib/libast/comp/conftab.c:44:1: warning: missing initializer for field ‘op’ of ‘Conf_t {aka const struct Conf_s}’ [-Wmissing-field-initializers]
 { "ABI_AIO_XFER_MAX", { 0U, 0 }, { 0U, 0 }, CONF_LIMIT, CONF_C, 1, CONF_sysconf, _SC_ABI_AIO_XFER_MAX },
 ^
In file included from /export/home/simon/ASTOPEN/GIT/ast/build/src/lib/libast/comp/conftab.c:13:0:
/export/home/simon/ASTOPEN/GIT/ast/build/src/lib/libast/comp/conftab.h:72:15: note: ‘op’ declared here
         short op;
               ^~

sneyx123 commented 5 years ago

Only a "few" failed:

# ninja  -k 99 -C build 2>&1|grep "^FAILED"
FAILED: src/lib/libast/e416224@@ast@sta/string_strperm.c.o
FAILED: src/lib/libast/e416224@@ast@sta/sfio_sftable.c.o
FAILED: src/lib/libast/e416224@@ast@sta/sfio_sfpopen.c.o
FAILED: src/lib/libast/e416224@@ast@sta/misc_procclose.c.o
FAILED: src/lib/libast/e416224@@ast@sta/sfio_sfmode.c.o
FAILED: src/lib/libast/e416224@@ast@sta/tm_tvtouch.c.o
FAILED: src/lib/libast/e416224@@ast@sta/port_astconf.c.o
FAILED: src/lib/libast/e416224@@ast@sta/misc_procopen.c.o
FAILED: src/lib/libast/e416224@@ast@sta/misc_procfree.c.o
FAILED: src/lib/libcmd/fc2c6cf@@cmd@sta/chmod.c.o
FAILED: src/lib/libcmd/fc2c6cf@@cmd@sta/uname.c.o
FAILED: src/lib/libast/e416224@@ast@sta/conftab.c.o

krader1961 commented 5 years ago

The conftab.c module is dynamically generated by the src/lib/libast/comp/conf.sh script. It's part of the ksh getconf builtin and associated astconf() API that we desperately want to remove. We haven't done so because we wanted to create a ksh release before making such a major change. We've had problems with that script. Most recently #1107 which involved modifying that script to honor the CC env var. The failures in your comment, @sneyx123, are exactly what we saw in #1107. Are you running env CC=gcc meson to configure the build? If not that might explain the failure.

Also, what distro are you using? That's almost always relevant when reporting problems with this project and therefore should be mentioned in the original problem statement.

sneyx123 commented 5 years ago

As far as I understand, meson chose by itself which compiler to use. Initially no compiler was installed. Then the "cc" from the "developer/solarisstudio-124/cc" pkg was not suitable, and finally the "gcc" from the "developer/gcc" pkg was accepted. I will retry to see whether other options are successful...

Relevant distro details:

# cat /etc/release
                            Oracle Solaris 11.4 SPARC
  Copyright (c) 1983, 2019, Oracle and/or its affiliates.  All rights reserved.
                           Assembled 01 February 2019

# pkg list -av entire shell/ksh93 gcc ninja developer/solarisstudio-124/cc
FMRI                                                                         IFO
pkg://solaris/developer/build/ninja@1.8.2-11.4.7.0.1.2.0:20190219T180854Z    i--
pkg://solaris/developer/gcc@7.3.0-11.4.0.0.1.14.0:20180814T162301Z           i--
pkg://solarisstudio/developer/solarisstudio-124/cc@12.4-1.0.7.0:20171101T225718Z i--
pkg://solaris/entire@11.4-11.4.6.0.1.4.0:20190201T214604Z                    i--
pkg://solaris/shell/ksh93@93.21.1.20120801-11.4.0.0.1.14.0:20180814T172407Z  i--

# uname -a
SunOS xxx 5.11 11.4.6.4.0 sun4v sparc sun4v

ast# git log | head -3
commit 973a7c2d80c8a3946cd05b6ef597cb120ac697d5
Author: Kurtis Rader <krader@skepticism.us>
Date:   Tue Apr 30 20:22:44 2019 -0700

meson# git log|head -3
commit ccc4ce28cc9077d77a0bc9e72b1177eba1be7186
Author: Jon Turney <jon.turney@dronecode.org.uk>
Date:   Sun Apr 28 21:06:36 2019 +0100

sneyx123 commented 5 years ago

Anything wrong here?:

# meson list
The Meson build system
Version: 0.50.999
Source dir: /export/home/simon/ASTOPEN/GIT/ast
Build dir: /export/home/simon/ASTOPEN/GIT/ast/list
Build type: native build
Project name: ksh93
Project version: undefined
Native C compiler: gcc (gcc 7.3.0 "gcc (GCC) 7.3.0")
WARNING: Unknown CPU family 'sun4v', please report this at https://github.com/mesonbuild/meson/issues/new with theoutput of `uname -a` and `cat /proc/cpuinfo`
Build machine cpu family: sun4v
Build machine cpu: sun4v
Compiler for C supports arguments -Werror=implicit: YES
Checking for size of "void*" : 8
Checking for size of "int" : 4
Checking for size of "long" : 8
Checking for size of "size_t" : 8
Checking for size of "off_t" : 8
Checking for size of "int32_t" : 4
Checking for size of "wchar_t" : 4
Compiler for C supports link arguments -Wl,--export-dynamic: NO
Checking for size of "long long" : 8
Library m found: NO
Library dl found: NO
Library execinfo found: NO
Library iconv found: NO
Library catgets found: NO
Has header "execinfo.h" : YES
Has header "stdlib.h" : YES
Has header "malloc.h" : YES
Has header "filio.h" : NO
Has header "sys/filio.h" : YES
Checking for function "lchmod" : NO
Checking for function "sigqueue" : YES
Checking for function "isnanl" : NO
Checking for function "eaccess" : YES
Checking for function "euidaccess" : YES
Checking for function "faccessat" : YES
Checking for function "mkostemp" : NO
Checking for function "strlcat" : YES
Checking for function "utimensat" : YES
Checking for function "sysinfo" : YES
Checking for function "pipe2" : YES
Checking for function "syncfs" : NO
Checking for function "nexttowardl" with dependency -lm: NO
Checking for function "expm1l" with dependency -lm: NO
Checking for function "log1pl" with dependency -lm: NO
Checking for function "remainderl" with dependency -lm: NO
Checking for function "log2l" with dependency -lm: NO
Checking for function "tgammal" with dependency -lm: NO
Checking for function "lgammal" with dependency -lm: NO
Checking if "fchmod() after socketpair() shutdown()" runs: NO (1)
Checking if "max signal number" runs: YES
Checking whether type "struct dirent" has member "d_fileno" : NO
Checking whether type "struct dirent" has member "d_ino" : YES
Checking whether type "struct dirent" has member "d_reclen" : YES
Checking whether type "struct dirent" has member "d_type" : NO
Checking whether type "struct dirent" has member "d_namlen" : NO
Checking whether type "struct stat" has member "st_mtim" : YES
Checking if "poll() exists and is worth using" runs: YES
Checking if "posix_spawn() exists and is worth using" runs: YES
Checking if "Check if -D_FILE_OFFSET_BITS=64 works with fts functions" compiles: YES
Program tput found: YES (/usr/bin/tput)
Program ed found: YES (/usr/bin/ed)
Program atos found: NO
Program addr2line found: NO
Has header "dl.h" : NO
Has header "dlfcn.h" : YES
Has header "dll.h" : NO
Has header "rld_interface.h" : NO
Has header "mach-o/dyld.h" : NO
Has header "sys/ldr.h" : NO
Library dl found: NO
Checking for function "dlopen" with dependency -ldl: YES
Checking for function "dllload" with dependency -ldl: NO
Checking for function "loadbind" with dependency -ldl: NO
Checking if "_DYNAMIC check" runs: DID NOT COMPILE
Checking for function "clock_gettime" : YES
Checking for function "gettimeofday" : YES
Has header "sys/syscall.h" : YES
Has header "sys/systeminfo.h" : YES
Has header "sys/syssgi.h" : NO
Checking for function "syscall" : NO
Checking for function "systeminfo" : NO
Configuring config_ast.h using configuration
Checking if "gcc 4.1+ 64 bit memory atomic operations model" links: YES
Program sh found: YES (/usr/bin/sh)
src/cmd/ksh93/tests/meson.build:86: WARNING: skipping io/shcomp on sunos
src/cmd/ksh93/tests/meson.build:86: WARNING: skipping set/shcomp on sunos
src/cmd/ksh93/tests/meson.build:86: WARNING: skipping treemove/shcomp on sunos
Build targets in project: 128
Found ninja-1.8.2 at /usr/bin/ninja
#

krader1961 commented 5 years ago

Anything wrong here?

@sneyx123, No, there is nothing obvious in that output which would cause us to recognize a problem building ksh on your system. However, unexpected problems from the conf.sh script will not be reflected in that output. You need to examine the meson-logs/meson-log.txt file to figure out why that script is producing erroneous results.

FWIW, I just spent four hours installing OpenIndiana and trying to install all the prerequisites to build ksh with meson. It fails with this cryptic error:

meson.build:1:0: ERROR: Unknown linker(s): [['/usr/local/bin/gcc-ar']]

krader1961 commented 5 years ago

Also, please read #1178 and #994 where we recognize this project is unlikely to build or pass the unit tests on an SVR4-based system like Solaris. Fixing those issues is going to require the assistance of someone like yourself, @sneyx123, who cares about keeping ksh relevant on those platforms.

sneyx123 commented 5 years ago

Just to document that part -- after presenting meson with "gcc" ... the "cc" was revoked -- but now this "conf.sh" script tries to use a "cc" and also some legacy parts of the "AT&T" build system, e.g. "hosttype":

...
Configuring config_ast.h using configuration
Running command: /export/home/simon/ASTOPEN/GIT/ast/scripts/libast_prereq.sh
--- stdout ---

--- stderr ---
/export/home/simon/ASTOPEN/GIT/ast/scripts/libast_prereq.sh: line 28: cc: not found
gcc: error: unrecognized command line option '-ferror-limit=0'; did you mean '-finline-limit='?
conf.sh: read /export/home/simon/ASTOPEN/GIT/ast/src/lib/libast/comp/conf.tab
conf.sh: check /usr/bin/getconf(1),confstr(2),pathconf(2),sysconf(2),sysinfo(2) configuration names
/export/home/simon/ASTOPEN/GIT/ast/src/lib/libast/comp/conf.sh: line 341: /export/home/simon/ASTOPEN/GIT/ast/bin/hosttype: not found
conf.sh: check macros/enums as static initializers
conf.sh: probe for ABI_AIO_XFER_MAX <limits.h> value
/tmp/ksh..yHCPb/conf.c: In function 'main':
/tmp/ksh..yHCPb/conf.c:20:49: error: 'ABI_AIO_XFER_MAX' undeclared (first use in this function); did you mean '_SC_TIMER_MAX'?
         printf("%lu\n", (unsigned _ast_intmax_t)ABI_AIO_XFER_MAX);
                                                 ^~~~~~~~~~~~~~~~
                                                 _SC_TIMER_MAX
/tmp/ksh..yHCPb/conf.c:20:49: note: each undeclared identifier is reported only once for each function it appears in
conf.sh: probe for ABI_ASYNCHRONOUS_IO <limits.h> value
/tmp/ksh..yHCPb/conf.c: In function 'main':
/tmp/ksh..yHCPb/conf.c:20:40: error: 'ABI_ASYNCHRONOUS_IO' undeclared (first use in this function); did you mean '_SC_ASYNCHRONOUS_IO'?
         printf("%ld\n", (_ast_intmax_t)ABI_ASYNCHRONOUS_IO);
                                        ^~~~~~~~~~~~~~~~~~~
                                        _SC_ASYNCHRONOUS_IO
/tmp/ksh..yHCPb/conf.c:20:40: note: each undeclared identifier is reported only once for each function it appears in

...

sneyx123 commented 5 years ago

Code workarounds are in place and the compiler is happy ... but the build system still has to learn how to link programs, e.g. the test programs:

# ninja -k 99 -C build 2>&1|grep "^ld"|uniq -c
 104 ld: elf error: file /export/home/simon/ASTOPEN/GIT/ast/build/src/lib/libast: elf_begin: I/O error: region read: Is a directory

sneyx123 commented 5 years ago

FYI: in "src/lib/libast/sfio/sftable.c" line 422 of 530 There is some funny "{" "}" nesting if the "#ifdef" is true :

                    case SFFMT_FLOAT:
#if !_ast_fltmax_double
                        if (size == sizeof(Sfdouble_t)) {
                            fp[n].argv.ld = va_arg(args, Sfdouble_t);
                        } else {
#endif
                            fp[n].argv.d = va_arg(args, double);
                        }
                        break;
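
For comparison, a minimal sketch of how the same branch could be written so that the braces balance whichever way the preprocessor test goes. This is not the project's actual fix: `fetch_float` and `float_arg` are made-up names, and `Sfdouble_t` is redefined locally on the assumption that it is `long double` unless `_ast_fltmax_double` is nonzero.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

#ifndef _ast_fltmax_double
#define _ast_fltmax_double 0 /* assumption: matches the current Meson build */
#endif

#if _ast_fltmax_double
typedef double Sfdouble_t;      /* stand-in for the libast typedef */
#else
typedef long double Sfdouble_t; /* stand-in for the libast typedef */
#endif

/* Hypothetical result type: which member is valid is recorded in is_long. */
typedef struct {
    int is_long;
    double d;
    Sfdouble_t ld;
} float_arg;

/* Reads one floating point argument whose size the caller states. */
static float_arg fetch_float(size_t size, ...) {
    float_arg out = {0, 0.0, 0};
    va_list args;

    assert(size == sizeof(double) || size == sizeof(Sfdouble_t));
    va_start(args, size);
#if !_ast_fltmax_double
    if (size == sizeof(Sfdouble_t)) {
        out.ld = va_arg(args, Sfdouble_t);
        out.is_long = 1;
    } else
#endif
    {
        /* This block stands alone when the #if section is excluded. */
        out.d = va_arg(args, double);
    }
    va_end(args);
    return out;
}
```

Keeping the `else` inside the conditional section and the final block outside it means each configuration sees a complete statement.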

sneyx123 commented 5 years ago

Same nesting fun in: "src/lib/libast/sfio/sfvprintf.c" line 629 of 1417

sneyx123 commented 5 years ago

Porting of git head version to solaris 11.4 (SPARC) successful -- first ksh2020 model:

#  ./build/src/cmd/ksh93/ksh
# print -v .sh.version
Version A 2020.0.0-alpha1-59-g973a7c2d-dirty

How do I converge these changes into the git head version?

sneyx123 commented 5 years ago

And I can confirm that the bug of #1297 exists from 2012(1993?)-2019 -- but there is a nice integrated stack backtrace now:

# typeset -p .sh.type
### 5759 Function backtrace:
1   handle_sigsegv + 24
2   __sighndlr + 12
3   call_user_handler + 852
4   sigacthandler + 84
5   nv_getval + 28
6   sh_funscope + 2916
7   sh_funct + 1112
8   sh_fun + 1168
9   lookup + 576
10  lookups + 28
11  nv_getv + 348
12  nv_getval + 600
13  local_exports + 84
14  scanfilter + 944
15  nv_scan + 224
16  sh_funscope + 700
17  sh_funct + 1112
18  sh_fun + 1168
19  lookup + 576
20  lookups + 28
Abort(coredump)

Also the initial state of ".sh.type" is different:

namespace sh.type
{
        typeset -r -s -u -i _Bool=false
}

sneyx1234 commented 5 years ago

git is unable to handle multiple forks ... so sneyx1234/ast and not sneyx123 is used for the solaris 11.4 converge procedure ...

sneyx1234 commented 5 years ago

FYI: The linker problem of meson for solaris 11.4 is solved in the "sneyx1234/meson" fork of "mesonbuild/meson"

sneyx1234 commented 5 years ago

FYI: The first run of "meson test --setup=malloc" shows:

...
Ok:                  203
Expected Fail:         0
Fail:                 32
Unexpected Pass:       0
Skipped:               0
Timeout:              41
...

Are there "no" errors for "all" platforms ?

krader1961 commented 5 years ago

There is some funny "{" "}" nesting if the "#ifdef" is true :

Not surprised. We've cleaned up many such problems in the past two years as we've added more distros to environments we test on. These types of issues are mostly due to bit rot because certain configurations never get tested which makes it too easy to introduce a change that works on platform X but not Y.

but there is a nice integrated stack backtrace now:

That was one of the first enhancements I made. When we started working on this two years ago there were lots of SIGSEGV failures on Travis CI where we had no chance of getting and analyzing a core dump. I implemented the integrated backtrace to give us some idea where the failure occurred. On Linux and BSD I've enhanced it to produce better backtraces that include line numbers.
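
For anyone curious what such a handler involves, here is a rough sketch of the general technique, not the code in the ksh tree. It assumes a libc that provides `backtrace()` and `backtrace_symbols_fd()` in `<execinfo.h>`; the function names `dump_backtrace`, `on_sigsegv_sketch` and `install_backtrace_handler` are invented for this example.

```c
#include <assert.h>
#include <execinfo.h>
#include <signal.h>
#include <stdlib.h>
#include <unistd.h>

#define MAX_FRAMES 64

/* Writes the current call stack to fd and returns the number of frames. */
static int dump_backtrace(int fd) {
    void *frames[MAX_FRAMES];
    int n;

    assert(fd >= 0);
    n = backtrace(frames, MAX_FRAMES);
    backtrace_symbols_fd(frames, n, fd); /* writes directly to fd */
    return n;
}

/* Hypothetical handler: report where the fault happened, then abort.
 * Caveat: backtrace() is not guaranteed to be async-signal-safe. */
static void on_sigsegv_sketch(int sig) {
    static const char msg[] = "### Function backtrace:\n";
    ssize_t written = write(STDERR_FILENO, msg, sizeof(msg) - 1);

    (void)written;
    (void)sig;
    dump_backtrace(STDERR_FILENO);
    abort(); /* still produce a core dump where the platform allows it */
}

/* Returns 0 on success, -1 if the handler could not be installed. */
static int install_backtrace_handler(void) {
    return signal(SIGSEGV, on_sigsegv_sketch) == SIG_ERR ? -1 : 0;
}
```

Resolving addresses to line numbers, as described above for Linux and BSD, needs extra tooling and is outside this sketch.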

Are there "no" errors for "all" platforms ?

A year ago that number of failures was pretty typical on all platforms we tested on. In a few instances the failure was because the test was wrong. In a few more cases the failure reflected an actual ksh bug. And in many other cases the test needed to be adapted to the platform or skipped. For example, BSD (including macOS) doesn't have queued signals. So that unit test needs to be skipped on those platforms.

Update: Today there are no failing tests (other than an occasional flakey failure) on macOS, FreeBSD, OpenBSD, and several flavors of Linux.

krader1961 commented 5 years ago

@sneyx123 Also, if you're inclined to try and debug problems you'll find the DPRINTF(), DPRINT_NV(), DPRINT_NR() and DPRINT_VT() macros useful. See the bottom of the config_ast.h.in module.

krader1961 commented 5 years ago

@sneyx123 FWIW, I also tried to install fish which is my preferred interactive shell. It builds but fails to run with lots of errors starting with being unable to figure out the CWD. And that is a modern shell under active development. I was surprised it fails to work on Solaris (or in my case OpenIndiana). In light of that I'm not too surprised the current ksh project also has problems on that platform.

krader1961 commented 5 years ago

There is some funny "{" "}" nesting if the "#ifdef" is true :

Note that we do not define _ast_fltmax_double so the block of code predicated on that symbol is always included. Which is why it compiles despite the unbalanced braces if it was defined. Before switching to Meson that was defined by a nmake/IFFE test. In the 2012-08-01-master branch look at ./src/lib/libast/features/common to see how it was optionally defined. Just glancing at that config time test I can't figure out what it is doing.

sneyx1234 commented 5 years ago

I386:

# /tmp/ast_fltmax_double
#define _ast_flt4_t            float
#define _ast_flt8_t            double
#define _ast_fltmax_t           _ast_flt8_t
#define _ast_fltmax_double              1

sneyx1234 commented 5 years ago

After preprocessor macro expansion it looks like this on "i386":

...
static struct
{
        char*   name;
        int     size;
} flt_type[] =
{
        "float",        sizeof(float),
        "double",       sizeof(double),
};

int
main()
{
        register int    t;
        register int    m = 1;

        for (t = 0; t <  ( sizeof ( flt_type ) / sizeof ( flt_type [ 0 ] ) ); t++)
        {
                while (t < ( ( sizeof ( flt_type ) / sizeof ( flt_type [ 0 ] ) ) - 1) && flt_type[t].size == flt_type[t + 1].size)
                        t++;
                m = flt_type[t].size;
                printf("#define _ast_flt%d_t            %s\n", flt_type[t].size, flt_type[t].name);
        }
        printf("#define _ast_fltmax_t           _ast_flt%d_t\n", m);
        if (m == sizeof(double))
                printf("#define _ast_fltmax_double              1\n");
        return 0;
}

#ident "acomp: Sun C 5.12 SunOS_i386 Patch 148918-09 2014/09/10"

sneyx1234 commented 5 years ago

SPARC:

#define _ast_flt4_t            float
#define _ast_flt8_t            double
#define _ast_fltmax_t           _ast_flt8_t
#define _ast_fltmax_double             1

sneyx1234 commented 5 years ago

So sizeof(double) from global data scope is always sizeof(double) in function scope?

N.B. A double passed as a function argument must occupy a stack element of the expected size; the size of the stack elements has to match regardless of the "type" ... otherwise varargs will fail.
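
To make the varargs concern concrete, here is a small sketch built on the C default argument promotions: a `float` passed through `...` arrives as a `double`, while a `long double` is passed unchanged and must be read back as `long double`. The function names are made up for illustration.

```c
#include <assert.h>
#include <stdarg.h>

/* Sums count variadic arguments, each read back as double.
 * float arguments are promoted to double by the caller. */
static double sum_as_double(int count, ...) {
    va_list args;
    double total = 0.0;

    assert(count >= 0);
    va_start(args, count);
    for (int i = 0; i < count; i++) {
        total += va_arg(args, double);
    }
    va_end(args);
    return total;
}

/* Same idea, but every argument must really be a long double;
 * reading a double here would be undefined behavior. */
static long double sum_as_long_double(int count, ...) {
    va_list args;
    long double total = 0.0L;

    assert(count >= 0);
    va_start(args, count);
    for (int i = 0; i < count; i++) {
        total += va_arg(args, long double);
    }
    va_end(args);
    return total;
}
```

The type named in `va_arg` has to agree with what the caller actually pushed, which is why a wrong `_ast_fltmax_double` setting can plausibly corrupt formatted output.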

krader1961 commented 5 years ago

So sizeof(double) from global data scope is always sizeof(double) in function scope?

Yes, that has to be true given the C language specification. What you seem to be asking is why we don't #define _ast_fltmax_double 1 on your platform. The answer is that it wasn't true on any of the Linux or BSD platforms we were using to test changes when we switched from Nmake+IFFE to Meson+Ninja. And I still don't understand why _ast_fltmax_double would be one rather than zero as we currently assume. It may be that we need to reinstate that feature test. But someone will need to explain the point of that feature test. It may be that symbol _ast_fltmax_double is no longer relevant; even on Solaris. On the other hand it may be that it is still relevant, even on Linux and BSD platforms, and the omission of that feature test when switching to Meson was a mistake.

sneyx1234 commented 5 years ago

Can you see the light at BBI_SOL11_4 (this line was added for this snippet) below:

#define _BYTESEX_H

#include <string.h>
#include <sys/types.h>

#if !N || !_STD_
#undef  _typ_long_double
#endif

#define elementsof(x)   (sizeof(x)/sizeof(x[0]))

#define _typ_long_double        /* BBI_SOL11_4 reveals the secret */

static struct
{
        char*   name;
        int     size;
} flt_type[] =
{
        "float",        sizeof(float),
        "double",       sizeof(double),
#ifdef _typ_long_double
        "long double",  sizeof(long double),
#endif
};

int
main()
{
        register int    t;
        register int    m = 1;

#ifdef _typ_long_double
        long double     p;
        char            buf[64];

        if (flt_type[elementsof(flt_type)-1].size <= sizeof(double))
                return 1;
        p = 1.12345E-55;
        sprintf(buf, "%1.5LE", p);
        if (strcmp(buf, "1.12345E-55"))
                return 1;
#endif
        for (t = 0; t < elementsof(flt_type); t++)
        {
                while (t < (elementsof(flt_type) - 1) && flt_type[t].size == flt_type[t + 1].size)
                        t++;
                m = flt_type[t].size;
                printf("#define _ast_flt%d_t            %s\n", flt_type[t].size, flt_type[t].name);
        }
        printf("#define _ast_fltmax_t           _ast_flt%d_t\n", m);
        if (m == sizeof(double))
                printf("#define _ast_fltmax_double              1\n");
        return 0;
}

Output is:

# ./ast_fltmax_double
#define _ast_flt4_t            float
#define _ast_flt8_t            double
#define _ast_flt12_t            long double
#define _ast_fltmax_t           _ast_flt12_t

sneyx1234 commented 5 years ago

So if the "processor architecture" supports "long double" the "_ast_fltmax_double" is not defined.
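
A condensed sketch of that decision, using the same two checks as the probe above (size comparison plus a formatted round trip). The function names are hypothetical, and this is only one reading of what the feature test intends.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 when long double is wider than double and the C library
 * formats it correctly with "%LE"; returns 0 otherwise. */
static int long_double_is_wider(void) {
    long double p = 1.12345E-55L;
    char buf[64];
    int n;

    if (sizeof(long double) <= sizeof(double)) {
        return 0;
    }
    n = snprintf(buf, sizeof(buf), "%1.5LE", p);
    assert(n > 0 && (size_t)n < sizeof(buf));
    return strcmp(buf, "1.12345E-55") == 0;
}

/* Hypothetical stand-in for the value _ast_fltmax_double would take. */
static int fltmax_is_double(void) {
    return !long_double_is_wider();
}
```

On a platform where `fltmax_is_double()` returns 0, the widest floating type is `long double` and the symbol would stay undefined, which matches the output shown above.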

sneyx1234 commented 5 years ago

But in this rocket science the behavior of the library containing "sprintf", and its ability to convert long double via "%...LE", also has to be taken into account ... Also the "compiler" ... and whether it is the second stage/pass, if "sprintf" was/is part of something ...

sneyx1234 commented 5 years ago

So next question is what is the expression for "_typ_long_double" in 2012?

sneyx1234 commented 5 years ago

First line of "./arch/sol11.i386/src/lib/libast/FEATURE/common" says:

/* : : generated from /var/tmp/repos_3s/trunk/xops/ASTOPEN/GIT/ast/src/lib/libast/features/common by iffe version 2012-07-17 : : */

Found name references for "_typ_long_double" are:

# find "."  -type d -name '.svn' -prune -o -type f -print | xargs grep _typ_long_double
./arch/sol11.i386/src/lib/libast/FEATURE/common:#define _typ_long_double        1       /* long double is a type */
./arch/sol11.i386/src/lib/libast/ast_common.h:#define _typ_long_double  1       /* long double is a type */
./arch/sol11.i386/include/ast/ast_common.h:#define _typ_long_double     1       /* long double is a type */
./src/cmd/tests/sfio/tlongdouble.c:#if _typ_long_double
./src/cmd/tests/sfio/tlongdouble.c:#if _typ_long_double
./src/cmd/builtin/od.c:#if _typ_long_double
./src/cmd/builtin/od.c:#if _typ_long_double
./src/cmd/builtin/od.c:#if _typ_long_double
./src/cmd/builtin/od.c:#if _typ_long_double
./src/cmd/builtin/od.c:#if _typ_long_double
./src/cmd/builtin/od.c:#if _typ_long_double
./src/cmd/ksh93/features/math.sh:               case $_typ_long_double in
./src/cmd/ksh93/features/math.sh:                                       case $_typ_long_double in
./src/cmd/ksh93/features/math.sh:               local=$_typ_long_double
./src/lib/libast/astsa/mkast_sa:        #ifdef _typ_long_double
./src/lib/libast/features/map.c:        printf("#undef  _typ_long_double\n");
./src/lib/libast/features/common:       #ifdef _typ_long_double
./src/lib/libast/features/common:       #undef  _typ_long_double
./src/lib/libast/features/common:       #ifdef _typ_long_double
./src/lib/libast/features/common:       #ifdef _typ_long_double

sneyx1234 commented 5 years ago

FWIW, I just spent four hours installing OpenIndiana and trying to install all the prerequisites to build ksh with meson. It fails with this cryptic error:

meson.build:1:0: ERROR: Unknown linker(s): [['/usr/local/bin/gcc-ar']]

In my past 38 years with Solaris I have never used OpenIndiana, only its predecessor OpenSolaris, aka Solaris 11.0 ... In the worst case you must start by bootstrapping a compiler ... but I would expect that some of the latest releases contain a working development environment. "ar" is not the linker ... it should be "ld" ... there were always different pairings of cc+as+ld+ar in use. I can give it a try in a VirtualBox with the "OpenIndiana Hipster 2018.10 Live DVD (64-bit x86)" image.

sneyx1234 commented 5 years ago

FYI: Adjusted from -D_ast_fltmax_double=1 to -D_ast_fltmax_double=0 because this output is expected:

# print $(( (3.1415926/2.0) ))
1.5707963

and not:

# print $(( (3.1415926/2.0) ))
0

This also means that the definition can be dropped from meson.build altogether.

Also improved (203 ==>> 256):

cd build
meson test --setup=malloc
cd "${OLDPWD}"

...
Ok:                  256
Expected Fail:         0
Fail:                 18
Unexpected Pass:       0
Skipped:               0
Timeout:               2
...

krader1961 commented 5 years ago

@sneyx123 As I predicted in a previous comment it is clear that lots of unit test failures are due to quirks of the Solaris platform. Such as returning an errno that is equivalent to the string "not supported" rather than "unlimited". Which is unlike the BSD and Linux platforms. The Solaris behavior is not wrong. But it is different from the other distros we have been testing on. So this is a test that should be skipped on Solaris/SunOS.

sneyx1234 commented 5 years ago

Please keep in mind ... AT&T defined what the behavior of UNIX is and what can call itself UNIX ... and the company merger path was AT&T --> Sun Microsystems --> Oracle? Solaris was the effort to bring SunOS 4.1.1 (BSD derived) and AT&T System V together as Solaris 2.1 (X86) ...

sneyx1234 commented 5 years ago

Also, wasn't Mr. S. R. Bourne an employee of Sun after the merger? Isn't ksh a fork of sh?

sneyx1234 commented 5 years ago

Bell Labs --> Lucent -> Alcatel (Standard Elektrik Lorenz SEL)

krader1961 commented 5 years ago

@sneyx1234 I was born in 1961 and fell in love with UNIX sometime around 1982. I am not a lawyer. For all I know "UNIX" is a trademark owned by AT&T. But the UNIX/SUS/POSIX standards we care about today are not controlled by AT&T. Take your diatribes about intellectual property rights somewhere else.

sneyx1234 commented 5 years ago

I'm not a fan of kicking a statue off its pedestal.

sneyx1234 commented 5 years ago

In 1982 I paid $50 for my first Sun 3/50 ... before 1982 I built VME H/W with a complete S5R4V1 on a Motorola 68010 with 7 parallel MMUs ... compiled the compiler ... created a proto filesystem for the first boot ... there was a "sh" included :-)

ormaaj commented 5 years ago

@sneyx1234 For all I know "UNIX" is a trademark owned by AT&T. But the UNIX/SUS/POSIX standards we care about today are not controlled by AT&T.

Pretty sure AT&T let The Open Group have the trademark too.

And anyway AT&T has no interest in that failed UNIX experiment. I heard they're almost finished with a way better replacement. So exciting

sneyx1234 commented 5 years ago

Plan 9 or Plan 2019 ?