OCamlPro / gnucobol

A clone of the sourceforge GnuCOBOL compiler from COBOL to C.
https://get-superbol.com
GNU Lesser General Public License v3.0
16 stars 20 forks source link

Move static variables to local state structures in libcob/strings.c #137

Closed engboris closed 1 week ago

engboris commented 5 months ago

Answer to feature request #448 on Sourceforge.

Problem

The handling of the INSPECT, STRING and UNSTRING statements in libcob/strings.c is an obvious source of unthread-safe code because it relies on a lot of static variables.

The idea was to:

Comments about the modifications

In each generated C program, state structures are defined as local variables and used by the functions for INSPECT, STRING and UNSTRING statements.

Some "local" variables where static without any serious reason: figurative_ptr, figurative_size, alpha_fld and str_cob_low. The static keyword has been removed.

:warning: The cobglobptr variable is still static. I am not sure how to deal with it. What I have in mind is to turn it into a function argument and pass it to function calls in the generated C code (where it is defined as a local variable in .c.l.h files). But this probably goes outside the scope of the intial problem.

Testing

I added a new test suite tests/testsuite.src/backcomp_strings.at which contains the C code (compiled with the compiler before my modifications) for each tests of tests/testsuite.src/run_misc.at dealing with the INSPECT, STRING and UNSTRING statements.

All tests of the new test suite passes with both the version of the compiler before and after my modifications.

GitMensch commented 4 months ago

⚠️ The cobglobptr variable is still static. I am not sure how to deal with it. What I have in mind is to turn it into a function argument and pass it to function calls in the generated C code (where it is defined as a local variable in .c.l.h files). But this probably goes outside the scope of the intial problem.

Correct. That would also be a much more global change with extensive tests of actual threading needed. Postponed to later. Which variables in cobglobptr do you consider important for thread safety? Not all, right?

Some "local" variables where static without any serious reason: figurative_ptr, figurative_size, alpha_fld and str_cob_low. The static keyword has been removed.

str_cob_low -> that's an unchanged field as it is now a static version makes most sense (may change with #136 though); for the others - the previous reason was a performance consideration. Especially for a big figurative_size.

All tests of the new test suite passes with both the version of the compiler before and after my modifications.

Sounds good. Just have seen what I've did when changing existing ABI last time: https://github.com/OCamlPro/gnucobol/blob/gcos4gnucobol-3.x/HACKING#L102 - so please have GnuCOBOL 3.1 installed on the system, then run make -C tests/cobol85 test-local-compat -j6 (ideally again with a 2.2 installed); as noted in HACKING, this can also be tested with a build-tree only.

...now on to the actual review.

engboris commented 4 months ago

str_cob_low -> that's an unchanged field as it is now a static version makes most sense (may change with https://github.com/OCamlPro/gnucobol/pull/136 though); for the others - the previous reason was a performance consideration. Especially for a big figurative_size.

Do you mean these variables should remain static?

GitMensch commented 4 months ago

Do you mean these variables should remain static?

Not necessary, that's just something to watch out for. I'm nearly done with the first review (there will be another one, after your next push, but that already gives you something to work on) - overall looks good.

engboris commented 3 months ago

I did most fixes. However:

Hence, I had to compile COBOL programs with newer version of the compiler (>2.2), some being compiled with GnuCOBOL 3.2.

GitMensch commented 3 months ago

The backward compatibility test did not work for the following tests

  • INSPECT numeric signed
  • INSPECT TALLYING REPLACING BEFORE and AFTER
  • STRING / UNSTRING [NOT] ON OVERFLOW
  • INSPECT TRAILING
  • EXAMINE REPLACING

What do you mean by this? EXAMINE may not exists in 2,2, then a generation with 3.1 to get the "C test code" would be necessary, but the others should compile since a long; do you say that the code does not link/work any more (=you found broken backward compatibility)?

engboris commented 3 months ago

The backward compatibility test did not work for the following tests

  • INSPECT numeric signed
  • INSPECT TALLYING REPLACING BEFORE and AFTER
  • STRING / UNSTRING [NOT] ON OVERFLOW
  • INSPECT TRAILING
  • EXAMINE REPLACING

What do you mean by this? EXAMINE may not exists in 2,2, then a generation with 3.1 to get the "C test code" would be necessary, but the others should compile since a long; do you say that the code does not link/work any more (=you found broken backward compatibility)?

If I compile INSPECT numeric signed (from run_misc.at) with GnuCOBOL 2.2, I get the following source code :

/* Generated by           cobc 2.2.0 */
/* Generated from         prog.cob */
/* Generated at           avril 23 2024 18:49:00 */
/* GnuCOBOL build date    Apr 10 2024 16:39:16 */
/* GnuCOBOL package date  Sep 06 2017 18:45:29 UTC */
/* Compile command        /opt/gnucobol/gnucobol-2.2/bin/cobc -Cx -fno-computed-goto prog.cob */

#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>
#include <string.h>
#include <math.h>
#define  COB_KEYWORD_INLINE __inline
#include <libcob.h>

#define  COB_SOURCE_FILE        "prog.cob"
#define  COB_PACKAGE_VERSION        "2.2"
#define  COB_PATCH_LEVEL        0
#define  COB_MODULE_FORMATTED_DATE  "avril 23 2024 18:49:00"
#define  COB_MODULE_DATE        20240423
#define  COB_MODULE_TIME        184900

/* Global variables */
/* Generated by           cobc 2.2.0 */
/* Generated from         prog.cob */
/* Generated at           avril 23 2024 18:49:00 */
/* GnuCOBOL build date    Apr 10 2024 16:39:16 */
/* GnuCOBOL package date  Sep 06 2017 18:45:29 UTC */
/* Compile command        /opt/gnucobol/gnucobol-2.2/bin/cobc -Cx -fno-computed-goto prog.cob */

/* Module path */
static const char       *cob_module_path = NULL;

/* Number of call parameters */
static int      cob_call_params = 0;

/* Attributes */

static const cob_field_attr a_1 =   {0x10,   1,   0, 0x1000, NULL};
static const cob_field_attr a_2 =   {0x11,   9,   0, 0x0041, NULL};
static const cob_field_attr a_3 =   {0x10,   2,   0, 0x0007, NULL};
static const cob_field_attr a_4 =   {0x21,   0,   0, 0x1000, NULL};
static const cob_field_attr a_5 =   {0x10,   2,   0, 0x0003, NULL};
static const cob_field_attr a_6 =   {0x10,   2,   0, 0x0001, NULL};

/* Constants */
static const cob_field c_1  = {1, (cob_u8_ptr)"0", &a_1};
static const cob_field c_2  = {1, (cob_u8_ptr)"1", &a_4};
static const cob_field c_3  = {5, (cob_u8_ptr)"T1 - ", &a_4};
static const cob_field c_4  = {5, (cob_u8_ptr)"T2 - ", &a_4};
static const cob_field c_5  = {5, (cob_u8_ptr)"T3 - ", &a_4};
static const cob_field c_6  = {5, (cob_u8_ptr)"T4 - ", &a_4};
static const cob_field c_7  = {5, (cob_u8_ptr)"T5 - ", &a_4};
static const cob_field c_8  = {5, (cob_u8_ptr)"T6 - ", &a_4};
static const cob_field c_9  = {5, (cob_u8_ptr)"T7 - ", &a_4};
static const cob_field c_10 = {5, (cob_u8_ptr)"T8 - ", &a_4};
static const cob_field c_11 = {3, (cob_u8_ptr)"123", &a_4};
static const cob_field c_12 = {3, (cob_u8_ptr)"234", &a_4};
static const cob_field c_13 = {5, (cob_u8_ptr)"C1 - ", &a_4};
static const cob_field c_14 = {3, (cob_u8_ptr)"-22", &a_3};
static const cob_field c_15 = {5, (cob_u8_ptr)"C2 - ", &a_4};
static const cob_field c_16 = {3, (cob_u8_ptr)"+22", &a_3};
static const cob_field c_17 = {5, (cob_u8_ptr)"C3 - ", &a_4};
static const cob_field c_18 = {5, (cob_u8_ptr)"C4 - ", &a_4};
static const cob_field c_19 = {5, (cob_u8_ptr)"C5 - ", &a_4};
static const cob_field c_20 = {5, (cob_u8_ptr)"C6 - ", &a_4};
static const cob_field c_21 = {5, (cob_u8_ptr)"C7 - ", &a_4};
static const cob_field c_22 = {5, (cob_u8_ptr)"C8 - ", &a_4};
static const cob_field c_23 = {1, (cob_u8_ptr)"2", &a_4};
static const cob_field c_24 = {1, (cob_u8_ptr)"3", &a_4};
static const cob_field c_25 = {1, (cob_u8_ptr)"4", &a_4};
static const cob_field c_26 = {5, (cob_u8_ptr)"R1 - ", &a_4};
static const cob_field c_27 = {3, (cob_u8_ptr)"-33", &a_3};
static const cob_field c_28 = {5, (cob_u8_ptr)"R2 - ", &a_4};
static const cob_field c_29 = {3, (cob_u8_ptr)"+33", &a_3};
static const cob_field c_30 = {5, (cob_u8_ptr)"R3 - ", &a_4};
static const cob_field c_31 = {5, (cob_u8_ptr)"R4 - ", &a_4};
static const cob_field c_32 = {5, (cob_u8_ptr)"R5 - ", &a_4};
static const cob_field c_33 = {5, (cob_u8_ptr)"R6 - ", &a_4};
static const cob_field c_34 = {5, (cob_u8_ptr)"R7 - ", &a_4};
static const cob_field c_35 = {5, (cob_u8_ptr)"R8 - ", &a_4};
static const cob_field c_36 = {3, (cob_u8_ptr)"-11", &a_3};
static const cob_field c_37 = {3, (cob_u8_ptr)"+11", &a_3};

static COB_INLINE COB_A_INLINE int
cob_cmp_s32 (const void *p, const cob_s64_t n)
{
    int val;
    val = *(const int __unaligned *)p;
    return (val < n) ? -1 : (val > n);
}

    /* Decimal constants */

/* Function prototypes */

static int      prog ();
static int      prog_ (const int);

/* Main function */
int
main (int argc, char **argv)
{
  cob_init (argc, argv);
  cob_stop_run (prog ());
}

/* Functions */

/* PROGRAM-ID 'prog' */

/* ENTRY 'prog' */

static int
prog ()
{
  return prog_ (0);
}

static int
prog_ (const int entry)
{
  /* Program local variables */
  /* Generated by           cobc 2.2.0 */
    /* Generated from         prog.cob */
    /* Generated at           avril 23 2024 18:49:00 */
    /* GnuCOBOL build date    Apr 10 2024 16:39:16 */
    /* GnuCOBOL package date  Sep 06 2017 18:45:29 UTC */
    /* Compile command        /opt/gnucobol/gnucobol-2.2/bin/cobc -Cx -fno-computed-goto prog.cob */

    /* Program local variables for 'prog' */

    /* Module initialization indicator */
    static unsigned int initialized = 0;

    /* Module structure pointer */
    static cob_module   *module = NULL;

    /* Global variable pointer */
    cob_global      *cob_glob_ptr;

    /* Call parameters */
    cob_field       *cob_procedure_params[1];

    /* Perform frame stack */
    struct cob_frame    *frame_ptr;
    struct cob_frame    frame_stack[255];

    /* Data storage */
    static int  b_2;    /* RETURN-CODE */
    static cob_u8_t b_6[3] __attribute__((aligned));    /* SEPARATE1 */
    static cob_u8_t b_7[3] __attribute__((aligned));    /* SEPARATE2 */
    static cob_u8_t b_8[3] __attribute__((aligned));    /* TSEPARATE1 */
    static cob_u8_t b_9[3] __attribute__((aligned));    /* TSEPARATE2 */
    static cob_u8_t b_10[2] __attribute__((aligned));   /* NSEPARATE1 */
    static cob_u8_t b_11[2] __attribute__((aligned));   /* NSEPARATE2 */
    static cob_u8_t b_12[2] __attribute__((aligned));   /* TRAILING1 */
    static cob_u8_t b_13[2] __attribute__((aligned));   /* TRAILING2 */
    static cob_u8_t b_14[4] __attribute__((aligned));   /* CNT */

    /* End of data storage */

    /* Fields */
    static cob_field f_6    = {3, b_6, &a_3};   /* SEPARATE1 */
    static cob_field f_7    = {3, b_7, &a_3};   /* SEPARATE2 */
    static cob_field f_8    = {3, b_8, &a_5};   /* TSEPARATE1 */
    static cob_field f_9    = {3, b_9, &a_5};   /* TSEPARATE2 */
    static cob_field f_10   = {2, b_10, &a_6};  /* NSEPARATE1 */
    static cob_field f_11   = {2, b_11, &a_6};  /* NSEPARATE2 */
    static cob_field f_12   = {2, b_12, &a_6};  /* TRAILING1 */
    static cob_field f_13   = {2, b_13, &a_6};  /* TRAILING2 */
    static cob_field f_14   = {4, b_14, &a_2};  /* CNT */

    /* End of fields */

  /* Start of function code */

  /* CANCEL callback */
  if (unlikely(entry < 0)) {
    if (entry == -20)
        goto P_clear_decimal;
    goto P_cancel;
  }

  /* Check initialized, check module allocated, */
  /* set global pointer, */
  /* push module stack, save call parameter count */
  if (cob_module_global_enter (&module, &cob_glob_ptr, 0, entry, 0))
    return -1;

  /* Set address of module parameter list */
  module->cob_procedure_params = cob_procedure_params;

  /* Set frame stack pointer */
  frame_ptr = frame_stack;
  frame_ptr->perform_through = 0;

  /* Initialize rest of program */
  if (unlikely(initialized == 0)) {
    goto P_initialize;
  }
  P_ret_initialize:

  /* Increment module active */
  module->module_active++;

  /* Entry dispatch */
  goto l_2;

  /* PROCEDURE DIVISION */

  /* Line: 15        : Entry     prog                    : prog.cob */
  l_2:;

  /* Line: 15        : MOVE               : prog.cob */
  cob_move ((cob_field *)&c_1, &f_14);

  /* Line: 16        : INSPECT            : prog.cob */
  cob_inspect_init (&f_6, 0);
  cob_inspect_start ();
  cob_inspect_all (&f_14, (cob_field *)&c_2);
  cob_inspect_finish ();

  /* Line: 17        : IF                 : prog.cob */
  if (((int)cob_cmp_s32 (b_14, 2LL) != 0))
  {

    /* Line: 18        : DISPLAY            : prog.cob */
    cob_display (0, 1, 2, &c_3, &f_14);
  }

  /* Line: 19        : MOVE               : prog.cob */
  cob_move ((cob_field *)&c_1, &f_14);

  /* Line: 20        : INSPECT            : prog.cob */
  cob_inspect_init (&f_7, 0);
  cob_inspect_start ();
  cob_inspect_all (&f_14, (cob_field *)&c_2);
  cob_inspect_finish ();

  /* Line: 21        : IF                 : prog.cob */
  if (((int)cob_cmp_s32 (b_14, 2LL) != 0))
  {

    /* Line: 22        : DISPLAY            : prog.cob */
    cob_display (0, 1, 2, &c_4, &f_14);
  }

  /* Line: 23        : MOVE               : prog.cob */
  cob_move ((cob_field *)&c_1, &f_14);

  /* Line: 24        : INSPECT            : prog.cob */
  cob_inspect_init (&f_8, 0);
  cob_inspect_start ();
  cob_inspect_all (&f_14, (cob_field *)&c_2);
  cob_inspect_finish ();

  /* Line: 25        : IF                 : prog.cob */
  if (((int)cob_cmp_s32 (b_14, 2LL) != 0))
  {

    /* Line: 26        : DISPLAY            : prog.cob */
    cob_display (0, 1, 2, &c_5, &f_14);
  }

  /* Line: 27        : MOVE               : prog.cob */
  cob_move ((cob_field *)&c_1, &f_14);

  /* Line: 28        : INSPECT            : prog.cob */
  cob_inspect_init (&f_9, 0);
  cob_inspect_start ();
  cob_inspect_all (&f_14, (cob_field *)&c_2);
  cob_inspect_finish ();

  /* Line: 29        : IF                 : prog.cob */
  if (((int)cob_cmp_s32 (b_14, 2LL) != 0))
  {

    /* Line: 30        : DISPLAY            : prog.cob */
    cob_display (0, 1, 2, &c_6, &f_14);
  }

  /* Line: 31        : MOVE               : prog.cob */
  cob_move ((cob_field *)&c_1, &f_14);

  /* Line: 32        : INSPECT            : prog.cob */
  cob_inspect_init (&f_10, 0);
  cob_inspect_start ();
  cob_inspect_all (&f_14, (cob_field *)&c_2);
  cob_inspect_finish ();

  /* Line: 33        : IF                 : prog.cob */
  if (((int)cob_cmp_s32 (b_14, 2LL) != 0))
  {

    /* Line: 34        : DISPLAY            : prog.cob */
    cob_display (0, 1, 2, &c_7, &f_14);
  }

  /* Line: 35        : MOVE               : prog.cob */
  cob_move ((cob_field *)&c_1, &f_14);

  /* Line: 36        : INSPECT            : prog.cob */
  cob_inspect_init (&f_11, 0);
  cob_inspect_start ();
  cob_inspect_all (&f_14, (cob_field *)&c_2);
  cob_inspect_finish ();

  /* Line: 37        : IF                 : prog.cob */
  if (((int)cob_cmp_s32 (b_14, 2LL) != 0))
  {

    /* Line: 38        : DISPLAY            : prog.cob */
    cob_display (0, 1, 2, &c_8, &f_14);
  }

  /* Line: 39        : MOVE               : prog.cob */
  cob_move ((cob_field *)&c_1, &f_14);

  /* Line: 40        : INSPECT            : prog.cob */
  cob_inspect_init (&f_12, 0);
  cob_inspect_start ();
  cob_inspect_all (&f_14, (cob_field *)&c_2);
  cob_inspect_finish ();

  /* Line: 41        : IF                 : prog.cob */
  if (((int)cob_cmp_s32 (b_14, 2LL) != 0))
  {

    /* Line: 42        : DISPLAY            : prog.cob */
    cob_display (0, 1, 2, &c_9, &f_14);
  }

  /* Line: 43        : MOVE               : prog.cob */
  cob_move ((cob_field *)&c_1, &f_14);

  /* Line: 44        : INSPECT            : prog.cob */
  cob_inspect_init (&f_13, 0);
  cob_inspect_start ();
  cob_inspect_all (&f_14, (cob_field *)&c_2);
  cob_inspect_finish ();

  /* Line: 45        : IF                 : prog.cob */
  if (((int)cob_cmp_s32 (b_14, 2LL) != 0))
  {

    /* Line: 46        : DISPLAY            : prog.cob */
    cob_display (0, 1, 2, &c_10, &f_14);
  }

  /* Line: 48        : INSPECT            : prog.cob */
  cob_inspect_init (&f_6, 0);
  cob_inspect_start ();
  cob_inspect_converting ((cob_field *)&c_11, (cob_field *)&c_12);
  cob_inspect_finish ();

  /* Line: 49        : IF                 : prog.cob */
  if (((int)cob_cmp_llint (&f_6, -22LL) != 0))
  {

    /* Line: 50        : DISPLAY            : prog.cob */
    cob_display (0, 1, 2, &c_13, &f_6);

    /* Line: 51        : MOVE               : prog.cob */
    cob_move ((cob_field *)&c_14, &f_6);
  }

  /* Line: 52        : INSPECT            : prog.cob */
  cob_inspect_init (&f_7, 0);
  cob_inspect_start ();
  cob_inspect_converting ((cob_field *)&c_11, (cob_field *)&c_12);
  cob_inspect_finish ();

  /* Line: 53        : IF                 : prog.cob */
  if (((int)cob_cmp_llint (&f_7, 22LL) != 0))
  {

    /* Line: 54        : DISPLAY            : prog.cob */
    cob_display (0, 1, 2, &c_15, &f_7);

    /* Line: 55        : MOVE               : prog.cob */
    cob_move ((cob_field *)&c_16, &f_7);
  }

  /* Line: 56        : INSPECT            : prog.cob */
  cob_inspect_init (&f_8, 0);
  cob_inspect_start ();
  cob_inspect_converting ((cob_field *)&c_11, (cob_field *)&c_12);
  cob_inspect_finish ();

  /* Line: 57        : IF                 : prog.cob */
  if (((int)cob_cmp_llint (&f_8, -22LL) != 0))
  {

    /* Line: 58        : DISPLAY            : prog.cob */
    cob_display (0, 1, 2, &c_17, &f_8);

    /* Line: 59        : MOVE               : prog.cob */
    cob_move ((cob_field *)&c_14, &f_8);
  }

  /* Line: 60        : INSPECT            : prog.cob */
  cob_inspect_init (&f_9, 0);
  cob_inspect_start ();
  cob_inspect_converting ((cob_field *)&c_11, (cob_field *)&c_12);
  cob_inspect_finish ();

  /* Line: 61        : IF                 : prog.cob */
  if (((int)cob_cmp_llint (&f_9, 22LL) != 0))
  {

    /* Line: 62        : DISPLAY            : prog.cob */
    cob_display (0, 1, 2, &c_18, &f_9);

    /* Line: 63        : MOVE               : prog.cob */
    cob_move ((cob_field *)&c_16, &f_9);
  }

  /* Line: 64        : INSPECT            : prog.cob */
  cob_inspect_init (&f_10, 0);
  cob_inspect_start ();
  cob_inspect_converting ((cob_field *)&c_11, (cob_field *)&c_12);
  cob_inspect_finish ();

  /* Line: 65        : IF                 : prog.cob */
  if (((int)cob_cmp_numdisp (b_10, 2, -22LL, 1) != 0))
  {

    /* Line: 66        : DISPLAY            : prog.cob */
    cob_display (0, 1, 2, &c_19, &f_10);

    /* Line: 67        : MOVE               : prog.cob */
    memcpy (b_10, "2r", 2);
  }

  /* Line: 68        : INSPECT            : prog.cob */
  cob_inspect_init (&f_11, 0);
  cob_inspect_start ();
  cob_inspect_converting ((cob_field *)&c_11, (cob_field *)&c_12);
  cob_inspect_finish ();

  /* Line: 69        : IF                 : prog.cob */
  if (((int)cob_cmp_numdisp (b_11, 2, 22LL, 1) != 0))
  {

    /* Line: 70        : DISPLAY            : prog.cob */
    cob_display (0, 1, 2, &c_20, &f_11);

    /* Line: 71        : MOVE               : prog.cob */
    memset (b_11, 50, 2);
  }

  /* Line: 72        : INSPECT            : prog.cob */
  cob_inspect_init (&f_12, 0);
  cob_inspect_start ();
  cob_inspect_converting ((cob_field *)&c_11, (cob_field *)&c_12);
  cob_inspect_finish ();

  /* Line: 73        : IF                 : prog.cob */
  if (((int)cob_cmp_numdisp (b_12, 2, -22LL, 1) != 0))
  {

    /* Line: 74        : DISPLAY            : prog.cob */
    cob_display (0, 1, 2, &c_21, &f_12);

    /* Line: 75        : MOVE               : prog.cob */
    memcpy (b_12, "2r", 2);
  }

  /* Line: 76        : INSPECT            : prog.cob */
  cob_inspect_init (&f_13, 0);
  cob_inspect_start ();
  cob_inspect_converting ((cob_field *)&c_11, (cob_field *)&c_12);
  cob_inspect_finish ();

  /* Line: 77        : IF                 : prog.cob */
  if (((int)cob_cmp_numdisp (b_13, 2, 22LL, 1) != 0))
  {

    /* Line: 78        : DISPLAY            : prog.cob */
    cob_display (0, 1, 2, &c_22, &f_13);

    /* Line: 79        : MOVE               : prog.cob */
    memset (b_13, 50, 2);
  }

  /* Line: 81        : INSPECT            : prog.cob */
  cob_inspect_init (&f_6, 1);
  cob_inspect_start ();
  cob_inspect_all ((cob_field *)&c_23, (cob_field *)&c_2);
  cob_inspect_start ();
  cob_inspect_all ((cob_field *)&c_24, (cob_field *)&c_23);
  cob_inspect_start ();
  cob_inspect_all ((cob_field *)&c_25, (cob_field *)&c_24);
  cob_inspect_finish ();

  /* Line: 84        : IF                 : prog.cob */
  if (((int)cob_cmp_llint (&f_6, -33LL) != 0))
  {

    /* Line: 85        : DISPLAY            : prog.cob */
    cob_display (0, 1, 2, &c_26, &f_6);

    /* Line: 86        : MOVE               : prog.cob */
    cob_move ((cob_field *)&c_27, &f_6);
  }

  /* Line: 87        : INSPECT            : prog.cob */
  cob_inspect_init (&f_7, 1);
  cob_inspect_start ();
  cob_inspect_all ((cob_field *)&c_23, (cob_field *)&c_2);
  cob_inspect_start ();
  cob_inspect_all ((cob_field *)&c_24, (cob_field *)&c_23);
  cob_inspect_start ();
  cob_inspect_all ((cob_field *)&c_25, (cob_field *)&c_24);
  cob_inspect_finish ();

  /* Line: 90        : IF                 : prog.cob */
  if (((int)cob_cmp_llint (&f_7, 33LL) != 0))
  {

    /* Line: 91        : DISPLAY            : prog.cob */
    cob_display (0, 1, 2, &c_28, &f_7);

    /* Line: 92        : MOVE               : prog.cob */
    cob_move ((cob_field *)&c_29, &f_7);
  }

  /* Line: 93        : INSPECT            : prog.cob */
  cob_inspect_init (&f_8, 1);
  cob_inspect_start ();
  cob_inspect_all ((cob_field *)&c_23, (cob_field *)&c_2);
  cob_inspect_start ();
  cob_inspect_all ((cob_field *)&c_24, (cob_field *)&c_23);
  cob_inspect_start ();
  cob_inspect_all ((cob_field *)&c_25, (cob_field *)&c_24);
  cob_inspect_finish ();

  /* Line: 96        : IF                 : prog.cob */
  if (((int)cob_cmp_llint (&f_8, -33LL) != 0))
  {

    /* Line: 97        : DISPLAY            : prog.cob */
    cob_display (0, 1, 2, &c_30, &f_8);

    /* Line: 98        : MOVE               : prog.cob */
    cob_move ((cob_field *)&c_27, &f_8);
  }

  /* Line: 99        : INSPECT            : prog.cob */
  cob_inspect_init (&f_9, 1);
  cob_inspect_start ();
  cob_inspect_all ((cob_field *)&c_23, (cob_field *)&c_2);
  cob_inspect_start ();
  cob_inspect_all ((cob_field *)&c_24, (cob_field *)&c_23);
  cob_inspect_start ();
  cob_inspect_all ((cob_field *)&c_25, (cob_field *)&c_24);
  cob_inspect_finish ();

  /* Line: 102       : IF                 : prog.cob */
  if (((int)cob_cmp_llint (&f_9, 33LL) != 0))
  {

    /* Line: 103       : DISPLAY            : prog.cob */
    cob_display (0, 1, 2, &c_31, &f_9);

    /* Line: 104       : MOVE               : prog.cob */
    cob_move ((cob_field *)&c_29, &f_9);
  }

  /* Line: 105       : INSPECT            : prog.cob */
  cob_inspect_init (&f_10, 1);
  cob_inspect_start ();
  cob_inspect_all ((cob_field *)&c_23, (cob_field *)&c_2);
  cob_inspect_start ();
  cob_inspect_all ((cob_field *)&c_24, (cob_field *)&c_23);
  cob_inspect_start ();
  cob_inspect_all ((cob_field *)&c_25, (cob_field *)&c_24);
  cob_inspect_finish ();

  /* Line: 108       : IF                 : prog.cob */
  if (((int)cob_cmp_numdisp (b_10, 2, -33LL, 1) != 0))
  {

    /* Line: 109       : DISPLAY            : prog.cob */
    cob_display (0, 1, 2, &c_32, &f_10);

    /* Line: 110       : MOVE               : prog.cob */
    memcpy (b_10, "3s", 2);
  }

  /* Line: 111       : INSPECT            : prog.cob */
  cob_inspect_init (&f_11, 1);
  cob_inspect_start ();
  cob_inspect_all ((cob_field *)&c_23, (cob_field *)&c_2);
  cob_inspect_start ();
  cob_inspect_all ((cob_field *)&c_24, (cob_field *)&c_23);
  cob_inspect_start ();
  cob_inspect_all ((cob_field *)&c_25, (cob_field *)&c_24);
  cob_inspect_finish ();

  /* Line: 114       : IF                 : prog.cob */
  if (((int)cob_cmp_numdisp (b_11, 2, 33LL, 1) != 0))
  {

    /* Line: 115       : DISPLAY            : prog.cob */
    cob_display (0, 1, 2, &c_33, &f_11);

    /* Line: 116       : MOVE               : prog.cob */
    memset (b_11, 51, 2);
  }

  /* Line: 117       : INSPECT            : prog.cob */
  cob_inspect_init (&f_12, 1);
  cob_inspect_start ();
  cob_inspect_all ((cob_field *)&c_23, (cob_field *)&c_2);
  cob_inspect_start ();
  cob_inspect_all ((cob_field *)&c_24, (cob_field *)&c_23);
  cob_inspect_start ();
  cob_inspect_all ((cob_field *)&c_25, (cob_field *)&c_24);
  cob_inspect_finish ();

  /* Line: 120       : IF                 : prog.cob */
  if (((int)cob_cmp_numdisp (b_12, 2, -33LL, 1) != 0))
  {

    /* Line: 121       : DISPLAY            : prog.cob */
    cob_display (0, 1, 2, &c_34, &f_12);

    /* Line: 122       : MOVE               : prog.cob */
    memcpy (b_12, "3s", 2);
  }

  /* Line: 123       : INSPECT            : prog.cob */
  cob_inspect_init (&f_13, 1);
  cob_inspect_start ();
  cob_inspect_all ((cob_field *)&c_23, (cob_field *)&c_2);
  cob_inspect_start ();
  cob_inspect_all ((cob_field *)&c_24, (cob_field *)&c_23);
  cob_inspect_start ();
  cob_inspect_all ((cob_field *)&c_25, (cob_field *)&c_24);
  cob_inspect_finish ();

  /* Line: 126       : IF                 : prog.cob */
  if (((int)cob_cmp_numdisp (b_13, 2, 33LL, 1) != 0))
  {

    /* Line: 127       : DISPLAY            : prog.cob */
    cob_display (0, 1, 2, &c_35, &f_13);

    /* Line: 128       : MOVE               : prog.cob */
    memset (b_13, 51, 2);
  }

  /* Line: 130       : STOP RUN           : prog.cob */
  cob_stop_run (b_2);

  /* Program exit */

  /* Decrement module active count */
  if (module->module_active) {
    module->module_active--;
  }

  /* Pop module stack */
  cob_module_leave (module);

  /* Program return */
  return b_2;

  /* Frame stack jump table */
  P_switch:
   cob_fatal_error (COB_FERROR_CODEGEN);

  /* Program initialization */
  P_initialize:

  cob_check_version (COB_SOURCE_FILE, COB_PACKAGE_VERSION, COB_PATCH_LEVEL);

  cob_module_path = cob_glob_ptr->cob_main_argv0;

  /* Initialize module structure */
  module->module_name = "prog";
  module->module_formatted_date = COB_MODULE_FORMATTED_DATE;
  module->module_source = COB_SOURCE_FILE;
  module->module_entry.funcptr = (void *(*)())prog;
  module->module_cancel.funcptr = (void *(*)())prog_;
  module->collating_sequence = NULL;
  module->crt_status = NULL;
  module->cursor_pos = NULL;
  module->module_ref_count = NULL;
  module->module_path = &cob_module_path;
  module->module_active = 0;
  module->module_date = COB_MODULE_DATE;
  module->module_time = COB_MODULE_TIME;
  module->module_type = 0;
  module->module_param_cnt = 0;
  module->module_returning = 0;
  module->ebcdic_sign = 0;
  module->decimal_point = '.';
  module->currency_symbol = '$';
  module->numeric_separator = ',';
  module->flag_filename_mapping = 1;
  module->flag_binary_truncate = 1;
  module->flag_pretty_display = 1;
  module->flag_host_sign = 0;
  module->flag_no_phys_canc = 1;
  module->flag_main = 1;
  module->flag_fold_call = 0;
  module->flag_exit_program = 0;

  /* Initialize cancel callback */
  cob_set_cancel (module);

  /* Initialize WORKING-STORAGE */
  b_2 = 0;
  cob_move ((cob_field *)&c_36, &f_6);
  cob_move ((cob_field *)&c_37, &f_7);
  cob_move ((cob_field *)&c_36, &f_8);
  cob_move ((cob_field *)&c_37, &f_9);
  memcpy (b_10, "1q", 2);
  memset (b_11, 49, 2);
  memcpy (b_12, "1q", 2);
  memset (b_13, 49, 2);
  memset (b_14, 0, 4);

  initialized = 1;
  goto P_ret_initialize;

  /* CANCEL callback handling */
  P_cancel:

  if (!initialized) {
    return 0;
  }
  if (module->module_active) {
    cob_fatal_error (COB_FERROR_CANCEL);
  }

  initialized = 0;

  P_clear_decimal:

  return 0;

}

/* End PROGRAM-ID 'prog' */

/* End functions */

I get the following error with make check :

## ---------------------- ##
## Detailed failed tests. ##
## ---------------------- ##

#                             -*- compilation -*-
9. backcomp.at:3270: testing INSPECT numeric signed ...
./backcomp.at:4037: $COMPILE prog.c
--- /dev/null   2024-04-23 10:34:00.223999454 +0200
+++ /home/engboris/gnucobol/tests/testsuite.dir/at-groups/9/stderr  2024-04-23 18:51:33.849645850 +0200
@@ -0,0 +1,11 @@
+prog.c: In function 'cob_cmp_s32':
+prog.c:92:26: error: expected ')' before '__unaligned'
+   92 |         val = *(const int __unaligned *)p;
+      |                ~         ^~~~~~~~~~~~
+      |                          )
+prog.c:92:16: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
+   92 |         val = *(const int __unaligned *)p;
+      |                ^
+prog.c:92:15: error: invalid type argument of unary '*' (have 'int')
+   92 |         val = *(const int __unaligned *)p;
+      |               ^~~~~~~~~~~~~~~~~~~~~~~~~~~
./backcomp.at:4037: exit code was 1, expected 0
9. backcomp.at:3270: 9. INSPECT numeric signed (backcomp.at:3270): FAILED (backcomp.at:4037)

All other tests I mentioned produce the same kind of error. It is likely that I am the one who broke backward compatibility with my current changes so I just marked problematic tests before further investigations. However, I was surprised by these errors which seem to be related to the C code itself, independently of my changes. There might be something I'm missing somwhere...

GitMensch commented 3 months ago

I'd suggest to do the following changes:

engboris commented 3 months ago

f_14 is of type static cob_field. Is it really what you meant? I get the following error:

+prog.c: In function 'prog_':
+prog.c:222:20: error: incompatible type for argument 1 of 'cob_cmp_int'
+  222 |   if (cob_cmp_int (f_14, 2) != 0)
+      |                    ^~~~
+      |                    |
+      |                    cob_field {aka struct __cob_field}
+In file included from /home/engboris/gnucobol/libcob.h:29,
+                 from prog.c:11:
+/home/engboris/gnucobol/libcob/common.h:2126:42: note: expected 'cob_field *' {aka 'struct __cob_field *'} but argument is of type 'cob_field' {aka 'struct __cob_field'}
+ 2126 | COB_EXPIMP int  cob_cmp_int             (cob_field *, const int);
+      |                                          ^~~~~~~~~~~
+prog.c:239:20: error: incompatible type for argument 1 of 'cob_cmp_int'
+  239 |   if (cob_cmp_int (f_14, 2) != 0)
+      |                    ^~~~
+      |                    |
+      |                    cob_field {aka struct __cob_field}
GitMensch commented 3 months ago

No, that's of course &f_14

engboris commented 3 months ago

It works. Do you have an explanation for that?

GitMensch commented 3 months ago

I don't understand the question.

engboris commented 3 months ago

I'm not sure to understand why this fix is working. Does GnuCOBOL 2.2 generate code which is not strictly correct for newer versions, meaning we need to use functions like cob_cmp_int which are valid for GnuCOBOL >=2.2? (Then this would be a minor break of backward compatibility?)

There's also INSPECT CONVERTING alphabet in run_misc.at which is compiled with 3.1 but I assume there's no support for that in 2.2 as well:

       IDENTIFICATION DIVISION.
       PROGRAM-ID. charset.

       ENVIRONMENT DIVISION.
       CONFIGURATION SECTION.
       SPECIAL-NAMES.
           ALPHABET ALPHA IS ASCII.
           ALPHABET BETA  IS EBCDIC.

       DATA DIVISION.
       WORKING-STORAGE SECTION.

       01 TESTHEX PIC X(10) VALUE X'C17BD6F2F0F1F8406A5A'.

       procedure division.
       sample-main.

           INSPECT testhex CONVERTING BETA TO ALPHA
           DISPLAY 'Converted: "' TESTHEX '"' WITH NO ADVANCING

           GOBACK.
       END PROGRAM charset.

I get

prog.cob: in paragraph 'sample-main':
prog.cob: 18: error: 'BETA' is not a field
prog.cob: 18: error: 'ALPHA' is not a field
GitMensch commented 3 months ago

Does GnuCOBOL 2.2 generate code which is not strictly correct for newer versions, meaning we need to use functions like cob_cmp_int which are valid for GnuCOBOL >=2.2?

Those are inlined functions, so they are compiled back then and may conflict with newer ones, but that doesn't matter, as those are inline. Note: this happens for "optimized" functions, the "general" ones should always work (if they aren't added new). The important part are the external ones, in this case for "string" handling. Using these general functions has also the benefit that we ensure backward compatibility there as well and that the test sources are smaller. Please apply such changes (static cmp/get/set functions) when including generated C source.

There's also INSPECT CONVERTING alphabet in run_misc.at which is compiled with 3.1 but I assume there's no support for that in 2.2 as well

Hm, that's an old option, maybe using defined alphabets doesn't work? Please try INSPECT testhex CONVERTING EBCDIC TO ASCII,m if that doesn't work then compile it with 3.1 and use that as "base source".

engboris commented 3 months ago

Please try INSPECT testhex CONVERTING EBCDIC TO ASCII,m if that doesn't work then compile it with 3.1 and use that as "base source".

INSPECT testhex CONVERTING EBCDIC TO ASCII doesn't work for neither version of GnuCOBOL. I obtain variants of this error:

prog.cob: in paragraph 'sample-main':
prog.cob:15: error: 'EBCDIC' cannot be used here
prog.cob:15: error: 'ASCII' cannot be used here
engboris commented 3 months ago

I just noticed that on test STRING / UNSTRING [[NOT]] ON OVERFLOW in run_misc.at, I had a different error when using make check with the program compiled with GnuCOBOL 2.2:

17. backcomp.at:6258: testing STRING / UNSTRING [NOT] ON OVERFLOW ...
./backcomp.at:6694: $COMPILE prog.c
./backcomp.at:6695: $COBCRUN_DIRECT ./prog
--- /dev/null   2024-04-25 10:37:11.971991673 +0200
+++ /home/engboris/gnucobol/tests/testsuite.dir/at-groups/17/stderr 2024-04-25 14:38:16.894612997 +0200
@@ -0,0 +1 @@
+missing OVERFLOW                                  
--- -   2024-04-25 14:38:16.898018147 +0200
+++ /home/engboris/gnucobol/tests/testsuite.dir/at-groups/17/stdout 2024-04-25 14:38:16.894612997 +0200
@@ -1,5 +1,7 @@
 1 passed
-2 passed
+STRING ERROR (1): "STRING OVERFLOW     "
+2 failed
+STRING ERROR (2): "missing OVERFLOW    "
 3 passed
 4 passed

17. backcomp.at:6258: 17. STRING / UNSTRING [NOT] ON OVERFLOW (backcomp.at:6258): FAILED (backcomp.at:6695)

It works with >=3.1 though.

engboris commented 3 months ago

please change the the states to be anonymous forward struct definitions instead of exposing the structure "externally" (we want only libcob to adjust it).

Could you tell me how I can do that? I see what a forward definition and an anonymous definition are but I'm not sure to see what an anonymous forward definition is.

GitMensch commented 3 months ago

Just define the type itself without the details in common.h, and the details alone in strings.c where we want to operate on this struct. That does work (as we only move a pointer to it around), doesn't it?

engboris commented 1 month ago

I've made all minor changes. I had to extend output_data in order to handle CB_TAG_DIRECT.

minor adjustment to generated tests/atlocal and related call [see VGSUFFIX in that file]

I don't see what you mean here. It is possible to give more information? I know how to use valgrind but it's my first time interacting with memory check in GnuCOBOL.

GitMensch commented 1 month ago

minor adjustment to generated tests/atlocal and related call [see VGSUFFIX in that file]

I don't see what you mean here. It is possible to give more information? I know how to use valgrind but it's my first time interacting with memory check in GnuCOBOL.

Of course. To run the test through valgrind you adjust the generated tests/atlocal https://github.com/OCamlPro/gnucobol/blob/ea648a2e2668b3bb704c346f17ec163afb8e61d4/tests/atlocal.in#L153-L155 enabling those lines if you have the files and add possibly more related suppression rules.

And because cobc has not any new "general code" in this PR, you can speed up that testrun (and increase the runtime output usefulness) by enabling line 166, commenting out the two below https://github.com/OCamlPro/gnucobol/blob/ea648a2e2668b3bb704c346f17ec163afb8e61d4/tests/atlocal.in#L166-L168

Then you run the testsuite (internal should be enough here) with make check TESTSUITEFLAGS="--jobs=8" VGSUFFIX=new, this leads to the tests be run through valgrind - I commonly order the files in tests/valgrind by size then inspecting the big ones (you shouldn't actually see ones).

As this is relative resource heavy I only do that shortly before a (pre)release and commonly when changing more global parts.

engboris commented 1 month ago

minor adjustment to generated tests/atlocal and related call [see VGSUFFIX in that file]

I don't see what you mean here. It is possible to give more information? I know how to use valgrind but it's my first time interacting with memory check in GnuCOBOL.

Of course. To run the test through valgrind you adjust the generated tests/atlocal

https://github.com/OCamlPro/gnucobol/blob/ea648a2e2668b3bb704c346f17ec163afb8e61d4/tests/atlocal.in#L153-L155 enabling those lines if you have the files and add possibly more related suppression rules.

And because cobc has not any new "general code" in this PR, you can speed up that testrun (and increase the runtime output usefulness) by enabling line 166, commenting out the two below

https://github.com/OCamlPro/gnucobol/blob/ea648a2e2668b3bb704c346f17ec163afb8e61d4/tests/atlocal.in#L166-L168

Then you run the testsuite (internal should be enough here) with make check TESTSUITEFLAGS="--jobs=8" VGSUFFIX=new, this leads to the tests be run through valgrind - I commonly order the files in tests/valgrind by size then inspecting the big ones (you shouldn't actually see ones).

As this is relative resource heavy I only do that shortly before a (pre)release and commonly when changing more global parts.

I get this (only) error after running the test suite:

#                             -*- compilation -*-
498. listings.at:2038: testing command line ...
../../tests/listings.at:2055: $COBC $LISTING_FLAGS -q -fsyntax-only -t- -fno-theader -ftcmd prog.cob
--- -   2024-07-02 11:24:52.472207550 +0200
+++ /home/engboris/gnucobol/_build/tests/testsuite.dir/at-groups/498/stdout 2024-07-02 11:24:52.464010427 +0200
@@ -12,7 +12,7 @@
 000011         END FUNCTION WITHPAR.
 
 command line:
-  cobc -fttitle=GnuCOBOL_V.R.P -fno-ttimestamp -q -fsyntax-only -t-
+  cobc -g -fttitle=GnuCOBOL_V.R.P -fno-ttimestamp -q -fsyntax-only -t-
 + -fno-theader -ftcmd prog.cob
 0 warnings in compilation group
 0 errors in compilation group
prog.lst:
sed: can't read prog.lst: No such file or directory
498. listings.at:2038: 498. command line (listings.at:2038): FAILED (listings.at:2055)
GitMensch commented 1 month ago

I get this (only) error after running the test suite:

#                             -*- compilation -*-
498. listings.at:2038: testing command line ...
../../tests/listings.at:2055: $COBC $LISTING_FLAGS -q -fsyntax-only -t- -fno-theader -ftcmd prog.cob
--- - 2024-07-02 11:24:52.472207550 +0200
+++ /home/engboris/gnucobol/_build/tests/testsuite.dir/at-groups/498/stdout   2024-07-02 11:24:52.464010427 +0200
@@ -12,7 +12,7 @@
 000011         END FUNCTION WITHPAR.
 
 command line:
-  cobc -fttitle=GnuCOBOL_V.R.P -fno-ttimestamp -q -fsyntax-only -t-
+  cobc -g -fttitle=GnuCOBOL_V.R.P -fno-ttimestamp -q -fsyntax-only -t-
 + -fno-theader -ftcmd prog.cob
 0 warnings in compilation group
 0 errors in compilation group
prog.lst:
sed: can't read prog.lst: No such file or directory
498. listings.at:2038: 498. command line (listings.at:2038): FAILED (listings.at:2055)

OK, that's expected. Now check for valgrind errors (subfolder tests/valgrind, you have one file per run and can order those by size)

engboris commented 1 month ago

I have 384 files of size >650k, 200 ones of size between 50k and 180k. All other ones are smaller with most files with size <700.

GitMensch commented 1 month ago

I have 384 files of size >650k, 200 ones of size between 50k and 180k. All other ones are smaller with most files with size <700.

That seems way too big, check their content (if they are dlopen or bdb related you possible find suppression rules on your system to not need ignoring them), double-check for use of system installed suppression rules and then check the content of the big ones to see where this PR may produce leaks.

engboris commented 1 month ago

I'm not sure if I'm doing it right but it looks like I have similar results on the gcos4gnucobol-3.x branch.

GitMensch commented 1 month ago

Did you check the log files to see if this is related to your changes?

Mind to upload the resulting tests/valgrind folder as .zip?

engboris commented 1 month ago

At a first glance, I didn't see things related to my changes in the biggest files. That's why I wanted to try on the main branch. But I did notice a few parts related to strings so I will keep investigating memory leaks related to my changes later.

Here the archive for gcos4gnucobol-3.x (it is possible that I did something wrong!): new.zip

GitMensch commented 1 month ago

Checking the biggest file shows us:

==12282== Memcheck, a memory error detector
==12282== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==12282== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==12282== Command: /home/engboris/gnucobol/_build/bin//cobcrun -q -M ThisIsANameThatIsReallyNotAllowedAsProgIdInCOBOL noprog
==12282== Parent PID: 12257
==12282== 
==12282== 
==12282== HEAP SUMMARY:
==12282==     in use at exit: 113,129 bytes in 2,686 blocks
==12282==   total heap usage: 4,016 allocs, 1,330 frees, 192,842 bytes allocated
==12282== 
==12282== 1 bytes in 1 blocks are still reachable in loss record 1 of 937
==12282==    at 0x4848899: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==12282==    by 0x1A98D9: set_lang (in /usr/bin/bash)
==12282==    by 0x139F79: main (in /usr/bin/bash)

So it is actually checking the wrapper, which is not that useful. While this doesn't work in general - please also adjust atlocal to set COBCRUN to bin/.libs/cobcrun, this should provide you with more reasonable files when re-running the testsuite now using make check TESTSUITEFLAGS="--jobs=8" VGSUFFIX=newtest.

engboris commented 1 month ago

The sizes I have for gcos4gnucobol-3.x :

83593, 83593, 80877, 69232, 50756, 50756, 50546, 50546, 12152, 9164, 5855, 5814, 5814, 5814, 5814, 5811, 5168, 4096, 3686, 3150, 1992, 1950, 1950, 1943, 1921, 1547, 1498, 1463, 1389, 1388, 1388, 674, 673, 671

And for the branch containing my changes:

84699, 82325, 81942, 70221, 51438, 51438, 51228, 51228, 12312, 9302, 5943, 5902, 5902, 5902, 5902, 5899, 5264, 4069, 4069, 3743, 3611, 3217, 3201, 3147, 3147, 2999, 2998, 2994, 2989, 2982, 2982, 2818, 2811, 2809, 2309, 2309, 2309, 2309, 2309, 2309, 2309, 2309, 2309, 2267

The biggest files do not mention strings.h but mostly cob_screen. There are some files mentionning either still reachable, definitively lost or indirectly lost for strings.c. It is mostly because of unfreed strings states in libcob/strings.c or a few variables (mark, repdata, ...). If I try to add more cob_free (e.g. cof_free (st)), I fail backcomp tests (free(): invalid pointer/size).

GitMensch commented 3 weeks ago

Did you proceed working on that?

If you don't have an idea how to solve possible leaks you could also upload a zip with the valgrind reports that are related to strings.c and also request a review from @ddeclerck and/or @nberth.

engboris commented 3 weeks ago

Not yet. I'll check that later and I will probably request a review if needed. I just wanted to make sure the sizes were ok.

GitMensch commented 3 weeks ago

friendly ping again

engboris commented 3 weeks ago

I tried something. I get smaller sizes but now (oddly enough, even smaller than the gcos4gnucobol branch...). When I do manual tests with valgrind, I still get minor still reachable blocks. I'm not sure if it is of any relevance.

GitMensch commented 3 weeks ago

The changes look reasonable. Please provide your manual test program and the valgrind result file.

engboris commented 3 weeks ago

I have this toy program I use for strings. It uses all string-related statements but it is probably too trivial: strings.txt

With valgrind --leak-check=full --show-leak-kinds=all ./strings, I get:

==87886== Memcheck, a memory error detector
==87886== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==87886== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==87886== Command: ./strings
==87886==
==87886==
==87886== HEAP SUMMARY:
==87886==     in use at exit: 502 bytes in 6 blocks
==87886==   total heap usage: 573 allocs, 567 frees, 253,146 bytes allocated
==87886==
==87886== 31 bytes in 1 blocks are still reachable in loss record 1 of 6
==87886==    at 0x484DA83: calloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==87886==    by 0x48741BD: cob_malloc (common.c:2433)
==87886==    by 0x489BF26: cob_inspect_init_r (strings.c:550)
==87886==    by 0x10B053: strings_ (in /home/engboris/test-cobol/strings)
==87886==    by 0x10A5B8: strings (in /home/engboris/test-cobol/strings)
==87886==    by 0x10A576: main (in /home/engboris/test-cobol/strings)
==87886==
==87886== 31 bytes in 1 blocks are still reachable in loss record 2 of 6
==87886==    at 0x4848899: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==87886==    by 0x48742E7: cob_fast_malloc (common.c:2490)
==87886==    by 0x489BA97: setup_repdata (strings.c:202)
==87886==    by 0x489BA97: do_mark (strings.c:371)
==87886==    by 0x489BA97: inspect_common_replacing (strings.c:429)
==87886==    by 0x489BD0D: inspect_common (strings.c:485)
==87886==    by 0x489C8C6: cob_inspect_all_r (strings.c:692)
==87886==    by 0x10B09B: strings_ (in /home/engboris/test-cobol/strings)
==87886==    by 0x10A5B8: strings (in /home/engboris/test-cobol/strings)
==87886==    by 0x10A576: main (in /home/engboris/test-cobol/strings)
==87886==
==87886== 32 bytes in 1 blocks are still reachable in loss record 3 of 6
==87886==    at 0x484DA83: calloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==87886==    by 0x48741BD: cob_malloc (common.c:2433)
==87886==    by 0x489D248: cob_string_init_r (strings.c:939)
==87886==    by 0x10B320: strings_ (in /home/engboris/test-cobol/strings)
==87886==    by 0x10A5B8: strings (in /home/engboris/test-cobol/strings)
==87886==    by 0x10A576: main (in /home/engboris/test-cobol/strings)
==87886==
==87886== 48 bytes in 1 blocks are still reachable in loss record 4 of 6
==87886==    at 0x484DA83: calloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==87886==    by 0x48741BD: cob_malloc (common.c:2433)
==87886==    by 0x489D5B4: cob_unstring_init_r (strings.c:1052)
==87886==    by 0x10B487: strings_ (in /home/engboris/test-cobol/strings)
==87886==    by 0x10A5B8: strings (in /home/engboris/test-cobol/strings)
==87886==    by 0x10A576: main (in /home/engboris/test-cobol/strings)
==87886==
==87886== 104 bytes in 1 blocks are still reachable in loss record 5 of 6
==87886==    at 0x484DA83: calloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==87886==    by 0x48741BD: cob_malloc (common.c:2433)
==87886==    by 0x489BD55: cob_inspect_init_common_r (strings.c:508)
==87886==    by 0x489BD55: cob_inspect_init_r (strings.c:536)
==87886==    by 0x10A6D7: strings_ (in /home/engboris/test-cobol/strings)
==87886==    by 0x10A5B8: strings (in /home/engboris/test-cobol/strings)
==87886==    by 0x10A576: main (in /home/engboris/test-cobol/strings)
==87886==
==87886== 256 bytes in 1 blocks are still reachable in loss record 6 of 6
==87886==    at 0x484DA83: calloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==87886==    by 0x48741BD: cob_malloc (common.c:2433)
==87886==    by 0x489D696: cob_unstring_init_r (strings.c:1074)
==87886==    by 0x10B487: strings_ (in /home/engboris/test-cobol/strings)
==87886==    by 0x10A5B8: strings (in /home/engboris/test-cobol/strings)
==87886==    by 0x10A576: main (in /home/engboris/test-cobol/strings)
==87886==
==87886== LEAK SUMMARY:
==87886==    definitely lost: 0 bytes in 0 blocks
==87886==    indirectly lost: 0 bytes in 0 blocks
==87886==      possibly lost: 0 bytes in 0 blocks
==87886==    still reachable: 502 bytes in 6 blocks
==87886==         suppressed: 0 bytes in 0 blocks
==87886==
==87886== For lists of detected and suppressed errors, rerun with: -s
==87886== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

Loss record 1/6 is

                /* libcob/strings.c:550 */
        if (st->mark) {
            cob_free (st->mark);
            st->mark_size = st->size;
        } else if (st->size < COB_NORMAL_BUFF) {
            st->mark_size = COB_NORMAL_BUFF;
        } else {
            st->mark_size = st->size;
        }
        /* initialize to zero */
        st->mark = cob_malloc (st->mark_size + 1); /* HERE */

Loss record 2/6 is

        /* libcob/strings.c:202 */
    if (st->size > st->repdata_size) {
        if (st->repdata) {
            cob_free (st->repdata);
            st->repdata_size = st->size;
        } else if (st->size < COB_NORMAL_BUFF) {
            st->repdata_size = COB_NORMAL_BUFF;
        } else {
            st->repdata_size = st->size;
        }
        /* data content does not matter as we only used marked positions at end */
        st->repdata = cob_fast_malloc (st->repdata_size + 1); /* HERE */
    }

Loss record 3/6 is

/* libcob/strings.c:939 */
void
cob_string_init_r (struct cob_string_state **pst, cob_field *dst, cob_field *ptr)
{
    struct cob_string_state *st;
    if (*pst == NULL)
        *pst = cob_malloc (sizeof(struct cob_string_state)); /* HERE */
    st = *pst;

etc, every is similar.

Fields are freed before being malloc'd and freed in _free functions called at the end of COBOL programs. States are also freed. I don't see what I'm missing...

GitMensch commented 2 weeks ago

When inspecting this I just recognized that we may have turned in the wrong direction at some place....

I'm currently concerned about memory usage (note: that's not related to the state itself - those are fine [and likely should just be static data similar to working-storage], but to the allocated data pointers).

Since quite a while GnuCOBOL allocates an increasing data pointer for those and re-uses it. So when you do an INSPECT with a 2GB field you allocate for that, and later on re-use that. Wouldn't the current implementation mean that if you do the 2GB INSPECT in 10 modules you already have 20GB allocated (until you cancel those modules, which in some environments is only done at program exit time)?

engboris commented 2 weeks ago

Well... Probably.

Wouldn't the current implementation mean that if you do the 2GB INSPECT in 10 modules you already have 20GB allocated (until you cancel those modules, which in some environments is only done at program exit time)?

Is it wrong? I'm not sure but I guess we don't want to share memory between modules neither free and realloc for each statement call. I'm not sure to see what's better.

GitMensch commented 2 weeks ago

When thinking about what would needed, then we get to thread local storage, because there can be only one statement active in each thread.

My current thoughts are:

#if defined(COB_TLS)
    /* already defined, for example as static to explicit disable TLS */
#elif defined(_WIN32)
    #define COB_TLS __declspec(thread)
#elif defined(__GNUC__) && (__GNUC__ >= 4) || defined(__clang__) || \
      defined(__hpux) || defined(_AIX) || defined(__sun)
    #define COB_TLS static __thread
#elif defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L
    #include <threads.h>
    #define COB_TLS thread_local
#else
    #define COB_TLS static  /* fallback definition */
#endif

(please put the coblocal.h part in the Changelog as my change July 19th, so it is clear who had that bad idea in the future ;-)

~(maybe with the GCC part first)~

That's no decision (yet) and I haven't checked if this will work "in most environments", but I think that this should:

What do you think?

At a later date we should be able to add optional C11 threads / pthread support, possibly also through the gnulib module; if this is enabled, then we can register the cleanup function to be automatically done.

engboris commented 2 weeks ago

either:

  • change the _r functions to be static (rename from "_r" to "_intern") and always call those from the "normal ones", passing a pointer to the static structure; the reason for this is that accessing stack is possibly faster than accessing global/tls variables directly
  • or: drop the _r functions, moving its code back to the old ones, just operating on the global static versions now

I'm not sure to see the pros and cons of each proposition. If the direction is not clear and that C11 thread support is for later, maybe we can do things in several steps. We can keep and validate smaller improvements we're less likely to change and that we're sure of, in order to not compromise performances and such.

GitMensch commented 2 weeks ago

both options are valid, if not sure I'd go with the first version

engboris commented 2 weeks ago

I would go for the simplest for now which still significantly improves the previous code. Other things (which require more thinking on the design) may be postponed to later. Should I take a try or wait for a more formal decision from your end?

GitMensch commented 2 weeks ago

Please go on, taking the edited post above as formal decision. Thank you!

engboris commented 2 weeks ago

replace the "static" attribute with COB_TLS, newly defined in coblocal.h as

Where should I include this new attribute? Only for the _intern functions in libcob/string.c?

engboris commented 2 weeks ago

I believe it should be good if I didn't misunderstood what you meant

GitMensch commented 2 weeks ago

Any more memcheck outputs in string handling?

engboris commented 1 week ago

I have that "no leaks are possible" for my tests which is as expected since I removed all free calls I introduced.

GitMensch commented 1 week ago

Then we only need some cleanup and this can go upstream, right?

Things spotted for that:

engboris commented 1 week ago

Then we only need some cleanup and this can go upstream, right?

I think so. There's no especially sensitive changes as far as I understand. (Sorry for the Makefile, I've been confused with some conflicts/merges)

backcomp.at should be reduced to calls that are indeed different (there will still be some, but maybe? not all)

What do you mean?

GitMensch commented 1 week ago

backcomp.at was added as we needed to test the calls without the _r functions and the state - both are now not generated. I think that it is still useful to have the file in general and string related tests in there, because the codegen changed between libcob v2 and v3, but maybe it is enough to generate your shown sample code with 2.2, do necessary adjustments and insert it there to cover that (would need to inspect for other special test cases we may have in there already)

engboris commented 1 week ago

Are you saying that backcomp.at could simply include (for the moment) one single COBOL program covering most (possibly all) of strings functions?

GitMensch commented 1 week ago

yes, apart from this we only should have "special cases", like converting EBCDIC to ASCII, it also seems reasonable to include one test for collation compares as this was also changed - maybe we should add some more as well, but those three come to my mind now