OCamlPro / gnucobol

A clone of the sourceforge GnuCOBOL compiler from COBOL to C.
https://get-superbol.com
GNU Lesser General Public License v3.0
16 stars 20 forks source link

Fix decimal constant bug occuring when several programs in a COBOL file #113

Closed ddeclerck closed 6 months ago

ddeclerck commented 10 months ago

This patch fixes a nasty bug dealing with decimal constants when there are several programs in a COBOL file.

Consider the following COBOL file:

       IDENTIFICATION DIVISION.
       PROGRAM-ID. prog.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  X  PIC 9(2).
       PROCEDURE DIVISION.
           CALL "prog2"
           IF X + 42 = 0
               DISPLAY "OK".
           STOP RUN.
       END PROGRAM prog.

       PROGRAM-ID. prog2 INITIAL.
       PROCEDURE DIVISION.
           EXIT PROGRAM.
       END PROGRAM prog2.

Attempting to run this program will result in a segfault:

prog.cob:8: attempt to reference invalid memory address (signal)

We found out the bug was occurring because returning from a subprogram clears all decimal constants, i.e the generated code for prog2 contains:

  P_clear_decimal:
  /* Clear Decimal Constant values */
  cob_decimal_clear (dc_1);
  dc_1 = NULL;

Hence when prog tries to use the constant 42 (through dc_1), we are greeted with a segfault since dc_1 is now NULL.

The proposed fix is to move all the decimal constants to the "local" header files instead of the "global" header. The testsuite has been run and does not complain.

codecov-commenter commented 10 months ago

Codecov Report

Merging #113 (dbaaf16) into gcos4gnucobol-3.x (c0d64ad) will increase coverage by 0.01%. The diff coverage is 100.00%.

:exclamation: Your organization needs to install the Codecov GitHub app to enable full functionality.

@@                  Coverage Diff                  @@
##           gcos4gnucobol-3.x     #113      +/-   ##
=====================================================
+ Coverage              65.74%   65.75%   +0.01%     
=====================================================
  Files                     32       32              
  Lines                  59092    59090       -2     
  Branches               15575    15572       -3     
=====================================================
+ Hits                   38849    38857       +8     
+ Misses                 14262    14254       -8     
+ Partials                5981     5979       -2     
Files Coverage Δ
cobc/codegen.c 76.06% <100.00%> (+0.07%) :arrow_up:

... and 3 files with indirect coverage changes

:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more

GitMensch commented 10 months ago

That's a quite severe issue, please recheck if this happens with 3.1 and trunk, then create a bug issue with the details. Thank you.

I'll check the PR quite soon, likely tomorrow.

ddeclerck commented 10 months ago

Confirmed on 3.1 and trunk as well. Should I open the issue here or on SourceForge ?

GitMensch commented 10 months ago

On SF please, that's the right place for any bug report or feature request :-)

Thanks!

ddeclerck commented 9 months ago

Hi @GitMensch, Did you have a bit of time to check this ?

lefessan commented 9 months ago

The bug report is here: https://sourceforge.net/p/gnucobol/bugs/920/

ddeclerck commented 9 months ago

You mean here : https://sourceforge.net/p/gnucobol/bugs/917/ (920 is a different thing)

GitMensch commented 9 months ago

Without further checks I do wonder where this has gone bad, it seems this was fixed already? https://sourceforge.net/p/gnucobol/bugs/431/

ddeclerck commented 9 months ago

Without further checks I do wonder where this has gone bad, it seems this was fixed already? https://sourceforge.net/p/gnucobol/bugs/431/

I think this is a different issue. The one you mention is about decimal constants not being initialized at all. The issue we are facing right now is that when exiting a called module, every decimal constant are de-initialized, even those used by the calling module.

ddeclerck commented 9 months ago

Only minor changes to do for now, mostly to the testsuite.

Please be so kind to adjust it, then upload the generated code of that program with and without the codegen change, to ease rechecking the result after switching from "storage" to local.

That has been done.

Here is the generated code in both case (relevant changes in .h files) : prog_nofix.zip prog_fix.zip

GitMensch commented 9 months ago

Thank you for your work. I've adjusted the bug report with the findings - it isn't about INITIAL at all, but about CANCEL...

It seems reasonable to move the "INITIAL" from AT_SETUP to AT_KEYWORDS (also add CALL and CANCEL there), then drop INITIAL from prog.cpy and include that in the current places via REPLACING and finally add two other iterations of that copy using non-initial programs that are CANCELed after their CALL.

As far as I see your change will then still work fine - then please push that change to the testsuite here.

The second issue in both the old and new codegen: all constants from all runtime-elements (programs and user-defined functions) are initialized and clean in all runtime-elements, even if they aren't used at all (which is highlighted by your first test code where the constant was not used in prog2 and still got cleaned at its exit. This is only related to the issue tackled here and may be handled separately. I've another bug report for this, which could be solved first (and drop the make_decimal check in that loop, too.

Another related issue is, that we don't need multiple definitions in the first place. Adjusting that would also fix the bug at hand, but I currently see only two options and am not sure if those are good (both would leave the constant to be static and global): 1) move the init/cleanup to shared object loading/unloading (like output_so_load_version_check), but that would leave them active forever if COB_PHYSICAL_CANCEL is not used - and still needs a fallback if neither HAVE_ATTRIBUTE_CONSTRUCTOR nor WIN32 is set, the longer I think about that, the less I like that option myself 2) if the compile-unit has more than one program (otherwise there is no problem in the first place), then adjust the codegen to check that variable for NULL before doing the init and add a reference counter for each cob_decimal that is incremented on each place where the init is done, decrement on cleanup and only do real cleanup + setting to NULL if the reference counter is zero.

We could use the change as-is, which would fix the SIGSEGV with the downside of a bit increased cpu time and memory usage (common programs likely have 5 to 20 decimal constants).

What is your take on this?

ddeclerck commented 9 months ago

I updated the testsuite as requested (fix still works fine).

As for your suggestions:

  1. move the init/cleanup to shared object loading/unloading (like output_so_load_version_check), but that would leave them active forever if COB_PHYSICAL_CANCEL is not used - and still needs a fallback if neither HAVE_ATTRIBUTE_CONSTRUCTOR nor WIN32 is set, the longer I think about that, the less I like that option myself

I agree that this might not be the best option.

  1. if the compile-unit has more than one program (otherwise there is no problem in the first place), then adjust the codegen to check that variable for NULL before doing the init and add a reference counter for each cob_decimal that is incremented on each place where the init is done, decrement on cleanup and only do real cleanup + setting to NULL if the reference counter is zero.

Seems like a reasonable approach. It's probably best to tackle issue 923 first though. An optimal approach would be to add that counter only for those constants that appear in more than one program - but that might be overengineering...

GitMensch commented 9 months ago

I'd declare this PR as "ready for svn" (even when svn committing is still postponed) as soon as the Changelog entry is added.

Can you work on #923? In this case I'd suggest to postpone the change on

    if (CB_TREE_CLASS (m->x) == CB_CLASS_NUMERIC
     && m->make_decimal) {

there.

ddeclerck commented 9 months ago

I'll try. I'll add questions directly on the SVN issue if needed.

ddeclerck commented 9 months ago

Note: as suggested, I put back the following, since it is handled by 923.

if (CB_TREE_CLASS (m->x) == CB_CLASS_NUMERIC
     && m->make_decimal) {
GitMensch commented 6 months ago

@ddeclerck Please bring this and #115 to svn, as two commits in whatever order. Then reference the commits at the referenced issues on SF (not sure if you can close them there yourself, if yes, please do so).

GitMensch commented 6 months ago

now upstream - thanks!