OCamlPro / gnucobol

A clone of the sourceforge GnuCOBOL compiler from COBOL to C.
https://get-superbol.com
GNU Lesser General Public License v3.0
16 stars 21 forks source link

"Safe" partial replacement à la GCOS #70

Closed nberth closed 1 year ago

nberth commented 1 year ago

Partial replacement with literals only apply when it does not lead to an empty word, meaning that, e.g, REPLACE LEADING "X" BY "" (or SPACES) will apply on XX, but not on X. REPLACE LEADING "X" BY "Y" still applies as usual.

In addition, right operand of such a partial replacing must not contain any space or separator.

GitMensch commented 1 year ago

It is quite confusing to have a mixed merge of a PR and "upstream changes"; which are the commits to inspect for this PR?

nberth commented 1 year ago

It is quite confusing to have a mixed merge of a PR and "upstream changes"; which are the commits to inspect for this PR?

Apologies for the mishandling of branches. Github seemed a bit lost when I tried to update the PR against the proper base.

Anyways, in the end this PR is short (but still needs minor adjustments in print_replace). Documentation on GCOS behaviors can be found on page 542 at https://support.bull.com/ols/product/system/gcos7/gcos7-com/g7-dps7000/doc-com/docf/g/47X257TN30-aug2011/47A205UL04.pdf , sub-bullet "f" (REPLACING is similar). We have tested and validated this on a GCOS

GitMensch commented 1 year ago

That's currently marked as draft and I see that cobc.c listing code seems pending. Please ensure that you include a test case that currently fails and maybe explicit tell where this issue is.

codecov-commenter commented 1 year ago

Codecov Report

Merging #70 (74b3591) into gcos4gnucobol-3.x (5a1ed6f) will increase coverage by 0.01%. The diff coverage is 96.15%.

@@                  Coverage Diff                  @@
##           gcos4gnucobol-3.x      #70      +/-   ##
=====================================================
+ Coverage              65.15%   65.17%   +0.01%     
=====================================================
  Files                     34       34              
  Lines                  56063    56090      +27     
  Branches               14672    14685      +13     
=====================================================
+ Hits                   36529    36554      +25     
- Misses                 13654    13656       +2     
  Partials                5880     5880              
Impacted Files Coverage Δ
cobc/pplex.l 71.53% <93.33%> (+0.04%) :arrow_up:
cobc/ppparse.y 60.86% <96.77%> (+1.34%) :arrow_up:
cobc/cobc.c 70.83% <100.00%> (+0.03%) :arrow_up:
cobc/config.def 100.00% <100.00%> (ø)
libcob/common.c 55.65% <0.00%> (-0.14%) :arrow_down:
cobc/tree.c 73.35% <0.00%> (+0.14%) :arrow_up:

:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more

GitMensch commented 1 year ago

Just to recheck before a deeper review:

nberth commented 1 year ago

Just to recheck before a deeper review:

  • "strict" here means that requested partial replacements are not done if the match is "complete", correct?

Yes indeed. In other words, a LEADING match is "strict" if the matching word is strictly longer than the matched prefix (and similarly for TRAILING).

  • that patch changes the general behaviour (not sure?)

This only changes behaviors of partial replacement (i.e, LEADING or TRAILING) for which the source token is a literal, which is a GCOS extension (there's a dialect option partial-replacing-with-literal about this, which is obsolete by default).

These added tests fail similarly with and without the changes of this PR. If so, is there a pending patch somewhere about these changes, so I check whether it still applies after this PR?

GitMensch commented 1 year ago

These added tests fail similarly with and without the changes of this PR.

Thank you for checking.

If so, is there a pending patch somewhere about these changes, so I check whether it still applies after this PR?

That would be https://sourceforge.net/p/gnucobol/patches/54/.

nberth commented 1 year ago

If so, is there a pending patch somewhere about these changes, so I check whether it still applies after this PR?

That would be https://sourceforge.net/p/gnucobol/patches/54/.

I've just checked: patch_replacement.patch still applies fine after this PR, and all tests (including those added by the patch tests.patch mentioned above) pass.

GitMensch commented 1 year ago

Thanks for checking. The tests in that patch look good (maybe at least some should also be "copied" to the listing tests, just mark as expected fail for now if they don't pass) and the change looks reasonable to.

@nberth Could you please commit those in behalf of @sbelondr (I'll post-adjust the user property) to 3.x if you feel the same?

For this PR: I'm still checking, just one note: MF won't compile this non-standard code, but it looks like ACU would and likely don't enforce the strictness (I try to vamp up my old machine, maybe I find something) :-/

GitMensch commented 1 year ago

Checked: ACUCOBOL around 1997 does not handle those literals, the version around 2001 does, erroring out (after adding a mandatory section or paragraph) with:

prog.cob, line 18: Undefined data item: NORM
prog.cob, line 18: Verb expected, NO found

the current version of cobc says:

prog.cob: in paragraph 'main':
prog.cob:18: error: 'NORM' is not defined

So I guess the patch would fix GnuCOBOL -std=gcos and break -std=acu?

nberth commented 1 year ago

... but the most important part is: that seems to break the "non-strict" behaviour of ACUCOBOL, no?

Indeed the behavior w.r.t strictness should be different with -std=acu. Instead of adding an additional dialect option that would only be relevant when -fpartial-replacing-with-literal is not unconformable, maybe a quick but good-enough fix would be to just check for CB_STD_ACU before setting a partial match as strict. What do you think?

GitMensch commented 1 year ago

maybe a quick but good-enough fix would be to just check for CB_STD_ACU before setting a partial match as strict. What do you think?

seems like something that could be applied soon enough TM, the default strictness should be acu as that came earlier so check for CB_STD_GCOS instead. I suggest to make that adjustment along with the move to tree.h, then either re-request a review or directly commit.

... and ideally directly commit the other partial replace fix from the referenced ticket shortly afterwards.

Thank you very much!

nberth commented 1 year ago

the default strictness should be acu as that came earlier so check for CB_STD_GCOS instead.

That's what I was thinking. But we did not add CB_STD_GCOS to avoid those kinds of hard-coded checks… maybe we need to to that in the end.

GitMensch commented 1 year ago

the default strictness should be acu as that came earlier so check for CB_STD_GCOS instead.

That's what I was thinking. But we did not add CB_STD_GCOS to avoid those kinds of hard-coded checks… maybe we need to to that in the end.

or, as a hack which is done similar in another case: give skip a special meaning like "support, but skip if it isn't partial...

nberth commented 1 year ago

or, as a hack which is done similar in another case: give skip a special meaning like "support, but skip if it isn't partial...

Do you mean -fpartial-replacing-with-literal=skip would be ACU's behavior of replacing complete matches (i.e. treating the REPLACING source literal as if it was pseudo-text)? This would make GCOS's behavior the default unless that is given skip, right?

nberth commented 1 year ago

@nberth Could you please commit those in behalf of @sbelondr (I'll post-adjust the user property) to 3.x if you feel the same?

Sure I'll do this tomorrow.

GitMensch commented 1 year ago

or, as a hack which is done similar in another case: give skip a special meaning like "support, but skip if it isn't partial...

Do you mean -fpartial-replacing-with-literal=skip would be ACU's behavior of replacing complete matches (i.e. treating the REPLACING source literal as if it was pseudo-text)? This would make GCOS's behavior the default unless that is given skip, right?

Nearly, more like -fpartial-replacing-with-literal=skip` would be GCOS's behavior of skipping the replacement if it isn't a full match ;-)

nberth commented 1 year ago

Nearly, more like -fpartial-replacing-with-literal=skip` would be GCOS's behavior of skipping the replacement if it isn't a full match ;-)

So the option should rather be named something like, e.g, full-partial-replacing-when-literal-src… (but then values like unconformable and obsolete might not be that clear: the underlying feature is about support of literals, not the semantics on replacement)

nberth commented 1 year ago

Turns out I've misread the documentation. Replacement happens "only if it results in a legal word" means the behavior differs depending on whether the right operand is SPACES or just spaces and separators, or whether it is also a valid prefix.

nberth commented 1 year ago

@GitMensch I've updated the PR description to better match the issue. In any case, tests on a GCOS appear to show that the right operand of REPLACE LEADING/TRAILING must not contain spaces (and maybe separators),

GitMensch commented 1 year ago

Just tested with ACU 2001: "Partial replacement with literals" is compiled as-is. The other test case is totally broken as it does remove the X:

       IDENTIFICATION   DIVISION.
       PROGRAM-ID.      prog.
       DATA             DIVISION.
       WORKING-STORAGE  SECTION.
       COPY "copy.inc"
            REPLACING LEADING "TEST" BY "FIRST"
                      LEADING "NORM" BY "SECOND".
                      LEADING "NORM" BY "SECOND"
                      LEADING "X" BY "".
       COPY "copy2.inc"
            REPLACING TRAILING "FIRST" BY "VAR1"
                      TRAILING "SECOND" BY "VAR2".
                      TRAILING "SECND" BY "VAR2"
                      TRAILING "Y" BY "".
       PROCEDURE        DIVISION.
           DISPLAY FIRST-VAR NO ADVANCING
           END-DISPLAY.
           DISPLAY SECOND-VAR NO ADVANCING
           END-DISPLAY.
           DISPLAY TEST-VAR1 NO ADVANCING
           END-DISPLAY.
           DISPLAY TEST-VAR2 NO ADVANCING
           END-DISPLAY.
           DISPLAY FIRST-VAR  NO ADVANCING END-DISPLAY.
           DISPLAY SECOND-VAR NO ADVANCING END-DISPLAY.
           DISPLAY X          NO ADVANCING END-DISPLAY.
           DISPLAY TEST-VAR1  NO ADVANCING END-DISPLAY.
           DISPLAY TEST-VAR2  NO ADVANCING END-DISPLAY.
           DISPLAY Y          NO ADVANCING END-DISPLAY.
           STOP RUN.

Does this mean we do need the unconformable/ok/skip, and does it mean it works "as expected"?

Do we have an option to shorted the dialect option name?

Just out of interest: did you use the DEBUG_REPLACE?

nberth commented 1 year ago

Does this mean we do need the unconformable/ok/skip, and does it mean it works "as expected"?

No I think ok is fine for ACU then.

Do we have an option to shorted the dialect option name?

Good question. I've yet to find a good name for that…

Just out of interest: did you use the DEBUG_REPLACE?

Yes I've checked this still works.

GitMensch commented 1 year ago

@nberth I guess that will be checked in soon?

nberth commented 1 year ago

Now upstream