zokugun / vscode-explicit-folding

Customize your Folding for Visual Studio Code
MIT License

Problem(s) with capturing groups #62

Open FALLAI-Denis opened 2 years ago

FALLAI-Denis commented 2 years ago

Describe the issue

A back reference in an "endRegex" rule to a capturing group in a "beginRegex" is resolved when the set of rules is "compiled", not at execution time. If the capturing group contains wildcard expressions (like .), the captured value cannot be known at compile time, so it cannot be used correctly in the back reference at run time. This can cause rules to match wrongly, which then disrupts the operation of the other rules (the analysis stops at the first rule satisfied).

I have tried using begin/while and begin/continuation rules, which introduce lazy processing of end-of-block rules, but in this case backreference processing doesn't seem to work.

To reproduce

Code Example

//DATA     DD DATA,DLM=$$
dfhdfgh
odfjgfg
fjgldfg
ldfjglfdjg
$$

DLM=$$ determines the string of characters that represents the end-of-flow delimiter. $$ can be any string of 2 to 8 characters.
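
For illustration, here is a minimal TypeScript sketch of how the delimiter can be read from the DD statement. The regex is the "dd2" beginRegex from the settings with the JSON escaping removed; the helper name `extractDelimiter` is hypothetical and is not part of the extension.

```typescript
// The "dd2" beginRegex from the settings, written as a JavaScript regex literal.
const beginRegex = /^\/\/[^*]\S* +DD +DATA,DLM=(.{2,8})(?: |,|$)/;

// Hypothetical helper: returns the delimiter declared by a DD DATA statement,
// or null when the line is not such a statement.
function extractDelimiter(line: string): string | null {
  const match = beginRegex.exec(line);
  return match ? (match[1] ?? null) : null;
}

console.log(extractDelimiter("//DATA     DD DATA,DLM=$$")); // "$$"
console.log(extractDelimiter("//SYSIN    DD *"));           // null
```

The captured value only exists once a concrete line has been matched, which is why it cannot be known when the rules are compiled.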

DLM parameter

Settings

  ,{// DD INSTREAM DATA with DLM
    "name": "dd2"
   ,"beginRegex": "^\\/\\/[^*]\\S* +DD +DATA,DLM=(.{2,8})(?: |,|$)"
   ,"endRegex": "^\\1"
   }

Expected behavior

When an end-of-block rule contains a back reference, it should not be part of the set of rules built at compile time; it should be interpreted at run time. As a corollary, the matching start-of-block rule must be marked to indicate that processing of its end-of-block rule is deferred.
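
A rough TypeScript sketch of that deferred processing, under the assumption that the end pattern is only turned into a regex once the begin rule has matched. The helpers `escapeRegExp` and `buildEndRegex` are hypothetical names for illustration; this is not the extension's actual code.

```typescript
// Escape regex metacharacters so the captured text is matched literally
// (needed here because "$" is itself a metacharacter).
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: expand \1..\9 in the end pattern with the values
// captured by the begin match, then compile the result.
function buildEndRegex(endPattern: string, beginMatch: RegExpExecArray): RegExp {
  const source = endPattern.replace(/\\(\d)/g, (_whole: string, index: string) =>
    escapeRegExp(beginMatch[Number(index)] ?? ""),
  );
  return new RegExp(source);
}

const sampleBegin = /^\/\/[^*]\S* +DD +DATA,DLM=(.{2,8})(?: |,|$)/.exec(
  "//DATA     DD DATA,DLM=$$",
);
if (sampleBegin) {
  const endRegex = buildEndRegex("^\\1", sampleBegin);
  console.log(endRegex.test("$$"));      // true
  console.log(endRegex.test("dfhdfgh")); // false
}
```

Escaping the captured value matters in this use case: without it, a delimiter such as $$ would be read as two end-of-line anchors instead of two literal characters.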

Additional context

[document] lang: jcl, fileName: folding.jcl
[main] regex: / ... |(?<_0_5>^\/\/[^*]\S* +DD +DATA,DLM=(.{2,8})(?: |,|$))|(?<_2_5>^.{2,8})| .../g

The rule (?<_2_5>^.{2,8}) captures anything...

I suggest that the capturing group values be returned in the debug log.

FALLAI-Denis commented 2 years ago

Full set of rules:

    "[jcl]": {
        "files.autoGuessEncoding": true,
        "editor.rulers": [0,{"column":11,"color":"#603020"},{"column":15,"color":"#603020"},{"column":71,"color": "#603020"},{"column":72,"color": "#603020"},{"column":80,"color": "#ff0000"}],
        /* Requirement for Explicit Folding */
        "editor.folding": true,
        "editor.foldingHighlight": true,
        "editor.foldingStrategy": "auto",
        "editor.showFoldingControls": "always",
        /* Explicit Folding */
        "explicitFolding.debug": true,
        "explicitFolding.rules": [
           {// Comment block
             "name": "comment"
            ,"whileRegex": "^\\/\\/\\*"
            ,"kind": "comment"
           }
           // JCL hierarchy
          ,{// JOB
            "name": "job"
           ,"separatorRegex": "^\\/\\/[^*]\\S* +JOB(?:( |$))"
           ,"strict": "never"
           ,"nested": [
              {// PROC - PEND
               "name": "proc"
              ,"beginRegex": "^\\/\\/[^*]\\S* +PROC(?: |$)"
              ,"endRegex": "^\\/\\/[^*]\\S* +PEND(?: |$)"
              ,"foldLastLine": true
              ,"nested": [
                 {// STEP
                  "name": "exec"
                 ,"separatorRegex": "^\\/\\/[^*]\\S* +EXEC "
                 ,"nested": [
                    {// DD INSTREAM DATA without DLM
                     "name": "dd-instream1"
                    ,"beginRegex": "^\\/\\/[^*]\\S* +DD +\\*"
                    ,"whileRegex": "^[^\\/][^\\*\\/]"
                    ,"foldEOF": true
                    ,"foldLastLine": true
                    ,"nested": true
                   }
                  ,{// DD INSTREAM DATA with DLM
                    "name": "dd2"
                   ,"beginRegex": "^\\/\\/[^*]\\S* +DD +DATA,DLM=(.{2,8})(?: |,|$)"
                   ,"endRegex": "^\\1"
                   }
                  ,{// DD FILE
                    "name": "dd-file"
                   ,"beginRegex": "^\\/\\/[^*]\\S* +DD +\\S+,(?: |$)"
                   ,"endRegex": "^\\/\\/ +\\S+[^,](?: |$)"
                   ,"nested": true
                   }
                 ]
                }
               ]
             }
            ]
           }
         ]
      },

Use case:

//MYJOB    JOB ,'MY JOB',CLASS=A,NOTIFY=&SYSUID
//****
//* FIRST COMMENT BLOCK
//****
//MYPROC   PROC P1='P1',
//         P2='P2'
//*
//* SECOND COMMENT BLOCK
//*
//MYSTEP   EXEC PGM=MYPROG
//STEPLIB  DD DISP=SHR,DSN=MY.LOADLIB
//IN       DD DISP=SHR,DSN=&P1
//OUT      DD DSN=&P2,
//            DISP=(NEW,CATLG),
//*
//*
//            SPACE=(CYL,(5,5),RLSE)
//*
//A         DD  dfjdlfj,
//            ldfjdflkj,
//            fkmfkdfpk
//*
//SYSIN    DD *
dfhdfgh
odfjgfg
fjgldfg
ldfjglfdjg
/*
//*
//DATA     DD DATA,DLM=$$
dfhdfgh
odfjgfg
fjgldfg
ldfjglfdjg
//*
//*
//*
/*
$$
//*
//MYPROC   PEND
//*
//STEP     EXEC PROC=MYPROC,
//              P1='MY.FIC.ALPHA',
//              P2='MY.FIC.BETA'
//*
//STEPNAME    EXEC PGM=IEFBR14
//THEFILE  DD   DSN=HLQ.DSN,
//             DISP=(OLD,DELETE)
//* That's all

daiyam commented 2 years ago

From https://github.com/zokugun/vscode-explicit-folding/blob/master/docs/rules/begin-end.md:

Internally, the reference \1 is replaced by its source from beginRegex, so endRegex becomes #end [\w]+.

It gives

The rule (?<_2_5>^.{2,8}) captures anything...

So essentially, I need to change how I handle the capturing groups to be able to fix your issue...
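
To make the effect concrete, here is a rough TypeScript reconstruction of that substitution. It is only a sketch: the helper `firstGroupSource` is hypothetical, it naively assumes the pattern has no nested or escaped parentheses, and it does not reproduce the extension's real implementation.

```typescript
// Source of the "dd2" beginRegex and endRegex, with JSON escaping removed.
const beginSource = /^\/\/[^*]\S* +DD +DATA,DLM=(.{2,8})(?: |,|$)/.source;
const endSource = "^\\1";

// Hypothetical helper: return the source text of the first capturing group,
// skipping "(?:...)" style groups. Naive: no nested or escaped parentheses.
function firstGroupSource(pattern: string): string | null {
  const match = /\((?!\?)([^()]*)\)/.exec(pattern);
  return match ? (match[1] ?? null) : null;
}

// Replace \1 by the group's SOURCE (not its run-time value).
const groupSource = firstGroupSource(beginSource) ?? "";
const substituted = endSource.replace(/\\1/g, () => groupSource);

console.log(substituted);                              // "^.{2,8}"
console.log(new RegExp(substituted).test("dfhdfgh"));  // true
console.log(new RegExp(substituted).test("//*"));      // true
```

Because the substituted pattern accepts any line of at least two characters, every data line and every JCL statement would be taken as the end of the block.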

daiyam commented 2 years ago

@FALLAI-Denis Can you try with v0.18.0? I've fixed several bugs with capturing groups, and I've changed how endRegex is handled when it references a group.

FALLAI-Denis commented 2 years ago

Hi @daiyam

I did a first quick test with the context specified above. At first glance it seems to work, but I need to do a little more testing.

Debug result (a little reworked):

[document] lang: jcl, fileName: folding.jcl
[main] regex: /(?<_5_0>^\/\/\*)
              |(?<_4_1>^\/\/[^*]\S* +JOB(?:( |$)))
              |(?<_0_2>^\/\/[^*]\S* +PROC(?: |$))
              |(?<_4_3>^\/\/[^*]\S* +EXEC )
              |(?<_0_4>^\/\/[^*]\S* +DD +\*)
              |(?<_0_5>^\/\/[^*]\S* +DD +DATA,DLM=(.{2,8})(?: |,|$))
              |(?<_0_6>^\/\/[^*]\S* +DD +\S+,(?: |$))
              |(?<_2_6>^\/\/ +\S+[^,](?: |$))
              /g
[main] line: 1, offset: 0, type: SEPARATOR, match: //MYJOB    JOB , regex: 1
[main] line: 2, offset: 0, type: WHILE, match: //*, regex: 0
[main] line: 5, offset: 0, type: BEGIN, match: //MYPROC   PROC , regex: 2
[proc] regex: /(?<_2_2>^\/\/[^*]\S* +PEND(?: |$))
              |(?<_4_7>^\/\/[^*]\S* +EXEC )
              |(?<_0_8>^\/\/[^*]\S* +DD +\*)
              |(?<_0_9>^\/\/[^*]\S* +DD +DATA,DLM=(.{2,8})(?: |,|$))
              |(?<_0_10>^\/\/[^*]\S* +DD +\S+,(?: |$))
              |(?<_2_10>^\/\/ +\S+[^,](?: |$))
              /g
[proc] line: 6, offset: 0, type: END, match: //         P2='P2', regex: 10
[proc] line: 10, offset: 0, type: SEPARATOR, match: //MYSTEP   EXEC , regex: 7
[proc] line: 13, offset: 0, type: BEGIN, match: //OUT      DD DSN=&P2,, regex: 10
[proc] line: 17, offset: 0, type: END, match: //            SPACE=(CYL,(5,5),RLSE), regex: 10
[proc] line: 19, offset: 0, type: BEGIN, match: //A         DD  dfjdlfj,, regex: 10
[proc] line: 21, offset: 0, type: END, match: //            fkmfkdfpk, regex: 10
[proc] line: 23, offset: 0, type: BEGIN, match: //SYSIN    DD *, regex: 8
[proc] line: 30, offset: 0, type: BEGIN, match: //DATA     DD DATA,DLM=$$, regex: 9
[proc] line: 39, offset: 0, type: END, match: $$, regex: 9
[proc] line: 41, offset: 0, type: END, match: //MYPROC   PEND, regex: 2
[main] line: 42, offset: 0, type: WHILE, match: //*, regex: 0
[main] line: 43, offset: 0, type: SEPARATOR, match: //STEP     EXEC , regex: 3
[main] line: 45, offset: 0, type: END, match: //              P2='MY.FIC.BETA', regex: 6
[main] line: 46, offset: 0, type: WHILE, match: //*, regex: 0
[main] line: 47, offset: 0, type: SEPARATOR, match: //STEPNAME    EXEC , regex: 3
[main] line: 48, offset: 0, type: BEGIN, match: //THEFILE  DD   DSN=HLQ.DSN,, regex: 6
[main] line: 49, offset: 0, type: END, match: //             DISP=(OLD,DELETE), regex: 6
[main] line: 50, offset: 0, type: WHILE, match: //*, regex: 0
[document] foldings: [{"start":1,"end":3,"kind":1}
                     ,{"start":12,"end":16,"kind":3}
                     ,{"start":18,"end":20,"kind":3}
                     ,{"start":22,"end":26,"kind":3}
                     ,{"start":29,"end":38,"kind":3}
                     ,{"start":9,"end":39,"kind":3}
                     ,{"start":4,"end":40,"kind":3}
                     ,{"start":42,"end":45,"kind":3}
                     ,{"start":47,"end":48,"kind":3}
                     ,{"start":46,"end":49,"kind":3}
                     ,{"start":0,"end":49,"kind":3}]

I noticed a quirk on the following case:

image

I get a folding on the comment block, lines 12 to 14, which sits between DLM=$$ and $$, even though I specified "nested": false. The same sequence of text inside a PROC / PEND block does not cause folding.
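
As a sketch of the behavior I expect from "nested": false (my own simplified model in TypeScript, not the extension's code; `foldLines` and the `Fold` type are hypothetical names): once the DLM block is open, no other rule is tested until the delimiter line is reached.

```typescript
interface Fold { start: number; end: number; kind: "comment" | "region" }

const ddBegin = /^\/\/[^*]\S* +DD +DATA,DLM=(.{2,8})(?: |,|$)/;
const commentLine = /^\/\/\*/;

// Simplified scanner handling only the comment rule and the DLM rule.
// Line indexes are 0-based, as in the "foldings" debug output.
function foldLines(lines: string[]): Fold[] {
  const folds: Fold[] = [];
  let i = 0;
  while (i < lines.length) {
    const begin = ddBegin.exec(lines[i] ?? "");
    if (begin) {
      const delimiter = begin[1] ?? "";
      // Non-nested block: skip to the delimiter without testing other rules.
      let j = i + 1;
      while (j < lines.length && !(lines[j] ?? "").startsWith(delimiter)) j++;
      if (j < lines.length) folds.push({ start: i, end: j, kind: "region" });
      i = j + 1;
    } else if (commentLine.test(lines[i] ?? "")) {
      // Group consecutive comment lines into one fold.
      let j = i;
      while (j + 1 < lines.length && commentLine.test(lines[j + 1] ?? "")) j++;
      if (j > i) folds.push({ start: i, end: j, kind: "comment" });
      i = j + 1;
    } else {
      i++;
    }
  }
  return folds;
}

const sample = [
  "//DATA2    DD DATA,DLM=$$",
  "dfhdfgh",
  "//*",
  "//*",
  "//*",
  "/*",
  "$$",
  "//*",
  "//*",
];
console.log(foldLines(sample));
```

With this model the three //* lines inside the DLM block produce no comment fold; only the region from the DD statement to $$ and the trailing comment block are folded.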

[document] lang: jcl, fileName: folding2.jcl
[main] regex: /(?<_5_0>^\/\/\*)
              |(?<_4_1>^\/\/[^*]\S* +JOB(?:( |$)))
              |(?<_0_2>^\/\/[^*]\S* +PROC(?: |$))
              |(?<_4_3>^\/\/[^*]\S* +EXEC )
              |(?<_0_4>^\/\/[^*]\S* +DD +\*)
              |(?<_0_5>^\/\/[^*]\S* +DD +DATA,DLM=(.{2,8})(?: |,|$))
              |(?<_0_6>^\/\/[^*]\S* +DD +\S+,(?: |$))
              |(?<_2_6>^\/\/ +\S+[^,](?: |$))
              /g
[main] line: 1, offset: 0, type: SEPARATOR, match: //JOBNAME JOB , regex: 1
[main] line: 2, offset: 0, type: END, match: // MSGLEVEL=(1,1),REGION=0M,NOTIFY=&SYSUID.,SYSAFF=ANY, regex: 6
[main] line: 3, offset: 0, type: WHILE, match: //*, regex: 0
[main] line: 4, offset: 0, type: SEPARATOR, match: //STEP     EXEC , regex: 3
[main] line: 5, offset: 0, type: BEGIN, match: //THEFILE  DD DSN=HLQ.DSN,, regex: 6
[main] line: 6, offset: 0, type: END, match: //            DISP=(OLD,DELETE), regex: 6
[main] line: 7, offset: 0, type: BEGIN, match: //DATA2    DD DATA,DLM=$$, regex: 5
[main] line: 12, offset: 0, type: WHILE, match: //*, regex: 0
[main] line: 16, offset: 0, type: END, match: $$, regex: 5
[main] line: 17, offset: 0, type: WHILE, match: //*, regex: 0
[document] foldings: [{"start":4,"end":5,"kind":3}
                     ,{"start":11,"end":13,"kind":1}
                     ,{"start":6,"end":15,"kind":3}
                     ,{"start":3,"end":16,"kind":3}
                     ,{"start":0,"end":16,"kind":3}]

                ,{// DD INSTREAM DATA with DLM
                  "name": "dd2"
                 ,"beginRegex": "^\\/\\/[^*]\\S* +DD +DATA,DLM=(.{2,8})(?: |,|$)"
                 ,"endRegex": "^\\1"
                 ,"nested": false
                 }

Folding_rules_jcl.txt

FALLAI-Denis commented 2 years ago

PS: I have also lost folding on comment blocks inside PROC/PEND... I haven't found out why.

image

daiyam commented 2 years ago

The issue with "nested": false is fixed in v0.8.1.

daiyam commented 2 years ago

Here is the config to get comment folding everywhere:

 // JCL hierarchy
{// JOB
   "name": "job"
  ,"separatorRegex": "^\\/\\/[^*]\\S* +JOB(?: |$)"
  ,"strict": "never"
  ,"nested": [
    {// PROC - PEND
    "name": "proc"
    ,"beginRegex": "^\\/\\/[^*]\\S* +PROC(?: |$)"
    ,"endRegex": "^\\/\\/[^*]\\S* +PEND(?: |$)"
    ,"nested": [
        {// STEP
        "name": "exec"
        ,"separatorRegex": "^\\/\\/[^*]\\S* +EXEC "
        ,"nested": [
           {// Comment block
            "name": "comment"
           ,"whileRegex": "^\\/\\/\\*"
           ,"kind": "comment"
          }
          ,{// DD INSTREAM DATA without DLM
            "name": "dd-instream1"
           ,"beginRegex": "^\\/\\/[^*]\\S* +DD +\\*"
           ,"whileRegex": "^[^\\/][^\\*\\/]"
           ,"foldEOF": true
          }
          ,{// DD INSTREAM DATA with DLM
            "name": "dd2"
           ,"beginRegex": "^\\/\\/[^*]\\S* +DD +DATA,DLM=(.{2,8})(?: |,|$)"
           ,"endRegex": "^\\1"
           ,"nested": false
          }
          ,{// DD FILE
            "name": "dd-file"
           ,"beginRegex": "^\\/\\/[^*]\\S* +DD +\\S+,(?: |$)"
           ,"endRegex": "^\\/\\/ +\\S+[^,](?: |$)"
          }
        ]
      }
    ]
  }
 ]
}

I've just moved the comment block to the lowest level.

daiyam commented 2 years ago

@FALLAI-Denis Hi, were you able to test with the newer version?