Using begin/end/nested and separator

FALLAI-Denis commented 3 years ago

Describe the issue

I want to use a declaration based on a beginRegex / endRegex with option "nested": true, and at the same time a hierarchical declaration of separator.

The "nested": true option seems incompatible with the use of separator: folding is not active even if the begin and end expressions are correctly identified.

With "nested": false there is no conflict with separator (see the comment block rule in the code example).

To reproduce

VSCode: 1.56.2
Explicit Folding: 0.13.1
Language: JCL
Language Provider: IBM Z Open Editor

Code Example

//MYJOB    JOB ,'MY JOB',CLASS=A,NOTIFY=&SYSUID
//****
//* COMMENTS
//****
//MYPROC   PROC P1='P1',
//         P2='P2'
//MYSTEP   EXEC PGM=MYPROG
//STEPLIB  DD DISP=SHR,DSN=MY.LOADLIB
//IN       DD DISP=SHR,DSN=&P1
//OUT      DD DSN=&P2,
//            DISP=(NEW,CATLG),
//            SPACE=(CYL,(5,5),RLSE)
//MYPROC   PEND
//*
//STEP     EXEC PROC=MYPROC,
//              P1='MY.FIC.ALPHA',
//              P2='MY.FIC.BETA'
//* That's all

Settings

"folding": {
      "jcl": [
         {// Comment block
          "beginRegex": "^\\/\\/\\*"
         ,"endRegex": "^\\/\\/[^*]"
         ,"foldLastLine": false
         ,"nested": false
         ,"kind":  "comment"
         }
        ,{// PROC - PEND
          "beginRegex": "^\\/\\/[^*]\\S* +PROC(?: |$)"
         ,"endRegex": "^\\/\\/[^*]\\S* +PEND(?: |$)"
         ,"nested": true
         }
        ,{// JOB
          "separatorRegex": "^\\/\\/[^*]\\S* +JOB "
         ,"strict": false
         ,"descendants": [
            {// EXEC
             "separatorRegex": "^\\/\\/[^*]\\S* +EXEC "
            ,"strict": false
            ,"descendants": [
               {// DD
                "separatorRegex": "^\\/\\/[^*]\\S* +DD "
               ,"strict": false
               }
             ]
            }
          ]
         }
      ]
     },

Expected behavior

We should have a folding starting at // MYPROC PROC and ending at // MYPROC PEND.
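
As a sanity check of that expectation, here is a minimal Python sketch (not the extension's code) that applies the PROC and PEND expressions from the settings above to the sample JCL and reports the 1-based lines they match. The helper name `find_proc_region` is hypothetical.

```python
import re

# Sample JCL from the "Code Example" section (line 1 is the JOB card).
JCL = """\
//MYJOB    JOB ,'MY JOB',CLASS=A,NOTIFY=&SYSUID
//****
//* COMMENTS
//****
//MYPROC   PROC P1='P1',
//         P2='P2'
//MYSTEP   EXEC PGM=MYPROG
//STEPLIB  DD DISP=SHR,DSN=MY.LOADLIB
//IN       DD DISP=SHR,DSN=&P1
//OUT      DD DSN=&P2,
//            DISP=(NEW,CATLG),
//            SPACE=(CYL,(5,5),RLSE)
//MYPROC   PEND
//*
//STEP     EXEC PROC=MYPROC,
//              P1='MY.FIC.ALPHA',
//              P2='MY.FIC.BETA'
//* That's all
""".splitlines()

BEGIN = re.compile(r"^//[^*]\S* +PROC(?: |$)")
END = re.compile(r"^//[^*]\S* +PEND(?: |$)")

def find_proc_region(lines):
    """Return (begin, end) as 1-based line numbers, or None if unmatched."""
    begin = None
    for number, line in enumerate(lines, start=1):
        if begin is None and BEGIN.search(line):
            begin = number
        elif begin is not None and END.search(line):
            return begin, number
    return None

print(find_proc_region(JCL))  # (5, 13)
```

The result agrees with the BEGIN and END lines reported in the debug log, so the expressions themselves are not the problem.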

Screenshots

[document] lang: jcl, fileName: folding.jcl
[main] regex: /(?<_0_0>^\/\/\*)|(?<_0_1>^\/\/[^*]\S* +PROC(?: |$))|(?<_2_1>^\/\/[^*]\S* +PEND(?: |$))|(?<_5_2>^\/\/[^*]\S* +JOB )|(?<_5_3>^\/\/[^*]\S* +EXEC )|(?<_5_4>^\/\/[^*]\S* +DD )/g
[main] line: 1, offset: 0, type: SEPARATOR, match: //MYJOB    JOB , regex: 2
[main] line: 2, offset: 0, type: BEGIN, match: //*, regex: 0
[nested=0] regex: /(?<_2_0>^\/\/[^*])/g
[nested=0] line: 5, offset: 0, type: END, match: //M
[main] line: 5, offset: 0, type: BEGIN, match: //MYPROC   PROC , regex: 1
[main] line: 7, offset: 0, type: SEPARATOR, match: //MYSTEP   EXEC , regex: 3
[main] line: 8, offset: 0, type: SEPARATOR, match: //STEPLIB  DD , regex: 4
[main] line: 9, offset: 0, type: SEPARATOR, match: //IN       DD , regex: 4
[main] line: 10, offset: 0, type: SEPARATOR, match: //OUT      DD , regex: 4
[main] line: 13, offset: 0, type: END, match: //MYPROC   PEND, regex: 1
[main] line: 14, offset: 0, type: BEGIN, match: //*, regex: 0
[nested=0] regex: /(?<_2_0>^\/\/[^*])/g
[nested=0] line: 15, offset: 0, type: END, match: //S
[main] line: 15, offset: 0, type: SEPARATOR, match: //STEP     EXEC , regex: 3
[main] line: 18, offset: 0, type: BEGIN, match: //*, regex: 0
[nested=0] regex: /(?<_2_0>^\/\/[^*])/g
[document] foldings: [{"start":1,"end":3,"kind":1},{"start":9,"end":13,"kind":3},{"start":6,"end":13,"kind":3},{"start":14,"end":17,"kind":3}]

Additional context

If I remove the content between // MYPROC PROC and // MYPROC PEND then the folding is done well:

[document] lang: jcl, fileName: folding.jcl
[main] regex: /(?<_0_0>^\/\/\*)|(?<_0_1>^\/\/[^*]\S* +PROC(?: |$))|(?<_2_1>^\/\/[^*]\S* +PEND(?: |$))|(?<_5_2>^\/\/[^*]\S* +JOB )|(?<_5_3>^\/\/[^*]\S* +EXEC )|(?<_5_4>^\/\/[^*]\S* +DD )/g
[main] line: 1, offset: 0, type: SEPARATOR, match: //MYJOB    JOB , regex: 2
[main] line: 2, offset: 0, type: BEGIN, match: //*, regex: 0
[nested=0] regex: /(?<_2_0>^\/\/[^*])/g
[nested=0] line: 5, offset: 0, type: END, match: //M
[main] line: 5, offset: 0, type: BEGIN, match: //MYPROC   PROC , regex: 1
[main] line: 10, offset: 0, type: END, match: //MYPROC   PEND, regex: 1
[main] line: 11, offset: 0, type: BEGIN, match: //*, regex: 0
[nested=0] regex: /(?<_2_0>^\/\/[^*])/g
[nested=0] line: 12, offset: 0, type: END, match: //S
[main] line: 12, offset: 0, type: SEPARATOR, match: //STEP     EXEC , regex: 3
[main] line: 15, offset: 0, type: BEGIN, match: //*, regex: 0
[nested=0] regex: /(?<_2_0>^\/\/[^*])/g
[document] foldings: [{"start":1,"end":3,"kind":1},{"start":4,"end":9,"kind":3},{"start":11,"end":14,"kind":3},{"start":0,"end":14,"kind":3}]
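
The two foldings lists above can be compared mechanically. The Python sketch below is illustrative only. It assumes, based on the second log, that folding ranges are 0-based while the "line:" values in the log are 1-based, and that a begin/end fold covers the END line. Under those assumptions it checks whether a range matching the PROC/PEND pair is present in each list. The helper name `has_fold` is hypothetical.

```python
import json

# "foldings" output copied from the two debug logs above.
WITH_BODY = json.loads(
    '[{"start":1,"end":3,"kind":1},{"start":9,"end":13,"kind":3},'
    '{"start":6,"end":13,"kind":3},{"start":14,"end":17,"kind":3}]'
)
WITHOUT_BODY = json.loads(
    '[{"start":1,"end":3,"kind":1},{"start":4,"end":9,"kind":3},'
    '{"start":11,"end":14,"kind":3},{"start":0,"end":14,"kind":3}]'
)

def has_fold(foldings, begin_line, end_line):
    """True if a fold spans the 1-based log lines begin_line..end_line."""
    wanted = (begin_line - 1, end_line - 1)  # assumed 0-based ranges
    return any((f["start"], f["end"]) == wanted for f in foldings)

# PROC/PEND were logged at lines 5 and 13 (full sample)
# and at lines 5 and 10 (body removed).
print(has_fold(WITH_BODY, 5, 13))     # False: the PROC/PEND fold is missing
print(has_fold(WITHOUT_BODY, 5, 10))  # True: the fold is produced
```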

daiyam commented 3 years ago

Here is a possible configuration (it's more complex):

"folding": {
  "jcl": [
    { // Comment block
      "beginRegex": "^\\/\\/\\*",
      "endRegex": "^\\/\\/[^*]",
      "foldLastLine": false,
      "nested": false,
      "kind": "comment"
    },
    { // EXEC
      "name": "exec",
      "strict": "never",
      "separatorRegex": "^\\/\\/[^*]\\S* +EXEC ",
      "nested": [
        { // DD
          "separatorRegex": "^\\/\\/[^*]\\S* +DD ",
        }
      ]
    },
    { // JOB
      "separatorRegex": "^\\/\\/[^*]\\S* +JOB ",
      "strict": false,
      "nested": [
        { // PROC - PEND
          "beginRegex": "^\\/\\/[^*]\\S* +PROC(?: |$)",
          "endRegex": "^\\/\\/[^*]\\S* +PEND(?: |$)",
          "nested": [
            "exec"
          ]
        },
        "exec"
      ]
    }
  ]
}

There are several changes:

rename descendants to nested, since both serve the same purpose
add "strict": "never" so that the default value for strict becomes false
add a name property to rules so that they can be included elsewhere by name
add support for begin/end rules nested inside a separator rule
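
To illustrate only the idea of including rules by name, here is a small Python sketch. It is a hypothetical resolver, not how the extension is implemented: it assumes rules are plain dictionaries and that a string inside "nested" refers to a top-level rule declared with the same "name".

```python
def resolve(rules):
    """Replace string references in "nested" lists by the named rule."""
    named = {rule["name"]: rule for rule in rules if "name" in rule}

    def expand(rule):
        nested = rule.get("nested")
        if not isinstance(nested, list):
            return rule  # "nested" may also be a boolean (true/false)
        children = [
            expand(named[item] if isinstance(item, str) else item)
            for item in nested
        ]
        return {**rule, "nested": children}

    return [expand(rule) for rule in rules]

# A trimmed version of the proposed configuration.
rules = [
    {"name": "exec", "separatorRegex": r"^//[^*]\S* +EXEC ",
     "nested": [{"separatorRegex": r"^//[^*]\S* +DD "}]},
    {"separatorRegex": r"^//[^*]\S* +JOB ",
     "nested": [
         {"beginRegex": r"^//[^*]\S* +PROC(?: |$)",
          "endRegex": r"^//[^*]\S* +PEND(?: |$)",
          "nested": ["exec"]},
         "exec",
     ]},
]

job = resolve(rules)[1]
print(job["nested"][1]["name"])               # exec
print(job["nested"][0]["nested"][0]["name"])  # exec
```

With this kind of lookup the EXEC rule is written once and reused both directly under JOB and inside PROC/PEND.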

What do you think? Should it match what you are looking for to do?

FALLAI-Denis commented 3 years ago

Hi @daiyam

Thank you once again for taking my requests into account.

I do not understand whether this is a proposal for a future change or an already working solution.

As for the complexity of writing the folding rules, it doesn't matter as long as the extension remains efficient.

I have several use cases that need begin/end inside a separator.

Note: I have another problem that I have not yet diagnosed and which concerns the use of the same end expression, for different begin expressions.

I applied the proposal as formulated, but it does not add the folding on the "PROC / PEND" group.

lang: jcl, regex: /(?<_0_0>^\/\/\*)|(?<_5_1>^\/\/[^*]\S* +EXEC )|(?<_5_2>^\/\/[^*]\S* +JOB )/g
line: 1, offset: 0, type: SEPARATOR, match: //MYJOB    JOB , regex: 2
line: 2, offset: 0, type: BEGIN, match: //*, regex: 0
[nested] line: 5, offset: 0, type: END, match: //M
line: 7, offset: 0, type: SEPARATOR, match: //MYSTEP   EXEC , regex: 1
line: 14, offset: 0, type: BEGIN, match: //*, regex: 0
[nested] line: 15, offset: 0, type: END, match: //S
line: 15, offset: 0, type: SEPARATOR, match: //STEP     EXEC , regex: 1
line: 18, offset: 0, type: BEGIN, match: //*, regex: 0
foldings: [{"start":1,"end":3,"kind":1},{"start":6,"end":13,"kind":3},{"start":14,"end":17,"kind":3},{"start":0,"end":17,"kind":3}]

FALLAI-Denis commented 3 years ago

If there should be any breaking changes from previous versions, there may be other changes to consider:

the "folding" key that holds the set of rules should perhaps refer more explicitly to the extension (one day there could be a conflict with another extension).
it should be possible to express the folding rules independently for each language, rather than grouping them all in the same "folding" object (or other name). For example, attach them directly to the language concerned:

    "[cobol]": {
        "files.autoGuessEncoding": true,
        "editor.rulers": [0,{"column":6,"color":"#603020"},7,11,{"column":72,"color": "#603020"},{"column": 80,"color": "#ff0000"}],
        "editor.folding": true,
        "editor.showFoldingControls": "always",
        "editor.foldingStrategy": "auto",
        "editor.foldingHighlight": true,
        "explicitFolding.rules": [
          <folding rules for cobol>
         ], 
       "explicitFolding.debug": false
        },
    "[jcl]": {
        "files.autoGuessEncoding": true,
        //"editor.rulers": [0,{"column":6,"color":"#603020"},7,11,{"column":72,"color": "#603020"},{"column": 80,"color": "#ff0000"}],
        "editor.folding": true,
        "editor.showFoldingControls": "always",
        "editor.foldingStrategy": "auto",
        "editor.foldingHighlight": true,
        "explicitFolding.rules": [
          <folding rules for jcl>
         ],
       "explicitFolding.debug": true
        },

Have you considered making child extensions, per language, that would implement the folding rules directly in the extension declaration ("package.json" file)? Or defining a standard framework for anyone who would like to develop such a child extension?

daiyam commented 3 years ago

It was a proposal, so that I could get feedback before making the changes, since they are quite substantial.
Identical end expressions will fail. If you have an example, I will think about it (maybe in a new issue).
I could deprecate folding while keeping it working (maybe with an alert asking users to update their config).
Yes, per-language configuration should be supported.
I could make a library so another extension could use it. Have you tried the extension https://marketplace.visualstudio.com/items?itemName=bitlang.cobol? Since it's open source, we could make a PR adding the foldings so that the language definition, syntax highlighting and foldings are in the same extension.

FALLAI-Denis commented 3 years ago

For language support for COBOL and JCL (and also REXX, HLASM and PL/I), we use "IBM Z Open Editor".

We also tested the extension "COBOL Language Support" from CA Broadcom, which is in fact an evolution of that of bitlang.

The IBM extension is richer than the other offers. I have asked IBM to consider language-based folding rather than simple indentations, but right now that request is not their priority. Cf. https://github.com/IBM/zopeneditor-about/issues But I will not fail to advertise your extension to users of "IBM Z Open Editor" as soon as it stabilizes with respect to my many requests.

For us, the "IBM Z Open Editor" extension is also more natural because it corresponds to our usual working context (IBM z/OS mainframe). It also includes other facilities that do not exist in the other extensions.

FALLAI-Denis commented 3 years ago

Regarding the evolution proposal, that suits me. I think these developments can open up a lot of possibilities. And in the end, you are the boss and you decide which direction to take!

daiyam commented 3 years ago

Hi Denis, here is the latest update: explicit-folding-0.13.1.vsix

I've used the following configuration:

"[jcl]": {
  "explicitFolding.debug": true,
  "explicitFolding.rules": [
    { // Comment block
      "name": "comment",
      "beginRegex": "^\\/\\/\\*",
      "endRegex": "^\\/\\/[^*]",
      "foldLastLine": false,
      "nested": false,
      "kind": "comment"
    },
    { // JOB
      "separatorRegex": "^\\/\\/[^*]\\S* +JOB ",
      "strict": "never",
      "nested": [
        { // PROC - PEND
          "name": "proc",
          "beginRegex": "^\\/\\/[^*]\\S* +PROC(?: |$)",
          "endRegex": "^\\/\\/[^*]\\S* +PEND(?: |$)",
          "nested": [
            { // EXEC
              "separatorRegex": "^\\/\\/[^*]\\S* +EXEC ",
              "nested": [
                { // DD
                  "separatorRegex": "^\\/\\/[^*]\\S* +DD ",
                }
              ]
            }
          ]
        }
      ]
    }
  ]
}

A begin/end rule with nested = [...] supports strict = false, which avoids the need to repeat the rules. The name property is used to add clarity to the debug output.

FALLAI-Denis commented 3 years ago

Hi @daiyam

Once again thank you very much for considering my requests.

I will look at these new developments and get back to you as soon as possible.

FALLAI-Denis commented 3 years ago

Hi @daiyam

I adapted my rules to the new syntax and made some tests: everything seems OK to me. I think this new version can be officially released.

In the future I may have other suggestions to make, but for now we'll take a break.

Thank you again for all your efforts, your involvement and your responsiveness.

daiyam commented 3 years ago

Great! I have to improve the documentation then I will officially release the new version.

Regarding the child extension, I should be able to do it by requiring the current extension and by passing the predefined configuration for the language.

If you have good configs, don't hesitate to share them in the discussions.

daiyam commented 3 years ago

The new version has been published.

I've added the while rule, so that you can write:

{ // Comment block
  "kind": "comment",
  "whileRegex": "^\\/\\/\\*"
}
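
The effect of such a while rule can be pictured with a short Python sketch. This is an assumption about the behaviour, not the extension's code: consecutive lines matching the expression are grouped into one region, and single-line runs are skipped because there is nothing to fold. The function name `while_regions` is hypothetical.

```python
import re

def while_regions(lines, pattern):
    """Group consecutive matching lines into (first, last) 1-based pairs."""
    regex = re.compile(pattern)
    regions, start = [], None
    # A trailing sentinel (None) closes a run that reaches the last line.
    for number, line in enumerate(list(lines) + [None], start=1):
        matched = line is not None and regex.search(line) is not None
        if matched and start is None:
            start = number
        elif not matched and start is not None:
            if number - 1 > start:
                regions.append((start, number - 1))
            start = None
    return regions

sample = [
    "//MYJOB    JOB ,'MY JOB',CLASS=A,NOTIFY=&SYSUID",
    "//****",
    "//* COMMENTS",
    "//****",
    "//MYPROC   PROC P1='P1',",
    "//*",
    "//STEP     EXEC PROC=MYPROC",
]
print(while_regions(sample, r"^//\*"))  # [(2, 4)]
```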

FALLAI-Denis commented 3 years ago

Hi @daiyam

Begin/While is something I need for some folding cases, but in these cases "while" must be evaluated from the first line (the "begin" line).

These cases concern lines with a continuation character (the last significant character of the line).

Sample:

//LABEL DD BLA,BLA,BLA,
// BLA,BLA,BLA,
// BLA,BLA

The folding region begins on the line containing the word "DD" and extends while a comma is present at the last significant position of the line.

Other sample:

//LABEL DD BLA,BLA,BLA
//NOT DD BLA,BLA,BLA,
// BLA

The "LABEL" line doesn't have a continuation comma: no folding region.

The "NOT" line has a continuation character but is not a continuation of the "LABEL" line.

So there are two cases to consider:

continuation mark is on preceding line (continuation start at first line in a range)
continuation mark is on following line (continuation start at second line in a range)

Perhaps this can be handled with a boolean indicator associated with "while"/"whileRegex"?

Or use "while" and "until" expressions:

"while": the analysis starts at the first line (the "begin" line) and continues while the expression is true; the last line (where the expression is false) is included in the folding region by default
"until": the analysis starts at the second line (the first line after the "begin" line) and continues while the expression is false; the last line (where the expression is true) is included in the folding region by default

Use case for "until": in COBOL, the continuation of a string value is marked by a dash character in column 7

123456 display "Helloxxxxxx....
123456- "World !"
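
The two cases (mark on the preceding line, mark on the following line) can be sketched in Python with a single boolean switch, along the lines of the indicator suggested above. This is only an illustration of the requested behaviour: the function name `continuation_regions` and the flag `mark_on_previous_line` are hypothetical, the "begin" test on the word DD is left out, and the comma test ignores end-of-line comments.

```python
import re

def continuation_regions(lines, mark, mark_on_previous_line):
    """Return (first, last) 1-based regions of continued statements.

    mark_on_previous_line=True:  a line matching `mark` is continued by the
                                 next line (JCL trailing comma).
    mark_on_previous_line=False: a line matching `mark` continues the line
                                 before it (COBOL dash in column 7).
    """
    regex = re.compile(mark)
    flags = [regex.search(line) is not None for line in lines]
    regions, i = [], 0
    while i < len(lines):
        j = i
        if mark_on_previous_line:
            while j < len(lines) - 1 and flags[j]:
                j += 1
        else:
            while j + 1 < len(lines) and flags[j + 1]:
                j += 1
        if j > i:
            regions.append((i + 1, j + 1))
        i = j + 1
    return regions

jcl_1 = ["//LABEL DD BLA,BLA,BLA,", "// BLA,BLA,BLA,", "// BLA,BLA"]
jcl_2 = ["//LABEL DD BLA,BLA,BLA", "//NOT DD BLA,BLA,BLA,", "// BLA"]
cobol = ['123456 display "Helloxxxxxx....', '123456- "World !"']

print(continuation_regions(jcl_1, r",$", True))       # [(1, 3)]
print(continuation_regions(jcl_2, r",$", True))       # [(2, 3)]
print(continuation_regions(cobol, r"^.{6}-", False))  # [(1, 2)]
```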

daiyam commented 3 years ago

continuation mark is on preceding line

The rule begin/continuation should be able to do it.

{
  "begin": "DD",
  "continuation": ","
}

continuation mark is on following line

The rule begin/while should do it.

{
  "begin": "DD",
  "whileRegex": "^\\/\\/(?!NOT).*,$"
}
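
Negative lookaheads are easy to mistype, so here is a quick Python check of the expression on the second sample (Python's `re` module treats this lookahead the same way as JavaScript). A negative lookahead is written `(?!...)`; with an extra `=` the group only rejects the literal text "=NOT".

```python
import re

lines = [
    "//LABEL DD BLA,BLA,BLA",
    "//NOT DD BLA,BLA,BLA,",
    "//         BLA,BLA,",
]

# (?!NOT) rejects lines where "NOT" follows the leading "//".
rejecting = re.compile(r"^//(?!NOT).*,$")
# (?!=NOT) rejects only lines where the literal text "=NOT" follows "//".
literal = re.compile(r"^//(?!=NOT).*,$")

print([bool(rejecting.search(line)) for line in lines])  # [False, False, True]
print([bool(literal.search(line)) for line in lines])    # [False, True, True]
```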

A begin/until is the same as a begin/end. I've almost renamed the rule...

If you give me some examples, I should be able to test more precisely.

daiyam commented 3 years ago

Hi @FALLAI-Denis,

I've made a change that affects the comment block with begin/end, but the following rule works:

"[jcl]": {
  "explicitFolding.rules": [
    { // Comment block
      "name": "comment",
      "beginRegex": "^\\/\\/\\*",
      "endRegex": "^\\/\\/[^*]",
      "consumeEnd": false,
      "nested": false,
      "kind": "comment"
    },
  ]
}

I've added the property consumeEnd to indicate that the line matched by endRegex isn't part of the region. It avoids a conflict found in C++ with #51.
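
The difference can be pictured with a small Python sketch of the behaviour described here. It is not the extension's implementation, and the function name `comment_regions` and its `consume_end` parameter are hypothetical stand-ins for the rule and its property.

```python
import re

BEGIN = re.compile(r"^//\*")
END = re.compile(r"^//[^*]")

def comment_regions(lines, consume_end):
    """Fold from a BEGIN line to the next END line (1-based, inclusive).

    With consume_end=False the END line is left out of the region.
    """
    regions, start = [], None
    for number, line in enumerate(lines, start=1):
        if start is None:
            if BEGIN.search(line):
                start = number
        elif END.search(line):
            last = number if consume_end else number - 1
            if last > start:
                regions.append((start, last))
            start = None
    return regions

sample = [
    "//MYJOB    JOB ,'MY JOB',CLASS=A,NOTIFY=&SYSUID",
    "//****",
    "//* COMMENTS",
    "//****",
    "//MYPROC   PROC P1='P1',",
]
print(comment_regions(sample, consume_end=True))   # [(2, 5)]
print(comment_regions(sample, consume_end=False))  # [(2, 4)]
```

Leaving the PROC line out of the comment region keeps it available as the start of its own region.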

daiyam commented 3 years ago

Hi @FALLAI-Denis,

Can you tell me the expected foldings for DD in the following sample (the line numbers are not part of the JCL)?

 1 //MYPROC   PROC P1='P1',
 2 //         P2='P2'
 3 //*
 4 //* SECOND COMMENT BLOCK
 5 //*
 6 //MYSTEP   EXEC PGM=MYPROG
 7 //STEPLIB  DD DISP=SHR,DSN=MY.LOADLIB
 8 //IN       DD DISP=SHR,DSN=&P1
 9 //OUT      DD DSN=&P2,
10 //            DISP=(NEW,CATLG),
11 //            SPACE=(CYL,(5,5),RLSE)
12 //*
13 //A         DD  dfjdlfj,
14 //            ldfjdflkj,
15 //            fkmfkdfpk
16 /*
17 //MYPROC   PEND

FALLAI-Denis commented 3 years ago

Regarding folding on DD statements, I'm trying to implement the following rule:

the folding begins where the word DD is present on a line that starts with the sequence //, which may be immediately followed by a sequence of characters that does not begin with the character * (//* is a comment line)
the folding extends over the following lines and continues as long as the current line ends with a , character after the first sequence of characters that is not contiguous to the start-of-line sequence // (the first blank after that sequence of characters marks the end of the useful data; the characters after the first blank are comments)
there may be comment lines (//* at the start of the line) inserted in the sequence of lines concerned by the folding

In the example above, the regions to fold with respect to the presence of DD are:

from line 9 to line 11
from line 13 to line 15

Lines 12 and 16 are not part of the folding.

On the other hand, if comment lines (//*) had been present between lines 9 and 11, or lines 13 and 15, they would have been part of the folding region concerned.

The generic form of a folding region for a DD declaration is:

//LABEL DD WORD,WORD,WORD,    end of line comment
//* comment line inside folding region (continuation comma on previous not comment line)
//* another comment line inside folding region (continuation comma on previous not comment line)
//         WORD,WORD,WORD,    end of line comment
//         WORD
//* comment line outside of folding region (no continuation comma on previous non-comment line)
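
The rule described above can be written down as a short Python sketch. This is one possible reading of the rule, not the extension's behaviour: only the first blank-delimited field after DD (or after the leading blanks of a continuation line) is inspected for a trailing comma, and comment lines are skipped while a continuation is pending. The names `dd_regions`, `DD_LINE`, `CONTINUATION` and `COMMENT` are hypothetical.

```python
import re

COMMENT = re.compile(r"^//\*")
DD_LINE = re.compile(r"^//(?!\*)\S* +DD +(\S+)")
CONTINUATION = re.compile(r"^// +(\S+)")

def dd_regions(lines):
    """Return (first, last) 1-based regions for continued DD statements."""
    regions, i = [], 0
    while i < len(lines):
        match = DD_LINE.search(lines[i])
        if not match or not match.group(1).endswith(","):
            i += 1
            continue
        first, last, j = i, i, i + 1
        continued = True
        while continued and j < len(lines):
            if COMMENT.search(lines[j]):
                j += 1  # comment lines do not end the statement
                continue
            operand = CONTINUATION.search(lines[j])
            if operand is None:
                break  # not a continuation line: stop before it
            last = j
            continued = operand.group(1).endswith(",")
            j += 1
        if last > first:
            regions.append((first + 1, last + 1))
        i = last + 1
    return regions

generic = [
    "//LABEL DD WORD,WORD,WORD,    end of line comment",
    "//* comment line inside folding region",
    "//* another comment line inside folding region",
    "//         WORD,WORD,WORD,    end of line comment",
    "//         WORD",
    "//* comment line outside of folding region",
]
print(dd_regions(generic))  # [(1, 5)]
```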

daiyam commented 3 years ago

Yep, I can't make a working rule for those conditions... yet. The issue is with the comment inside the region.

daiyam commented 3 years ago

Can you try:

{ // JOB
    "name": "job",
    "separatorRegex": "^\\/\\/[^*]\\S* +JOB(?: |$)",
    "strict": "never",
    "nested": [
        { // PROC - PEND
            "name": "proc",
            "beginRegex": "^\\/\\/[^*]\\S* +PROC(?: |$)",
            "endRegex": "^\\/\\/[^*]\\S* +PEND(?: |$)",
            "nested": [
                { // EXEC
                    "name": "exec",
                    "separatorRegex": "^\\/\\/[^*]\\S* +EXEC ",
                    "nested": [
                        {
                            "name": "dd",
                            "beginRegex": "^\\/\\/[^*]\\S* +DD .*,$",
                            "endRegex": "^\\/\\/(?!(?:\\*|\\S* +DD \\S+,(?: |$)| +\\S+,(?: |$)))"
                        },
                        { // Comment block
                            "whileRegex": "^\\/\\/\\*",
                            "kind": "comment"
                        },
                    ]
                }
            ]
        }
    ]
}

It's a bit hacky but it seems to work... I'm using endRegex to consume the last line which won't be a continuation line.
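
The two "dd" expressions can be checked outside the editor with a Python sketch that pairs each line matching beginRegex with the next line matching endRegex, using the 17-line sample from earlier in the thread. It only exercises the regular expressions: nesting, the comment rule and the extension's real matching loop are not modelled, and the name `begin_end_regions` is hypothetical.

```python
import re

BEGIN = re.compile(r"^//[^*]\S* +DD .*,$")
END = re.compile(r"^//(?!(?:\*|\S* +DD \S+,(?: |$)| +\S+,(?: |$)))")

SAMPLE = """\
//MYPROC   PROC P1='P1',
//         P2='P2'
//*
//* SECOND COMMENT BLOCK
//*
//MYSTEP   EXEC PGM=MYPROG
//STEPLIB  DD DISP=SHR,DSN=MY.LOADLIB
//IN       DD DISP=SHR,DSN=&P1
//OUT      DD DSN=&P2,
//            DISP=(NEW,CATLG),
//            SPACE=(CYL,(5,5),RLSE)
//*
//A         DD  dfjdlfj,
//            ldfjdflkj,
//            fkmfkdfpk
/*
//MYPROC   PEND
""".splitlines()

def begin_end_regions(lines):
    """Pair each BEGIN line with the next END line (1-based, inclusive)."""
    regions, start = [], None
    for number, line in enumerate(lines, start=1):
        if start is None:
            if BEGIN.search(line):
                start = number
        elif END.search(line):
            regions.append((start, number))
            start = None
    return regions

print(begin_end_regions(SAMPLE))  # [(9, 11), (13, 15)]
```

The END expression matches the first line that is neither a comment nor a line whose first field ends with a comma, which is exactly the last line of the statement.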

zokugun / vscode-explicit-folding