zokugun / vscode-explicit-folding

Customize your Folding for Visual Studio Code
MIT License

Folding for COBOL language #39

Closed. FALLAI-Denis closed this issue 3 years ago.

FALLAI-Denis commented 3 years ago

Describe the issue

I would like to use the extension to implement folding on the COBOL language with respect to its structure:

The particularity of this language is that a structure does not necessarily have an end marker, or that one end marker can be shared by several start markers. This is the case for the hierarchical DIVISION, SECTION and paragraph structures: a structure ends where another structure of the same or a parent level begins. This is also the case for the "period-space" sequence, which marks the end of all open groups. The expression that marks the end of a group may not itself be part of the group.

I encounter various difficulties in the implementation of regular expressions. I tested these regular expressions on the site https://regex101.com/, where they work, but when I implement them in the settings.json under VS Code, the extension does not recognize them and the fold marks do not appear in the editor.

Questions :

To reproduce

Code Example

       IDENTIFICATION DIVISION.
       PROGRAM-ID. MYPROG.
       DATE-COMPILED.
       ENVIRONMENT DIVISION.
       CONFIGURATION SECTION.
       SOURCE-COMPUTER. IBM-370.
       OBJECT-COMPUTER. IBM-370.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
            select DD-FICHIER assign to UT-S-DD.
       DATA DIVISION.
       FILE SECTION.
       FD  DD-FICHIER
           block contains 0 records
           recording mode is F.
           copy DDCOPY.
       WORKING-STORAGE SECTION.
       01  ADATA       PIC X(1).
       LINKAGE SECTION.
       01  APARM       PIC X(10).
       PROCEDURE DIVISION using APARM.
       MAIN SECTION.
       START-OF-RUN.
           display "Hello World !"
           .
       END-OF-RUN.
           goback.

Settings

    "[cobol]": {
      "files.autoGuessEncoding": true,
      "editor.rulers": [0,{"column":6,"color":"#603020"},7,11,{"column":72,"color": "#603020"},{"column": 80,"color": "#ff0000"}],
      "editor.folding": true,
      "editor.showFoldingControls": "always",
      "editor.foldingStrategy": "auto",
      "editor.foldingHighlight": true
      },

    "folding": {
      "cobol": [
        {"beginRegex": "(?<=^.{6}\\s).*?DIVISION",
          "endRegex": ".$(?=\\S*\\s+\\S+\\s+DIVISION)",
          "foldLastLine": true
        }
      ]
    },

Expected behavior

Folding should start at "xxxx DIVISION" and should end before the next "xxxx DIVISION" or at the end of the source text.

Screenshots

[Screenshot: FoldingCobolDivisionStart]

[Screenshot: FoldingCobolDivisionEnd]

Note: no solution for "end of text".

[Screenshot: FoldingCobolDivisionResult]

No folding...

Additional context

This works, but it is not satisfactory because it is not generic. It would not be applicable to "SECTION", whose names are not all fixed by the language:

    "folding": {
      "cobol": [
        {"beginRegex": "\\bID\\S*\\s+DIVISION",
         "endRegex": ".(?=\\bEN\\S*\\s+DIVISION)",
         "foldLastLine": false
        },
        {"beginRegex": "\\bEN\\S*\\s+DIVISION",
         "endRegex": ".(?=\\bDA\\S*\\s+DIVISION)",
         "foldLastLine": false
        },
        {"beginRegex": "\\bDA\\S*\\s+DIVISION",
         "endRegex": ".(?=\\bPR\\S*\\s+DIVISION)",
         "foldLastLine": false
        },
        {"beginRegex": "\\bPR\\S*\\s+DIVISION",
         "endRegex": "\\*end-of-text",
         "foldLastLine": false
        }
      ]
    },

[Screenshot: FoldingCobolDivisionUnfolded]

[Screenshot: FoldingCobolDivisionFolded]

daiyam commented 3 years ago

Have you tried?

    "cobol": [
      {
        "separatorRegex": "\\s*\\w+\\s(?:DIVISION|SECTION)\\."
      }
    ]

FALLAI-Denis commented 3 years ago

Hi @daiyam

Thank you for your reply.

Your solution produces a result by placing folding points on the DIVISIONs and SECTIONs, but it does not handle the hierarchical dependency between the two:

SECTIONs are included in DIVISIONs, and paragraphs are included in a SECTION or a DIVISION.

DIVISION, SECTION and paragraph in the COBOL language should be seen like the H1, H2 and H3 tags in HTML: the levels are nested within each other.

What the extension would need is a "header/level" type of folding mechanism, in addition to the "begin/end" and "separator" folding mechanisms.
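To make the "header/level" idea concrete, here is a minimal Python sketch. It is only an illustration and not how the extension works: the two header patterns and the names `HEADER_LEVELS`, `header_level` and `fold_ranges` are assumptions made for this example. A header closes every open header of the same or a deeper level, and the end of the text closes whatever is still open.

```python
import re

# Hypothetical header patterns, ordered from the outermost to the innermost level.
# Columns 1-6 are skipped, then the name must start within the next few columns.
HEADER_LEVELS = [
    re.compile(r"^.{6}\s{1,4}\S+\s+DIVISION\b", re.IGNORECASE),
    re.compile(r"^.{6}\s{1,4}\S+\s+SECTION\b", re.IGNORECASE),
]


def header_level(line):
    """Return the 1-based level of a header line, or None for any other line."""
    for level, pattern in enumerate(HEADER_LEVELS, start=1):
        if pattern.search(line):
            return level
    return None


def fold_ranges(lines):
    """Return sorted (start, end) pairs of 0-based, inclusive line indexes."""
    ranges = []
    open_headers = []  # stack of (level, start_index)
    for index, line in enumerate(lines):
        level = header_level(line)
        if level is None:
            continue
        # A header closes every open header of the same or a deeper level.
        while open_headers and open_headers[-1][0] >= level:
            _, start = open_headers.pop()
            ranges.append((start, index - 1))
        open_headers.append((level, index))
    # The end of the text closes everything that is still open.
    while open_headers:
        _, start = open_headers.pop()
        ranges.append((start, len(lines) - 1))
    return sorted(ranges)


sample = [
    "       IDENTIFICATION DIVISION.",
    "       PROGRAM-ID. MYPROG.",
    "       ENVIRONMENT DIVISION.",
    "       CONFIGURATION SECTION.",
    "       SOURCE-COMPUTER. IBM-370.",
    "       INPUT-OUTPUT SECTION.",
    "       FILE-CONTROL.",
    "       DATA DIVISION.",
    "       WORKING-STORAGE SECTION.",
    "       01  ADATA       PIC X(1).",
]
print(fold_ranges(sample))
# [(0, 1), (2, 6), (3, 4), (5, 6), (7, 9), (8, 9)]
```

In this sketch the ENVIRONMENT DIVISION range (2, 6) contains both of its SECTION ranges, which is the nesting that a flat "separator" rule cannot express.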

I tried these solutions:

    "folding": {
        "cobol": [
          {
            "separatorRegex": "^.{6}\\s{1,4}\\S+\\s+DIVISION"
          }
        ]
      },

Works as expected for DIVISION.

    "folding": {
        "cobol": [
          {
            "separatorRegex": "^.{6}\\s{1,4}\\S+\\s+SECTION"
          }
        ]
      },

Works as expected for SECTION (and for the start of the text, which is not a SECTION).

Putting both together does not work as expected:

Note: a positive lookbehind does not work in the extension, e.g. "separatorRegex": "(?<=^.{6}\\s{1,4})\\S+\\s+SECTION", although the same expression works on regex101.com.

FALLAI-Denis commented 3 years ago

I think the problem is with the overlap between the start and end markers.

I managed to get an approximate result with the following configuration:

    "folding": {
      "cobol": [
       {"beginRegex": "\\s{1,4}\\S+\\s+DIVISION"
       ,"endRegex": ".(?=(\\s{1,4}\\S+\\s+DIVISION)|\\*end-of-text)"
       ,"foldLastLine": false
       }
       ,
       {"beginRegex": "\\s{1,4}\\S+\\s+SECTION"
       ,"endRegex": ".(?=.((\\s{1,4}\\S+\\s+(SECTION|DIVISION))|\\*end-of-text))"
       ,"foldLastLine": false
       }
      ]
    },

Several tips implemented:

[Screenshot: FoldingCobol]

daiyam commented 3 years ago

@FALLAI-Denis I won't have much time this week. This weekend, I will add an option so that EOF closes the open folding ranges, so you won't need to add *end-of-text (and I will also add the ignore-case option).

FALLAI-Denis commented 3 years ago

Thank you so much.

daiyam commented 3 years ago

I've made the changes in the master branch.

Here are the steps to test it:

Or uninstall the extension and install explicit-folding-0.11.0.vsix (remove .zip) by dropping the file onto the list of extensions.

Here is the config I used:

    "folding": {
        "cobol": [
            {
                "beginRegex": "\\s{1,4}\\S+\\s+DIVISION",
                "endRegex": ".(?=\\s{1,4}\\S+\\s+DIVISION)",
                "foldLastLine": false,
                "foldEOF": true
            },
            {
                "beginRegex": "\\s{1,4}\\S+\\s+SECTION",
                "endRegex": ".(?=.\\s{1,4}\\S+\\s+(?:SECTION|DIVISION))",
                "foldLastLine": false,
                "foldEOF": true
            }
        ]
    },
    "explicitFolding.debug": true,

FALLAI-Denis commented 3 years ago

Hi @daiyam

I installed the extension using the second method (uninstall, download the zip, install from the vsix), then reloaded VS Code.

  1. Without modifying settings.json: still OK.
  2. Adding the option "explicitFolding.debug": true: OK (but I don't know where to find the log).
  3. Adding the condition "foldEOF": true to replace the "end-of-text" test: OK.

Great job!

daiyam commented 3 years ago

The debug logs are written to the Folding channel of the Output panel (View / Output).

FALLAI-Denis commented 3 years ago

All's right ! Very great ! Very happy !

[Screenshot: ExplicitFolding]

daiyam commented 3 years ago

Great!

I'm thinking of adding support for:

    "folding": {
      "cobol": [
        {
          "separatorRegex": "\\s{1,4}\\S+\\s+DIVISION",
          "descendants": [
            {
              "separatorRegex": "\\s{1,4}\\S+\\s+SECTION"
            }
          ]
        }
      ]
    }

Because the current working config is a bit too hacky. It only works thanks to the spaces at the beginning. (When I remove them, I can't get a config working...)

FALLAI-Denis commented 3 years ago

@daiyam

Great idea ! with "cascading" descendants ?

In COBOL language, nested levels are:

  1. Division: the name starts between columns 8 and 11 and is followed by the keyword "division"
  2. Section: the name starts between columns 8 and 11 and is followed by the keyword "section"
  3. Paragraph: the name starts between columns 8 and 11, with no word after the name (only a "period-space" sequence)
  4. Sentence: starts between columns 12 and 72 and ends with a "period-space" sequence
  5. Instruction: starts with a language keyword and ends before another instruction or with a "period-space" sequence
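As an illustration of the first three levels, here is a minimal Python sketch that classifies a line as a division, section or paragraph header. It is a rough approximation under stated assumptions: column 7 is required to be blank (so comment lines are skipped), names are limited to letters, digits and hyphens, and the names `AREA_A_NAME` and `classify` are made up for this example. Sentences and instructions (levels 4 and 5) are not handled.

```python
import re

# Columns 1-6 are ignored, column 7 must be blank (assumption: this skips
# comment and debug lines), and the name starts somewhere in columns 8-11.
AREA_A_NAME = r"^.{6}[ ]\s{0,3}(?P<name>[A-Za-z0-9-]+)"

DIVISION = re.compile(AREA_A_NAME + r"\s+DIVISION\b", re.IGNORECASE)
SECTION = re.compile(AREA_A_NAME + r"\s+SECTION\b", re.IGNORECASE)
# A paragraph name is followed directly by a period, then a space or end of line.
PARAGRAPH = re.compile(AREA_A_NAME + r"\.(?:\s|$)")


def classify(line):
    """Return "division", "section", "paragraph", or None for other lines."""
    if DIVISION.match(line):
        return "division"
    if SECTION.match(line):
        return "section"
    if PARAGRAPH.match(line):
        return "paragraph"
    return None


for text in [
    "       PROCEDURE DIVISION using APARM.",
    "       MAIN SECTION.",
    "       START-OF-RUN.",
    '           display "Hello World !"',
]:
    print(classify(text), "<-", text.strip())
```

With these patterns, PROGRAM-ID. and SOURCE-COMPUTER. are also reported as paragraphs, while data entries such as 01 ADATA are not, because their first word is not followed by a period.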

As noted, some "old" languages use formatting tied to positions on a text line, often a line limited to 80 characters (punch card format). Depending on the language, you must ignore the start of the line and/or the end of the line. In COBOL, columns 1 to 6 are reserved for numbering, and columns 73 to 80 are reserved for marking. Only columns 7 to 72 are to be taken into consideration.

Sometimes a word can even change meaning depending on its position in the line (in assembly languages, for example). In COBOL, column 7 is reserved for identifying the type of line (comment, debug), columns 8 to 11 are reserved for division, section and paragraph names, and columns 12 to 72 are reserved for instructions.

Since the extension works line by line and not globally on the whole text, it might be useful to be able to specify the columns of the line to which the regular expression is applied.
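A minimal Python sketch of what I mean by applying an expression to a range of columns only. This is not an existing option of the extension: the helper names `code_area` and `search_in_columns` and the default columns 7 to 72 are assumptions for the example.

```python
import re


def code_area(line, first_col=7, last_col=72):
    """Return the text in columns first_col..last_col (1-based, inclusive)."""
    return line[first_col - 1:last_col]


def search_in_columns(pattern, line, first_col=7, last_col=72):
    """Search the pattern only inside the given column window of the line."""
    return re.search(pattern, code_area(line, first_col, last_col))


# An 80-column card image: numbering in columns 1-6, marking in columns 73-80.
header = "000100 IDENTIFICATION DIVISION.".ljust(72) + "MARK0001"
# A statement whose marking area happens to contain the word DIVISION.
statement = '000200     display "x"'.ljust(72) + "DIVISION"

print(search_in_columns(r"\bDIVISION\b", header) is not None)     # True
print(search_in_columns(r"\bDIVISION\b", statement) is not None)  # False
# Restricting the window to columns 8-11 tells header names from statements.
print(search_in_columns(r"^\s{0,3}\S", header, 8, 11) is not None)     # True
print(search_in_columns(r"^\s{0,3}\S", statement, 8, 11) is not None)  # False
```

Slicing the line first keeps the expressions themselves simple: they no longer need a leading `^.{6}` or a lookbehind to skip the numbering area.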

daiyam commented 3 years ago

Here is the updated extension explicit-folding-0.11.0.vsix, which supports descendants, but only with separator/separatorRegex rules. So the following config works:

    "folding": {
      "cobol": [
        {
          "separatorRegex": "\\s{1,4}\\S+\\s+DIVISION",
          "descendants": [
            {
              "separatorRegex": "\\s{1,4}\\S+\\s+SECTION"
            }
          ]
        }
      ]
    }

FALLAI-Denis commented 3 years ago

Hi @daiyam

Please, see my comment https://github.com/zokugun/vscode-explicit-folding/issues/41#issuecomment-827061895

FALLAI-Denis commented 3 years ago

Solved.

daiyam commented 3 years ago

I will wait a few days before publishing the update officially. I'm waiting for feedback from another user to check that I haven't broken anything...