github / semantic

Parsing, analyzing, and comparing source code across many languages

Add COBOL-85 Language Support #452

Open wbickford-fiserv opened 4 years ago

wbickford-fiserv commented 4 years ago

My organization utilizes COBOL and we've recently started using GitHub Enterprise. I met with @mcolyer to discuss some of our challenges and he recommended filing an issue with this project to get this language on your radar.

This request is two-part: 1) please consider adding COBOL support to your roadmap, and 2) if that isn't feasible, let us know whether you could accept a patch from us to get it done. If option 2 is feasible, are there any guides available for adding language support?

File Types: .c85_m and .cbl

patrickt commented 4 years ago

@wbickford-fiserv Hi there; thanks for your interest!

We accept language contributions to the semantic repo, but I can’t guarantee that language support in semantic translates directly to support in the GitHub frontend. In addition, we don’t currently support our code navigation features on GitHub Enterprise; I’ll be sure to let you know if that happens.

However, if you’re still interested in helping us out, the place to start would be to create a tree-sitter grammar for COBOL-85.

wbickford-fiserv commented 4 years ago

Thanks - this seems like a great place to start. Do you know what other systems we'd need to start hitting to improve COBOL support in GitHub Enterprise? Thank you for the grammar link. I'll dig into it.

wbickford-fiserv commented 4 years ago

Cross-referencing: https://github.com/spgennard/vscode_cobol/wiki

XVilka commented 4 years ago

@wbickford-fiserv you could check whether the ANTLR4-based COBOL parser is useful as a basis for a tree-sitter one.

spgennard commented 4 years ago

This grammar is missing two major verbs, COPY/REPLACE, which are handled outside of the parsing, making it unsuitable for any real-world use... plus it has some glaringly obvious mistakes in the grammar, such as being fixed-format based.
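To illustrate what "fixed-format based" implies, here is a minimal Python sketch of the traditional fixed-format column layout (columns 1-6 sequence number, column 7 indicator, columns 8-72 code, columns 73-80 identification). It is not taken from the grammar under discussion; the names `FixedFormatLine`, `split_fixed_format`, and `is_comment` are hypothetical and exist only for this example.

```python
from typing import NamedTuple


class FixedFormatLine(NamedTuple):
    """One source line split by fixed-format column position (hypothetical helper)."""
    sequence: str        # columns 1-6: sequence number area
    indicator: str       # column 7: '*' or '/' comment, '-' continuation, 'D' debug
    code: str            # columns 8-72: Area A and Area B
    identification: str  # columns 73-80: program identification area


def split_fixed_format(line: str) -> FixedFormatLine:
    # Pad short lines so every slice below is well defined.
    padded = line.rstrip("\n").ljust(80)
    return FixedFormatLine(
        sequence=padded[0:6],
        indicator=padded[6],
        code=padded[7:72].rstrip(),
        identification=padded[72:80].rstrip(),
    )


def is_comment(line: str) -> bool:
    # In fixed format, a comment is marked by '*' or '/' in column 7.
    return split_fixed_format(line).indicator in ("*", "/")


fixed = split_fixed_format("000100 IDENTIFICATION DIVISION.")
free = split_fixed_format("IDENTIFICATION DIVISION.")
print(fixed.code)  # the statement, as intended
print(free.code)   # truncated: the first seven characters were treated as columns 1-7
```

A parser that assumes this layout misreads free-format sources, as the second call shows: the leading characters of the statement are consumed as sequence number and indicator.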

Then the unfortunate fact is most COBOL sources are not ANS85 compliant but are either IBM Enterprise dialect or Micro Focus dialect.

Sorry to be a downer but I too think this is a good idea but honestly a bad/useless implementation is worse than no implementation.

XVilka commented 4 years ago

@spgennard well, it's open-source; you can open a bug there or send a pull request. But I think it is good enough to serve as a starting point for creating a tree-sitter grammar.

spgennard commented 4 years ago

> @spgennard well, it's open-source, you can open a bug there or send a pull request. But I think it is good enough to be as a starting point for creating a tree-sitter grammar.

Gee, thanks for telling me it is open-source, I would never have guessed.

I do speak from a degree of understanding of the subject.

Therefore, if anyone wants help in defining a minimum viable set of requirements for the parser, please feel free to drop me a note or contact me on my own project.

Neppord commented 4 years ago

Hi everyone! I just wanted to swing by and tell you all that I have created a repository for tree-sitter-COBOL: https://github.com/Neppord/tree-sitter-cobol. COBOL with all its dialects is quite a challenge, so I would appreciate help.

I will at least try to get a skeleton up and running and help people contribute to the repository.

A lot of time has passed since this issue was opened, and I think it's time to get some code down.

Neppord commented 4 years ago

A small dent has been made. I can't believe that it has been 26 days since I created the project; things do take longer than expected when running multiple projects. Tree-sitter COBOL now has basic support for the Identification Division and is slowly getting into the real stuff.

If someone would like to help out with the basics of a highlighting setup, that would be great. Setting up a basic pipeline for GitHub Actions would also be appreciated.

Neppord commented 4 years ago

@wbickford-fiserv what do .c85_m files contain? Is it a special kind of COBOL, or should they be highlighted the same way as .cbl files?

Is your codebase ANSI 85 compliant, or do you need MF or other vendor compatibility?

Neppord commented 4 years ago

I'm trying to get hold of a "pure" ANSI 85 syntax reference, and can't seem to find one (without a paywall). Can anyone link to a known good source (paywalled or otherwise)?

XVilka commented 4 years ago

@Neppord could you please set up CI in your repository, so it is ready for pull requests?

Neppord commented 4 years ago

@XVilka I'm sorry to say that I don't currently have the time to fix anything in this project, and interest from the community has been cold for long enough that I don't see myself working on it without external support/encouragement.

I'm currently not working on a COBOL codebase and therefore have no reason to work on this for my own sake.

If someone else would like to take over, I would happily schedule a meeting and share my experience.

The short version is that it is hard to implement the spec gradually due to the lack of tools for incremental design. The test tools only work on complete source files, which makes it hard to implement one feature at a time. So starting from an existing grammar and converting it is crucial to making this work in a timely manner. The grammar linked above is broken and doesn't work on most COBOL sources; there is a branch in my repo that tries to adapt it, but it was just too much work.

thlabbe commented 1 year ago

Hi, hearing that GNU Emacs 29 will (soon?) come with tree-sitter support, I feel the need to cross-reference this issue with cobol-mode.el. It offers very basic features ... but at least it's a good source of information/inspiration on the distinctions between COBOL dialects, keywords/verbs...