antlr / antlr4

ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.
http://antlr.org
BSD 3-Clause "New" or "Revised" License
16.96k stars 3.26k forks source link

How can I disable or format the "Generated from" comment in antlr #2634

Open Thaina opened 5 years ago

Thaina commented 5 years ago

I don't want it to show a full path

// Generated from c:\Users\Thaina\Documents\grownith\NCalc2\grammer\NCalc.g4 by ANTLR 4.7.1

Not to mention it always annoying when changing machine and recompile g file, it will change this line. Why it is fullpath when it actually should be relative path?

KvanTTT commented 5 years ago

You can fix it on the postprocessing step.

Thaina commented 5 years ago

@KvanTTT Thank you but are there any document about antlr's postprocessing I could read?

KvanTTT commented 5 years ago

It's not a document, it's just a workaround :)

Thaina commented 5 years ago

@KvanTTT Oh that's sad. I just think that you mentioned about antlr's own postprocessing feature

ericvergnaud commented 5 years ago

This is actually a bug in certain targets. In Java, C#, Python 2, Python 3 and JavaScript, only the file name appears in the comment, not the full path

Thaina commented 5 years ago

@ericvergnaud I use C#

ericvergnaud commented 5 years ago

You must be using the tunnelvisionlabs distribution then, not the official one from ANTLR.

Thaina commented 5 years ago

@ericvergnaud I'm not so sure but I use antlr from https://github.com/mike-lischke/vscode-antlr4

ericvergnaud commented 5 years ago

that is not the official version (although mike is a key contributor to it)

Le 26 août 2019 à 23:27, Thaina Yu notifications@github.com a écrit :

@ericvergnaud https://github.com/ericvergnaud I'm not so sure but I use antlr from https://github.com/mike-lischke/vscode-antlr4 https://github.com/mike-lischke/vscode-antlr4 — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/antlr/antlr4/issues/2634?email_source=notifications&email_token=AAZNQJDWUWJK3RLM6DKGIKLQGPY6RA5CNFSM4IPH6C4KYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD5EW6HA#issuecomment-524906268, or mute the thread https://github.com/notifications/unsubscribe-auth/AAZNQJDUI2FBLA6OJSQWA6DQGPY6RANCNFSM4IPH6C4A.

Thaina commented 5 years ago

@ericvergnaud I see, thank you for your information

mike-lischke commented 5 years ago

@ericvergnaud Can you be more specific about the bug you mentioned? In the STG files I see always the same line (with an extra parameter for Java). They all use // Generated from <grammarFileName> by ANTLR <ANTLRVersion> where grammarFileName is provided by ANTLR.

sharwell commented 4 years ago

The header contains the path provided to the command line tool. It could be a filename only (if the current working directory contains the grammar), a relative path (e.g. the current working directory is the repository root, and ANTLR is invoked with a repository file path), or a fully qualified path. In all cases, the path is controlled by the tool (or user) which invokes ANTLR and not by the core ANTLR tool. Some of the tools that invoke ANTLR are part of this repository while others are not.

In this case, it sounds like the file is being generated by an external extension, so that extension would need to be updated to use a relative path instead of a full path.

bananasmoothii commented 4 years ago

I have the same problem, I don't want the absolute path to show up because there is my real name in it, and it's a bit boring to change it every time I want to regenerate the files. I could also make a mistake and upload my hole project's directory without removing my name from the antlr4 files. PS: I'm using the Intellij idea plugin (for Java)

sebersole commented 3 years ago

The header contains the path provided to the command line tool. It could be a filename only (if the current working directory contains the grammar), a relative path (e.g. the current working directory is the repository root, and ANTLR is invoked with a repository file path), or a fully qualified path. In all cases, the path is controlled by the tool (or user) which invokes ANTLR and not by the core ANTLR tool. Some of the tools that invoke ANTLR are part of this repository while others are not.

In this case, it sounds like the file is being generated by an external extension, so that extension would need to be updated to use a relative path instead of a full path.

Using this "relative path" approach unfortunately does not play well with tools. For example I am struggling with this issue in a Gradle task and Gradle's task caching. I don't have control of the pwd.

I mean it seems so easy to just add this as a switch.

ericvergnaud commented 3 years ago

As mentioned by @sharwell the problem originates from the plugin/tool providing a full path to the ANTLR4 tool instead of a relative one. For plugins/tools supported by ANTLR4, someone would need to submit a PR such that a working directory is set for the ANTLR4 tool and a relative path is used for the grammars

sebersole commented 3 years ago

Can I ask why y'all are so against this?

The tool writes something into the generated file that serves no functional purpose (and causes problems in cases) and that multiple users have said they simply do not want written. Just curious on the reasoning here.

sebersole commented 3 years ago

BTW, not sure how this is supposed to happen, but I used this following set up:

Working dir : /home/sebersole/projects/hibernate-orm/wip-6/hibernate-core
Relative grammar path : src/main/antlr/org/hibernate/grammars/importsql/SqlScriptParser.g4

And yep, the full path is still written into the file

// Generated from /home/sebersole/projects/hibernate-orm/wip-6/hibernate-core/src/main/antlr/org/hibernate/grammars/hql/HqlLexer.g4 by ANTLR 4.9.1
mike-lischke commented 3 years ago

@sebersole There are two aspects here:

  1. The name of the grammar, from which the source code was generated, is added to the source code. Why's that done at all?
  2. Source code generated by my vscode extension always contains the full path to the grammar.

For 1) I'm all for removing it. It serves no real purpose. For 2) The extension manages grammars in source context classes, which require an id to identify the grammar uniquely, even in cases where you have nested includes, shared grammars etc. I selected the (full) path as that id, simply because that's unique and exists already. This path is also used for generating the target file and by using absolute paths the extension has no problem to find a grammar and avoids headaches for the user, to manage multiple grammar projects or even versions, in different folders. That means, it's not easy to make this work with relative paths, but I'm willing to accept a PR to change the implementation, if someone really wants that.

sebersole commented 3 years ago

All I want is (1). It's pointless for that line in the generated file.

As for your (2), I could care less except that someone said earlier that using a relative path would do the same. I have no idea why a relative versus absolute would make any difference. And honestly I don't care, I just don't want that line added

On Mon, Aug 16, 2021, 5:09 AM Mike Lischke @.***> wrote:

@sebersole https://github.com/sebersole There are two aspects here:

  1. The name of the grammar from which the source code was generated is added to the source code. Why's that done at all?
  2. Source code generated by my vscode extension always contains the full path to the grammar.

For 1) I'm all for removing it. It serves no real purpose. For 2) The extension manages grammars in source context classes, which require an id to identify the grammar uniquely, even in cases where you have nested includes, shared grammars etc. I selected the (full) path as that id, simply because that's unique and exists already. This path is also used for generating the target file and by using absolute paths the extension has no problem to find a grammar and avoids headaches for the user, to manage multiple grammar projects or even versions, in different folders. That means, it's not easy to make this work with relative paths, but I'm willing to accept a PR to change the implementation, if someone really wants that.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/antlr/antlr4/issues/2634#issuecomment-899389335, or unsubscribe https://github.com/notifications/unsubscribe-auth/AABZIE4QJWXQNILOIHLX7HDT5DPVZANCNFSM4IPH6C4A . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&utm_campaign=notification-email .

tesshucom commented 1 year ago

After bumping the antlr version from 4.11.1 to 4.12.0, compilation on Windows Server started to fail. Doesn't reproduce on Ubuntu. It looks like there was a change in the contents of the header comment between 4.11.1 and 4.12.0 ... ? Sorry, I don't know the reason for this.

// Generated from java-escape by ANTLR 4.11.1
// Generated from com\tesshu\jpsonic\service\upnp\UPnPSearchCriteria.g4 by ANTLR 4.12.0

I came across this issue while doing this research.

It seems like an accidental problem. The \u in \upnp triggers the Invalid Unicode Error. Forced embedding of OS dependent path strings in comments doesn't seem to make sense.

ericvergnaud commented 1 year ago

Looks like your compiler is treating comments as strings ? Is that expected ?Envoyé de mon iPhoneLe 21 févr. 2023 à 07:15, tesshu.com @.***> a écrit : After bumping the antlr version from 4.11.1 to 4.12.0, compilation on Windows Server started to fail. Doesn't reproduce on Ubuntu. It looks like there was a change in the contents of the header comment between 4.11.1 and 4.12.0 ... ? Sorry, I don't know the reason for this. // Generated from java-escape by ANTLR 4.11.1 // Generated from com\tesshu\jpsonic\service\upnp\UPnPSearchCriteria.g4 by ANTLR 4.12.0

I came across this issue while doing this research. It seems like an accidental problem. The \u in \upnp triggers the Invalid Unicode Error. Forced embedding of OS dependent path strings in comments doesn't seem to make sense.

—Reply to this email directly, view it on GitHub, or unsubscribe.You are receiving this because you were mentioned.Message ID: @.***>

tesshucom commented 1 year ago

https://github.com/tesshucom/jpsonic/actions/runs/4230187249/jobs/7347252559

It seems to be a problem of JDK compilation specification, not a problem of Java syntax.

    public static void main (String[] args)
    {
        // \u000A System.out.println("Hi");
    }

The above snippet will output Hi.(Unicode conversion is done at compile time. As a result, it will not be considered as a comment.)


It may be a duplicate of #4128. And please understand that it may be a little out of touch with what is being discussed in this issue.

VanceDuncan commented 1 year ago

I'm having this exact issue with code generated by the Antlr 4.12.0 plugin on Windows. Based on the above feedback, it sounds like I might do myself a favor by dropping back down to 4.11.0.

There is a two-phase compilation process in my project: 1) compile the Antlr expression files, which creates dynamically generated code; 2) compile the static and generated sources. Unfortunately, the generated files contain the following header comment on line 1:

// Generated from com\mycompany\mysystem\server\model\Expression.gr by ANTLR 4.12.0

Each of the generated .java files contains the above header. When the compiler attempts to compile the classes, it produces this error:

illegal unicode escape:1

Charming. :(

I can't speak for everyone else on this thread, but it would be nice if the Antlr plugin at least properly escaped the comment. My concern is not that the file contains the source path. My concern is it causes my build process to fail on Windows. That will kill any CI/CD pipelines that run on Windows boxes as well.

Why not replace path separators with dots? as in...

// Generated from com.mycompany.mysystem.server.model.Expression.gr by ANTLR 4.12.0

Not only is that platform-independent, it's also more logical.

VanceDuncan commented 1 year ago

As hoped, dropping back to 4.11.1 resolved this issue. It would be great to see a 4.13 that fixes it. I suggest using dots rather than slashes. My $0.02.

Microsoft should be shot for introducing file systems to backslashes and spaces. Those two things have been the root of almost as much evil as premature optimization.

KvanTTT commented 1 year ago

// Generated from com.mycompany.mysystem.server.model.Expression.gr by ANTLR 4.12.0

Not only is that platform-independent, it's also more logical.

Ordinary slashes seem more logical:

// Generated from com/mycompany/mysystem/server/model/Expression.gr by ANTLR 4.12.0
sebersole commented 1 year ago

Ordinary slashes seem more logical:

The ability to not generate this line as users have been asking for years now seems to be more logical imho.

Thaina commented 1 year ago

I still think the best solution should be generated with relative path to the project folder that was executed

sebersole commented 1 year ago

While that may be true to you, this issue is explicitly about disabling the comment. If you want a specific format, you should probably open a new issue.

sebersole commented 1 year ago

Oops, I guess I misread the subject. Sorry.

But anyway, nothing precludes both. There is no "best" solution. Just the one each of us prefers.

Thaina commented 1 year ago

Yeah I want to say I am the one who open this issue and specifically say disable or format and even mention relative path from the start

Disables is second option if it could be done right of the bat

In my opinion, the generation of path comment was there to serve some purpose of the tools creator so I think we better kept it. But I have seen no reason at all to make it fullpath if it will changed when we sharing the code to team or other machine, it make git workflow become annoying, that was topmost reason I open this issue

VanceDuncan commented 1 year ago

@KvanTTT, to some extent, "logical" is subjective. But it seems to me that file paths are incidental to development environments, whereas package/namespace - such as that represented by com.mycompany.mysystem.server.model.Expression - is environment-independent, and therefore, IMHO, more logical. I'd also be good if the comment was removed altogether, as many others have previously pointed out. Logical or not, that way it would WORK, which reflects the material irritant in this case.

sebersole commented 1 year ago

Just voicing an opinion where the comment is just not useful for me at all. Personally I don't see how supporting a special format "none" is super complicated.

But regardless, as you already said, as long as there is a way to apply some formatting to remove system and temporal details, that ultimately works just as well - specifically, "up-to-date" checks don't get fooled.

tesshucom commented 1 year ago

How about assuming some generalized needs? Specifications that are widely adopted by other products may be adapted to modern environments or contain some best practices.

It is relatively common for code generators to include product information in headers. However, since it is irrelevant to processing, in most cases it can be turned off by setting. When running on CI, we may add a suppression setting to the checker to avoid conflicts with some code checkers. In a strict project, headers of auto-generated code and suppression reasons may be managed at the same time. Like get-out-of-jail card :) There may be some reasons for its existence.

Including path information in artifacts may not be a common pattern anymore. I think that confirmation of the output destination is guaranteed by the log at the time of generation, but depending on the CI, it is very nervous about outputting paths, and even that may be masked. It may be modern mainstream to take care not to disclose path information.

The name of the grammar, from which the source code was generated, is added to the source code. Why's that done at all?

I would like to know this, but the least I can say is that the implementation does not output namespaces, it outputs paths. If the user needs something like a namespace or id and the product substitutes it with a path output, it may not be a cool design.