Open EvgeniyKo opened 10 years ago
I believe the only reliable solution to this would be emitting the serialized ATN as an embedded resource rather than including it as a string. It's certainly achievable during the build process, but I haven't looked into the specifics.
Considering I've never heard of someone encountering this error with the C# target, can you give some specifics about the size of your lexer? If possible, could you send me a copy of it for further analysis?
The lexer is quite small, but there are lots of keywords. The generated lexer is 405 KB.
Unfortunately, I can't send the grammar to you because of my boss. If you want, I can send you the generated lexer.
I have a .g4 that reproduces this problem in VS 2010. I can send it directly to you to diagnose/fix this issue, but I don't think it would be OK to post it publicly. What's the best way to get it to you without making it publicly available?
An email address is associated with the Tunnel Vision Laboratories organization here. You can send it to that address. https://github.com/tunnelvisionlabs
Now I've got the same issue in the parser. Should I file another bug?
No, it's the same issue. Here are some potential ways to resolve this:
... The _ATN field in the parser would then be updated to load the data from the embedded resource instead of from a string literal.
... + operators, you might have 5 strings with 1000 + operators each (the recursion depth in the compiler is bounded by the number of operators in a single expression). The big difference is the Java target's string limit is actually based on a clear definition of limits in the class file format used by the JVM, so there's no question where the limit needs to be in order to ensure all grammars work properly. In the C# target, the limit is an arbitrarily imposed limit which is neither documented nor allowed by the language specification.
Considering that the first item is already available and that the overall limit imposed by the earlier compilers is much higher than seen in the Java target, I'm inclined to not make any changes (at least for the time being).
The first option is not possible for us: every developer would have to install the Roslyn compiler, I would have to update the build on the build machine, and the testers would have to start smoke testing again. All because of one file.
I really like the second option. Can ANTLR 4 generate the binary file with the serialized ATN?
Embedding a binary resource: Not currently supported; it would be a completely new feature requiring changes to the tool, code generation templates, MSBuild integration, and runtime library.
Splitting the serialized ATN into segments: Currently supported by the tool for the Java target, but would require changes to CSharpTarget.java and to the C# code generation templates.
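To illustrate the segment idea from the tool's side, here is a minimal sketch of cutting one long serialized string into fixed-size pieces, so that no single generated expression gets too large. AtnSegmenter and split are hypothetical names made up for this illustration; this is not the code in the ANTLR tool or in CSharpTarget.java, and the segment length is an arbitrary parameter.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of the ANTLR tool.
final class AtnSegmenter {
    // Cuts the serialized data into pieces of at most segmentLength characters.
    // Concatenating the pieces in order yields the original string again.
    static List<String> split(String serialized, int segmentLength) {
        if (segmentLength <= 0) {
            throw new IllegalArgumentException("segmentLength must be positive");
        }
        List<String> segments = new ArrayList<>();
        for (int start = 0; start < serialized.length(); start += segmentLength) {
            int end = Math.min(start + segmentLength, serialized.length());
            segments.add(serialized.substring(start, end));
        }
        return segments;
    }

    public static void main(String[] args) {
        // Tiny stand-in for a serialized ATN string.
        List<String> segments = split("abcdefg", 3);
        System.out.println(segments); // prints [abc, def, g]
    }
}
```

A code generator could then emit each piece as its own string field, which keeps the number of operators per expression bounded.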
How long would it take? Maybe I can help resolve this issue.
As far as I understand from the Java template, it generates an array of strings and then calls Utils.join() on it instead of emitting one concatenated string.
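For reference, the shape of that approach can be sketched roughly as follows. This is only an illustration: the class name, field names, and placeholder data are invented, and String.join from the standard library stands in for the runtime's Utils.join() so the snippet compiles on its own.

```java
// Hypothetical sketch of segmented string data; names and contents are invented.
final class SegmentedAtnExample {
    // Each segment is a separate expression with only a few + operators,
    // so no single expression becomes too long for the compiler.
    private static final String SEGMENT_0 = "seg0-" + "part-a|";
    private static final String SEGMENT_1 = "seg1-" + "part-b|";
    private static final String SEGMENT_2 = "seg2-" + "part-c";

    // The segments are joined once, at class initialization, with an empty
    // separator. String.join is a stand-in here for the runtime's join helper.
    static final String SERIALIZED_ATN =
            String.join("", SEGMENT_0, SEGMENT_1, SEGMENT_2);

    public static void main(String[] args) {
        System.out.println(SERIALIZED_ATN); // prints seg0-part-a|seg1-part-b|seg2-part-c
    }
}
```

The joined value is identical to what a single concatenated literal would produce; only the way the compiler sees the expression changes.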
I suppose this solution will work for me.
I have fixed the issue. Link to the build with the fix: https://drive.google.com/file/d/0B4sUnvtGhlljalhzQktldE1KdW8/edit?usp=sharing
CSC gives me the following error, with no file name, line, or column, only the project name:
I found the reason for this error:
https://connect.microsoft.com/VisualStudio/feedback/details/785173/got-error-cs1647-an-expression-is-too-long-or-complex-to-compile-in-vs2012
There is a huge string in the generated lexer, over 5000 lines long, which looks like this:
Any workaround would be very helpful.