Closed j4m3z0r closed 5 years ago
I'm looking into it. To my own surprise it appears there is no support for preserving #region/#endregion
in the output (like, gee, I don't remember not writing the code for that), but the deletion of newlines is certainly a bug.
Awesome. Thank you! Let me know when there's something to test and I'll give it a go.
Fixed in v2.6.2.3.
The problem was that code in EcsPreprocessor
intended to ignore one newline at the end of a preprocessor directive could actually ignore many. I decided the proper fix was to add support for preserving #region/#endregion
and to preserve newlines before and after preprocessor directives. (I found it hard to decide because I thought it might be hard to "deduplicate" newlines in the printer, which for correct output must always print a newline before and after any preprocessor directive even if the syntax tree doesn't have one. However the deduplication turned out to be pretty easy.)
I think there's still some problems here. I just pulled down trunk and compiled (msbuild Loyc-Slim.sln /p:Configuration=Release) and still found some #region
directives missing from the output. Here's a test-case that shows the issue:
#ecs;
internal class TestCase {
#region constructors
replacePP(ConstructorName => Method) {
[DllImport(Constants.LibName)] public static extern IntPtr ConstructorName();
}
#endregion
}
The original version used an unroll outside the replacePP -- this was just the smallest thing I found that triggered it.
It seems like it might be triggered when I use LeMP outside of a method definition -- the invocations inside methods seem to work ok.
I'm also seeing behavior where white-space will affect the output here too -- much like #59. For instance, adding a newline after the #region
line will cause the region line to be emitted, but not the #endregion
line, and I can't seem to get the #endregion
emitted, no matter how much whitespace I add. This also seems to be invariant to whether or not the #ecs;
directive is included at the top of the file.
The moment you said it, I realized what the problem is.
EC# handles a #region
or #endregion
directive the same way as a comment. Comments are associated with the node that they "seem" related to, so in this case both directives will be associated with the replacePP
node. But the replacePP
macro deletes itself from the output (replacing itself with its modified children) and in the process any associated "comments" are lost. If we dangle another node in front of the directives then they don't disappear:
#region constructors
#rawText("");
replacePP(ConstructorName => Method) {
[DllImport(Constants.LibName)] public static extern IntPtr ConstructorName();
}
#rawText("");
#endregion
(#rawText("");
is a special command that emits arbitrary output to the output file, and in this case I'm emitting an empty string - nothing.)
It could be argued that this behavior makes sense for real comments but not so much for #region
.
I wasn't aware of rawText -- I was looking for something similar in the docs but must have missed it.
Regarding treating region the same way as comments, I'd submit that that's probably not quite right: regions have a start and end and enclose an arbitrary sequence of intermediate elements. You'll also get a compiler error if they're mismatched (which is how I found this). They seem to me like they'd be better modeled in the same way as code between curly braces, though perhaps without increasing the indentation level in the output.
I also wanted to understand the behavior of comments: it seems like you're saying that there's two kinds of comments: those that should be emitted to the output, and those that are considered comments on the LeMP code and should be stripped for output, but it also seems like there's no explicit mechanism for expressing which is which -- do I have that right?
For my use case, comments are ultimately a nice to have, so I don't know what priority this deserves, however FWIW, it seems like it would be a nice addition to have some kind of syntax to denote comments that are intended to be written to output vs comments on the LeMP code that should be suppressed. Perhaps the right way to do it would be to assume that C# comment syntax should always be emitted, but add some other piece of syntax for EC# comments that are not intended to be written to the output. I'm unsure if this is how it currently works, but it would be awesome if comments were subject to substitutions the same as any other code when written to the output -- this would facilitate generating docstrings along with the generated code.
Well, #rawText
isn't in the LeMP docs because it is not part of LeMP, it's an intrinsic recognized by the EC# printer (not an important distinction to end users, admittedly)... I'm not sure if it's in any of the docs.
I'm not saying there are two kinds of comments. All comments given to the printer are emitted, but if a node is deleted from the syntax tree then any comments attached to it cannot be printed as they no longer exist. The replacePP
macro itself doesn't appear in the output, so unless replacePP
is specifically programmed to copy its comments into its output, they vanish.
I will think about changing how #region
works.
I posed this question over on Stackoverflow but in the course of investigating it, I think it might be a bug. Here's a minimal test case that shows the issue:
When the region markers are in place, the DllImport + method prototype are emitted like so:
(note: no newline, and #region, #endregion lines are omitted)
Whereas if you comment out the region markers, the newline is preserved:
Though the emitted code still compiles and does what it's supposed to, region markers are super useful in classes large enough to be generated by a macro system. It would be great to have them emitted, and also not affect the whitespace of the enclosed code.