arvindth commented 5 years ago

This pull request addresses a use case for reading and updating properties files that have:

commented out sample values for key/value entries that need to be preserved. eg:

# Uncomment the following to override the default value
# serverHostname = host.com

sectional comments with whitespace formatting. eg:
```
# 
# Section 1 
#
```

comment1

key = value



Changes:
* Add a preserveFormatting bool option to the loader to allow scanning and retaining whitespace as part of comments.
* Add a preserveFormatting bool option when writing comments to pass through the whitespace formatting to the output.
* Add a virtual key/value for the end of the input to allow trailing comments to be retained and emitted.
* Update lexer to not discard whitespace if preserveFormatting is set.
* Comments are now structs containing 2 elements: the original prefixes from the input, and the comment value itself.

Notes:
* I chose not to add a preserveFormatting option to each of the Load* methods. I felt doing this would explode the number of combinations of potential parameters.
  * Instead, the preserveFormatting option for loading is only available through the GetLoader method followed by explicitly setting the loader options before invoking the underlying Load methods.

arvindth commented 5 years ago

@magiconair could you take a look at this PR and let me know if you'll consider accepting it?

arvindth commented 5 years ago

I've put up an additional commit addressing your comments so far, and also removed some of the extra parameters, like properties.PreserveFormatting. In addition, I think at least some of the complexity comes from having to support writeFormattedCommentWithoutFormattingTests, i.e. being able to take formatted comments and write them out as before by stripping out the whitespace. I'm considering removing or modifying this capability to see if it reduces the complexity, especially in the WriteComment method.

arvindth commented 5 years ago

I've refactored the WriteComment method and moved writing formatted comments out into a new method. The individual commit diff makes it look odd, but looking at the FilesChanged view shows the diff better. This in combination with the previous commit's changes does simplify the original change a bit.

However, in order to maintain backward compatibility in the lexer, the parser and in properties, I don't think that this can be achieved transparently in the lexer. I think that the parser needs to understand preserveFormatting and behave differently when it's specified. Either that, or the list of lexer states would need to grow much larger, since the lexer would need to understand the underlying state quite a bit more, and even then, the parser would still need enhancements to understand and consume those new emitted items.

Let me know if you agree, or have any further suggestions to modify this PR.

However, I also understand if you think it's still not the right approach. If you believe this is not the right direction for this feature, I'm ok with closing this out.

magiconair / properties

Add preserveFormatting option for comments/whitespace #38

comment1