Open xloem opened 4 years ago
Merging #101 into develop will increase coverage by
0.00%
. The diff coverage is25.00%
.
@@ Coverage Diff @@
## develop #101 +/- ##
========================================
Coverage 49.89% 49.89%
========================================
Files 23 23
Lines 1385 1387 +2
Branches 707 709 +2
========================================
+ Hits 691 692 +1
Misses 111 111
- Partials 583 584 +1
Impacted Files | Coverage Δ | |
---|---|---|
...clude/boost/program_options/detail/config_file.hpp | 38.88% <0.00%> (ø) |
|
include/boost/program_options/parsers.hpp | 40.00% <ø> (ø) |
|
src/parsers.cpp | 45.61% <0.00%> (ø) |
|
src/config_file.cpp | 38.02% <33.33%> (+0.34%) |
:arrow_up: |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact)
,ø = not affected
,? = missing data
Powered by Codecov. Last update 0414abe...e47c6bd. Read the comment docs.
I got an email mentioning that it might be possible to use std::quoted
to resolve this, too. I see the comment is deleted so I infer that they discerned it might not be the right choice, but I'm mentioning it here because I was unaware of std::quoted
.
If somebody else wants to try adding some more options to this for maintainers to choose from in case they have concerns, using escape characters is another option.
There might have also been a concern around the interface for enabling or disabling it.
Hi, yes, sorry about that, that was my fault. As I have written previously, I believe the proposed technique calls for error-prone config files as I am not aware of any interpreter that behaves like this. If you also consider escape character instead (e.g. similarly to \
in Bash), std::quoted
will probably not be suited for that case (as I wrongly believed before). However, you can use a regular expression to parse the line, e.g., ([^=]+)=((?:\\.|[^\\#])*)#{0,1}(.*)
with whitespace trimming should work by a quick test:
#!/usr/bin/env python3
import re
cases = [
(r"key=abcd", (r"key", r"abcd", r"")),
(r" key = abcd # com", (r" key ", r" abcd ", r" com")),
(r" key =abcd#comment", (r" key ", r"abcd", r"comment")),
(r"key =abcd #comment", (r"key ", r"abcd ", r"comment")),
(r" key=abcd\#nocomment", (r" key", r"abcd\#nocomment", r"")),
(r" key = abcd\#nocomment#com", (r" key ", r" abcd\#nocomment", r"com")),
(r" key = abcd\#nocomment###", (r" key ", r" abcd\#nocomment", r"##")),
(r" key = abcd\\#comment", (r" key ", r" abcd\\", r"comment")),
(r" key = abcd\\\#nocomment", (r" key ", r" abcd\\\#nocomment", r"")),
(r" key = abcd\nocomment", (r" key ", r" abcd\nocomment", r"")),
]
rex=r"([^=]+)=((?:\\.|[^\\#])*)#{0,1}(.*)"
for s, v in cases:
assert re.fullmatch(rex, s).groups() == v
print("OK")
That could also be the default and only behavior so you would not need to pass an extra argument. Explanation of the regex:
([^=]+)= key and `=` character
(?: value, i.e., either
\\. any escaped character (including the escape character itself)
[^\\#] or a normal non-escaped non-comment character
)
#{0,1}(.*) comment
Thanks so much for your feedback.
I'm worried adding it as the default behavior could break existing programs when distributions and developers upgrade. There are a lot of users of programs linked to boost out there, a huge config file space.
Do you think there'd be a solution that preserves the behavior for people who happen to have the edge case of 'a=b\#comment
in a pre-existing config file?
Developers wouldn't usually be able to predict if their users were affected by this change, to know whether to require an earlier version or not when upgrading, and it's a small change that would be hard to get everyone to spread preparation for.
You have a good point. However, in my opinion, the edge case a=b\#comment
(i.e., "escaping" a comment sign while still treating the comment as a comment) is against widely known practices and therefore it will be so very rare that this breaking change will probably not affect anyone.
If it bothers you anyway, the next version of program_arguments
could also print a warning that the behaviour regarding this edge case has changed (or will change in the future) if encountered, as some other programs do (e.g., git
, gcc
, etc.), but I would not bother.
I guess @vprus will have to share his opinion with us. :slightly_smiling_face:
I like the warning solution, although I'm thinking of GUI applications where the config files could even be autogenerated.
I won't likely be working further on this PR for quite some time, so it's mostly just for when the work hours are available from someone.
It seems to me a separate options class, or a separate member function to set the parsing options, might be a more robust solution.
Provides for
#
characters in configuration file values. See #100