TLINDEN / Config-General

Generic perl config file parser module

Comment char inside values treated as comment char #3

Open TLINDEN opened 4 weeks ago

TLINDEN commented 4 weeks ago

By mail:

Neven Ivanov wrote (slightly edited to fit gh):

Hi Thomas,

I hope you're doing well.

I’ve been using the Config::General Perl module for parsing Apache config files and have encountered an issue when trying to parse a user virtual host that contains a path with a # character (e.g., DocumentRoot /home/user/#domain.tld/public).

After some investigation, I found the issue appears to be in Config/General.pm on line 678:

# Remove comments and empty lines
s/(?<!\\)#.*$//; # .+ => .* bugfix rt.cpan.org#44600

If I comment out this line, the parsing of user virtual hosts works correctly.
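The effect described here can be reproduced outside the module with a minimal sketch. It assumes the substitution is applied to each line exactly as quoted above; `strip_comment` is a hypothetical helper name used only for illustration and is not part of Config::General.

```perl
use strict;
use warnings;

# Hypothetical helper: applies the comment-stripping substitution
# quoted in the report to a single line and returns the result.
sub strip_comment {
    my ($line) = @_;
    $line =~ s/(?<!\\)#.*$//;    # drop everything from the first unescaped '#'
    return $line;
}

my $stripped = strip_comment('DocumentRoot /home/user/#domain.tld/public');
print "[$stripped]\n";           # prints "[DocumentRoot /home/user/]"
```

The path is cut at the first `#` because the lookbehind only checks for a preceding backslash; nothing in the pattern knows that the text is a directory name.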

I understand that distinguishing between a # used as a comment and one as part of an actual path can be tricky, but it would be incredibly helpful if a solution could be found to handle this scenario.

Thank you for your time, and I appreciate any guidance or updates on this issue. Wishing you a great day!

Best regards,

Neven Ivanov

TLINDEN commented 4 weeks ago

Well, the idea is to escape comment characters when you intend to use them as content, which is what the code you commented out allows for, e.g.:

DocumentRoot /home/user/\#domain.tld/public
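As a rough illustration of why the escaped form survives, the sketch below runs the substitution quoted in the report against a line that contains both an escaped `#` and a real trailing comment. The helper names `strip_comment` and `unescape_hash` are hypothetical, and the unescaping step is an assumption about what has to happen afterwards, not a copy of the module's code.

```perl
use strict;
use warnings;

# Hypothetical helper: the substitution from the report, plus a trim
# of the whitespace that is left in front of a removed comment.
sub strip_comment {
    my ($line) = @_;
    $line =~ s/(?<!\\)#.*$//;    # '\#' is skipped by the negative lookbehind
    $line =~ s/\s+$//;
    return $line;
}

# Assumed follow-up step: turn the escape sequence back into a literal '#'.
sub unescape_hash {
    my ($value) = @_;
    $value =~ s/\\#/#/g;
    return $value;
}

my $line = 'DocumentRoot /home/user/\#domain.tld/public # vhost root';
my $kept = strip_comment($line);
print "$kept\n";                    # DocumentRoot /home/user/\#domain.tld/public
print unescape_hash($kept), "\n";   # DocumentRoot /home/user/#domain.tld/public
```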

Another possibility would be to just quote the string containing the comment char, which, however, doesn't currently work either. I might add this as a feature/fix sometime.

But other than that it's hard to know if a # is meant to be a comment char or not just by using regexps, which the module does. It doesn't parse config files, but uses regexps to extract their values.

Just look at another example:

color = bluedark

vs

color = blue#dark

Now what did the user mean? Is the color supposed to be blue#dark or just blue? That's impossible to answer.

You're working with Apache configs. Apache doesn't use regexps to parse its config; it really parses it and knows the context while doing so. So Apache knows when it approaches a path, which might contain a #. That's the way it works.

But Config::General doesn't work that way, it has no context at all.

So, my suggestion would be to pre-process such config files and escape the # like so: \# in such cases. Another way would be to get rid of such directory names, these are not good anyway (at least to my liking).
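The pre-processing suggested above could look roughly like the following sketch. It assumes that only certain directives take a path argument; the `@PATH_DIRECTIVES` list and the `escape_hashes` name are hypothetical and would need adjusting to the configs at hand.

```perl
use strict;
use warnings;

# Hypothetical list of directives whose argument is a filesystem path.
my @PATH_DIRECTIVES = qw(DocumentRoot);

# Escape every unescaped '#' on lines that start with one of the
# directives above; all other lines are passed through untouched.
sub escape_hashes {
    my ($text) = @_;
    my $names = join '|', map { quotemeta } @PATH_DIRECTIVES;
    my @out;
    for my $line (split /\n/, $text, -1) {
        if ($line =~ /^\s*(?:$names)\s/i) {
            $line =~ s/(?<!\\)#/\\#/g;    # already escaped '#' is left alone
        }
        push @out, $line;
    }
    return join "\n", @out;
}

my $config = "ServerName example.tld # main host\n"
           . "DocumentRoot /home/user/#domain.tld/public\n";
print escape_hashes($config);
```

The escaped text can then be handed to the module, for example through its `-String` option. Note the limitation of this sketch: a genuine trailing comment on a matched line would be escaped as well, so it only suits configs where such lines carry no comments.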

Best, Tom