erlyaws / yaws

Yaws webserver
https://erlyaws.github.io
BSD 3-Clause "New" or "Revised" License
1.28k stars 267 forks source link

inconsistent handling of missing binding #376

Closed leoliu closed 5 years ago

leoliu commented 5 years ago

For yssi if a binding is missing, the binding name plus the surrounding %% is used. This is not only inconsistent with ssi but also always getting in the way. It gives me a horror reaction whenever I discovered %%XXX%%'s leaked to a page, which could easily happen, for example, switching from ssi to yssi.

I wonder if this can be corrected.

vinoski commented 5 years ago

What version of Erlang/OTP are you using?

Can you provide a snippet of code or page text I can use to reproduce the problem?

leoliu commented 5 years ago

I am on OTP 21.3. But this issue is due to yaws_server:deliver_dyn_file/6 (around line 2976).

Here is an example using an appmod:

out(_Arg) -> {yssi, "test.yaws"}.

The content of test.yaws:

<p> %%TESTBINDING%% </p>
leoliu commented 5 years ago

I think leaking %%BINDING%% is not just inconvenient it might be a security issue as well. Personally I think all server info should be kept secret if possible.

vinoski commented 5 years ago

I took a look at this, and at this stage I disagree that this is a problem with Yaws. Consider a ssi directive that could be used for your example:

{ssi, "test.yaws", "%%", []}

With this directive, the HTML returned to a client, assuming the contents of test.yaws you specified in a previous comment, would be <<"<p> </p>">> because there's no value for TESTBINDING. So far, so good.

As I understand it, you're suggesting that you should be able to change the ssi directive to

{yssi, "test.yaws"}

and get the same HTML result, but I don't understand the basis for that expectation. The yssi directive doesn't deal with delimiters or bindings the way ssi does; the directives are not interchangeable, nor are they designed to be. The ssi directive is for including templated static content, whereas yssi is for including dynamic content.

Unless I'm missing something, I intend to close this issue without change.

leoliu commented 5 years ago

Thanks for looking at this issue.

But yssi does deal with delimiters or bindings though it does other things too that ssi can't. The documentation on {bindings, [{Key1, Value2}, {Key2, Value2} .....]} even says so. Am I missing something?

vinoski commented 5 years ago

I don't think you're missing anything, but from your description, all I can guess is that you're changing ssi to yssi, dropping the delimiter and bindings, and hoping for the same result. Based on my guess, I've explained why you shouldn't expect that result. As always, it's best to provide an actual working example along with your explanation to make the issue as as clear as possible.

The docs for bindings and ssi are slightly different when it comes to variable existence, and I believe this is intentional because ssi specifies variables specifically for a given page, whereas bindings is more general. The bindings description says that it establishes bindings, but doesn't say anything about the handling of random text that happens to exist within the default delimiter. I believe the fact that there's only a default delimiter is the reason why bindings has no choice but to ignore such random text. The ssi description, on the other hand, is more specific in that it says that for any string within the specified delimiters, the delimiters and string are replaced by the specified binding value for that string. The ssi directive gives you much more control.

Changing either of these behaviors would break existing pages. The code for these directives has been around a long time, and that code has operated as it currently does for years. The bindings code in particular specifically creates variables that are examined one at a time, where ssi handles variables as a group — this too explains their respective behaviors.

I'll fix the documentation to clarify what happens in both cases for unspecified variables. But as for getting them to act the same, the best I can propose is to add some sort of optional directive for ssi to tell it to ignore unspecified variables, same as bindings does.

leoliu commented 5 years ago

It looks like I am drawing you to the wrong issue. The core issue isn't changing ssi to yssi. They are different and I learn enough of them from their implementations. (though I keep them as interchangeable as possible but that is the discipline I exercised).

The issue is whether leaving those %%BINDING%% on the page when the binding can't be found is sensible. From my experience (using Yaws on a small-ish website since 2017.3) I haven't encountered a situation where it makes any sense. It deforms the page and leaks info from the server-side to the browser (the latter I consider a risk).

This issue haunted me from day one but I thought maybe there was some rationale behind it so I didn't raise it then.

vinoski commented 5 years ago

Pretty sure the issue is exactly what I described. For bindings, we can substitute only for known variables, but we can't eliminate text that happens to look like a variable. After all, %%BINDING%% is just text if a bindings directive is not in effect, or if there is a bindings directive in effect but it doesn't specify a BINDING variable. Stripping out text that happens to look like a variable, as you're suggesting, would break existing pages.

leoliu commented 5 years ago

I think there may be some missing link on your part.

Random texts that happen to be between %% are not bindings and are output as is (for example %%hello world%%). Bindings are those that can be parsed as such (per yaws_compile:parse_binding/1). So yaws really is spitting out binding variables on the page.

vinoski commented 5 years ago

Your example is test.yaws with these contents:

<p> %%TESTBINDING%% </p>

The yaws_compile:parse_binding/1 function finds TESTBINDING to be a potential variable. That function knows nothing about variable substitution, as bindings aren't available yet at that point. Later, yaws_server:deliver_dyn_file/6 tries to see if that variable has a binding, and when it doesn't, the binding text is left as is.

leoliu commented 5 years ago

I think we are on the same page regarding how it works internally. The difference is in interpreting %%BINDING%% as binding-only or binding and text.

I wonder if it is possible to provide an option to customise it? For example (wishful thinking) missing_binding = Fun and calls Fun(BINDING) to get the expansion for missing binding. This way we can keep backward-compatibility and allow users to opt-out.

leoliu commented 5 years ago

Just a week or two ago to my horror I had a %%THINGIE%% show up on a logout page. I think it is very easy for such a thing to leak out that is worrying.

vinoski commented 5 years ago

The missing_binding = Fun idea would be overkill, but an option to tell bindings to treat any undefined keys as having empty values would solve the problem. I'll post a branch soon.

vinoski commented 5 years ago

Please give the fix-376 branch a try and let me know if you have any feedback.

leoliu commented 5 years ago

Thanks, @vinoski.

I pulled the branch to my project and played with it. A few thoughts.

I think the goal is to get rid of the headache of false-negatives of %%KEY%% being treated as non-binding when they are. For all I can remember I have 0 cases where I use %%TEXT%% as text in a html/yaws source but have multiple known accidents of %%BINDING%% being treated as text.

vinoski commented 5 years ago

I've amended the branch. It now additionally provides strip_undefined_bindings as a server configuration variable. Setting it to true for a server means that all out/1 content returned within that server will have undefined variable text stripped out.

I've kept the strip_undefined setting for {bindings, ...} as well, as I view it as being pretty handy. The fact that your system crashed when you added it is unfortunate but isn't a good reason for me to drop it, given that it's not the default and users must explicitly choose to use it.

leoliu commented 5 years ago

Thanks for the addition. I just tested it and everything seems in order.

leoliu commented 5 years ago

Just discovered it is missing from yaws:set_sc_flags/2.

=ERROR REPORT==== 20-Apr-2019::12:31:43.651406 ===
Unknown and unhandled flag {strip_undefined_bindings,true}
vinoski commented 5 years ago

The set_sc_flags/2 issue is fixed in master, commit 9120d0c.