samvera / hyrax

Hyrax is a Ruby on Rails Engine built by the Samvera community. Hyrax provides a foundation for creating many different digital repository applications.
http://hyrax.samvera.org/
Apache License 2.0
183 stars 123 forks source link

Proposal: Change default admin set name to NOT include a slash #1291

Open billdueber opened 7 years ago

billdueber commented 7 years ago

Descriptive summary

Apache's mod_proxy, when reverse proxying, canonicalizes %2F to / in passed URLs (even though it's properly escaped) due to security considerations. URLs with the default admin set id of admin_set/default then break. I'm suggesting we change the default admin set id, and consider restricting legal characters for an admin set ID.

Rationale

As defined in HEAD, the variable DEFAULT_ID for the default admin set is 'admin_set/default'.

https://github.com/samvera/hyrax/blame/master/app/models/concerns/hyrax/admin_set_behavior.rb#L29

This is a problem (annoyance?) in that Apache's mod_proxy cannonicalizes passed URLs prior to sending them off to the proxied server, due to longstanding security issues where someone would protect a path and then some yahoo would pass in a URL with an embedded %2F to get around it.

mod_proxy has an override (nocanon) but it is not on by default and its use may open up a server to other potential security issues.

The mod_proxy docs state:

Normally, mod_proxy will canonicalise ProxyPassed URLs. But this may be incompatible with some backends, particularly those that make use of PATH_INFO. The optional nocanon keyword suppresses this and passes the URL path "raw" to the backend. Note that this keyword may affect the security of your backend, as it removes the normal limited protection against URL-based attacks provided by the proxy.

So, there's a workaround, but if there's not a deliberate reason to include a slash, maybe we could change it.

A potential further step would be to restrict the characters that are allowed in an admin set ID in general, so it doesn't crop up again.

Expected behavior

Default settings don't result in borked URLs on a common platform.

Related work

Link to related tickets or prior related work here.

jcoyne commented 7 years ago

In RIIIF we also have this problem. We advise people to configure Apache to pass the encoded slashes correctly:

https://github.com/curationexperts/riiif#special-note-for-passenger-and-apache-users

mjgiarlo commented 7 years ago

Would like @samvera/hyrax-code-reviewers to chime in on this. We may also want to discuss this in a Samvera Tech call. (Though not next week, because of the conflict with OR2017.)

Thx for raising this, @billdueber

skorner commented 7 years ago

The scenario @billdueber described is not Apache+mod_passenger, but Apache+mod_proxy. There is of course a similar workaround available for mod_proxy (AllowEncodedSlashes NoDecode and ProxyPass <...> <...> nocanon).

The question here seems more about whether it is necessary to introduce such a configuration dependency in hyrax to only support the chosen default admin set name.

jcoyne commented 7 years ago

@skorner I would argue that ModProxy and Apache are not required parts of this stack.

skorner commented 7 years ago

I agree. I'm merely highlighting the fact that the default admin set name choice has consequences in several deployment scenarios.

jcoyne commented 7 years ago

@skorner This seems like bad defaults in Apache to me. Nginx doesn't have this problem. I'm not strongly tied to the default admin set name, but it seems odd to change to support one deployment scenario. Furthermore there are other places you may encounter this bug (riiif for one) and it would make sense to fix the problem rather than eliminate one potential trigger.

elrayle commented 7 years ago

Two questions:

And an observation: If the development cost is low, it is worth considering that Apache is not obscure with very few adopters. I would imagine that a large number of our community make use of Apache. It may not be an official part of the stack, but for a large number of institutions, it is a practical part of the stack.

escowles commented 7 years ago

I'd like to echo @elrayle's concern: is this going to be an issue for existing installations? If not, then I would support removing the / from the default adminset name, since it's a special character in both URLs and POSIX filesystems. Even if this particular problem can be resolved, there are other parts of the stack where a / could cause problems.

I also agree that Apache is far from obscure, and is probably used by a significant fraction of Hydra deployments (possibly a majority). So dismissing it as "one deployment scenario" seems counterproductive here.

jcoyne commented 7 years ago

@elrayle yes, I'm worried that this will just fix it in one place. And I really don't care what we call it.

I think it's telling that NGINX and Apache have different default configurations. Has anyone asked the Apache team why they have it configured the way it is? I'm thinking that a reverse proxy should not be mutating data (so far as that is possible). Just because Apache is a big player doesn't mean they don't have bugs.

svogtunh commented 4 years ago

I would like to chime in on this, as well. I am struggling with this now. I would love for this to be fixed, even if it is considered a work-around for apache users.

no-reply commented 4 years ago

i'd love to see this done, and would be sure to merge a PR that cleanly renames the admin set and provides tooling to properly update existing apps.

svogtunh commented 4 years ago

Even though I found a work-around, which was mentioned in an earlier comment, I still think this should be fixed.

no-reply commented 3 years ago

bumping this to the 3.x series in hopes that we can pick it up post-3.0.0.