adobe / aem-core-wcm-components

Standardized components to build websites with AEM.
https://docs.adobe.com/content/help/en/experience-manager-core-components/using/introduction.html
Apache License 2.0
730 stars 736 forks source link

[Form] Avoid Utils.getURL(), instead use Linkchecker/ResourceResolverMapTransformer/... #94

Open henrykuijpers opened 6 years ago

henrykuijpers commented 6 years ago

When rendering the form container component, when determining the form action, the following method is called: this.action = Utils.getURL(request, currentPage); https://github.com/Adobe-Marketing-Cloud/aem-core-wcm-components/blob/d4a9c2c7785f5d50e91dd2ebe145ace4a0bb7491/bundles/core/src/main/java/com/adobe/cq/wcm/core/components/internal/models/v1/form/ContainerImpl.java#L123

I am not happy with this approach:

  1. In case the user enters "test" as a sling:vanityUrl, this value will be used literally; Meaning the tag will be <form action="test"> which will of course go wrong (there is probably a lot more logic involved, than just taking the value and putting it in an HTML attribute)
  2. The form container component (and other components) should not have the responsibility of rendering these mapped/transformed urls, but instead the responsibility should be within the frameworks we already have available (linkchecker, resourceresolvermaptransformer in acs commons, resourceResolver.map() which takes sling:vanityUrls into account)

https://cwiki.apache.org/confluence/display/SLING/Flexible+Resource+Resolution#FlexibleResourceResolution-CurrentStatus

gabrielwalt commented 6 years ago

(CQ-4247358)

Patil2099 commented 4 years ago

@richardhand Can I work on this issue and what method I have to use?

richardhand commented 4 years ago

Hi @Patil2099 - yes you can work on this! Generally if an open issue doesn't have a status label (in progress etc.) then it's open to be worked on.

@henrykuijpers - what would your recommendation be here, shouldn't the linkchecker transformer be able to handle this with its default configuration for form actions? Looks like Utils.getURL was originally added in [0] - @bailescu, was a while ago - but do you have a background to that change?

[0] - https://github.com/adobe/aem-core-wcm-components/commit/d4a9c2c7785f5d50e91dd2ebe145ace4a0bb7491#diff-3d9eac2e73fbc5bcf44be1c3f850f4a2L111

puradawid commented 3 years ago

I took a closer look into this problem (hey, finally decided to contribute here as much as I can, yay!), and I believe this issue pointed by @henrykuijpers is indeed quite problematic, especially because the code at the moment is inconsistent across two places:

I think I understand the reason for getting vanity url in the former case and using map in the latter - it would be sub-optimal to call PageManager and seek for a page, whithout being certain the redirection happens within the content (as it can be outside of the instance entirely).

The first option is to do nothing. This inconsistency can be sustained permanently, given that model's context is very different to servlet's. That being said, it is much harder to rewrite Location header on Dispatcher's level, so it is better to do mapping in the servlet, because the existing tooling will not the transformation easily.

The other option is to remove mapping entirely. This is not a great solution, given that it breaks backward compatibility: it is almost certain there is at least one consumer that expects action attribute to be already a vanity URL, regardless of any other rewriter's settings. I would say the other way around - everyone probably accepted they can't get full, 'non-vanity' link in the action attribute if the vanity URL has been defined.

The last, but not least in my understanding, is to change Page::getVanityUrl() to method that includes this behavior + rewrites any further steps that already have been configured, i. e. ResourceResolver::map(currentPage.getPath()). This will keep the vanity url behavior + will keep the consitency between what is already set up. Probably most of consumers already handled this by setting form[action] to transformation with mapping anywyay.

What's your opinion? Does it sound about right? If that so, I can prepare a PR for it. Appreciate other angles of this problem!

henrykuijpers commented 3 years ago

@puradawid beware that calling resource resolver.map() twice will lead to undesired results. I.e. once in the core components for the form action and once (by default, form:action) in the Link Checker.

I don't remember exactly anymore what it was about, but there were issues with double encoding, if i remember right.

henrykuijpers commented 3 years ago

Also note that it can be undesirable to let POST happen to a vanity url that is defined potentially without .html, since that would also require broader Dispatcher filters to be set up.

(I actually think the POST requests, resulting from filling in the form, should use a specific selector, so that it can be whitelisted in the Dispatcher config. Now, we need to whitelist all POST requests going to a URL with extension .html. And if we allow vanity paths in the action attribute, we will need to allow all POST requests to any URL. I think that is very undesirable, I would argue it to be a no-go in terms of security.)

puradawid commented 3 years ago

Thanks @henrykuijpers for quick reply!

@puradawid beware that calling resource resolver.map() twice will lead to undesired results. I.e. once in the core components for the form action and once (by default, form:action) in the Link Checker.

I don't remember exactly anymore what it was about, but there were issues with double encoding, if i remember right.

I think I need to explore that more, as I thought multiple ResourceResolver::map call in properly set up mappings context will end up with the same result as any other, but you undermined my confidence in this matter (which is a good thing!) :)

Also note that it can be undesirable to let POST happen to a vanity url that is defined potentially without .html, since that would > also require broader Dispatcher filters to be set up.

Ah, I see that now. So, current implementation:

String vanityURL = page.getVanityUrl();
String pageLinkURL;
if (StringUtils.isEmpty(vanityURL)) {
    pageLinkURL = page.getPath() + ".html";
} else {
    pageLinkURL = vanityURL;
}
return map(pageLinkURL);

actually uses the vanity url as-is, as opposed to regular path which may have the extension. I would say this is more problem of the vanity path functionality rather this code.

At this stage, the situation is very tempting to close this issue then.

henrykuijpers commented 3 years ago

I don't think so: I don't think the intention of a vanity URL is to have this also appear in form actions. I believe the primary (and only?) intention of that URL is for content to be reachable that way. Once a user found your page (GET request), I don't think it should be used anymore.

I think it is very undesirable to have this functionality automatically use the vanity URL (that we are not intending to do!).

I would rather have this functionality to work by using some kind of transformer, to translate a content path to a vanity URL afterwards. I.e. the component should just render the content path.

puradawid commented 3 years ago

Ok then, let me check that option then (to not render vanity URL if available).

From what I have done so far, as you said, it makes no sense to map it twice or so (potentially it may fail when there is no host provided in mapping and it does it again for the same path). I do agree that it's ultra weird that this vanity URL is handled outside of the transformer.

henrykuijpers commented 3 years ago

It would be nice if you can also incorporate a dedicated selector for the POST request.

A filter should be handling that request, so that it can send a 403 (for example) when the request is not pointing to a servlet that will handle the form post.

That will make it integrate nicely with the Dispatcher.