markeverard / POSSIBLE.RobotsTxtHandler

POSSIBLE RobotsTxtHandler is an Episerver CMS plugin that handles the delivery and modification of the robots.txt file
MIT License

Links are rewritten to include "robots.txt" when package is installed #3

Closed thomassvensen closed 10 years ago

thomassvensen commented 10 years ago

We generate some special links in our project that look like this:

http://< host >/nyheter/trysil-tur/Picture

After installing POSSIBLE.RobotsTxtHandler, they suddenly appear like this:

http://< host >/robots.txt?language=no&node=297Picture

It's probably related to how we build the link, but the issue only surfaces when this package gets installed.

thomassvensen commented 10 years ago

Code review showed that this caused the problem:

   var httpContext = HttpContext.Current;
   var requestContext = new RequestContext(new HttpContextWrapper(httpContext), new RouteData());
   var urlHelper = new UrlHelper(requestContext);
   ProfileImageUrl = GetUrlHelper().PageUrl(pageWithContact) + "Picture";

Replacing it with this avoided the issue:

   ProfileImageUrl = pageWithContact.AbsoluteExternalUrl() + "Picture";

This is the code for the extension method "AbsoluteExternalUrl":

        public static string AbsoluteExternalUrl(this PageData p)
        {
            var pageUrlBuilder = new UrlBuilder(p.LinkURL);
            Global.UrlRewriteProvider.ConvertToExternal(pageUrlBuilder, p.PageLink, UTF8Encoding.UTF8);
            var pageUrl = pageUrlBuilder.ToString();
            var uriBuilder = new UriBuilder(HttpContext.Current.Request.Url.GetLeftPart(UriPartial.Authority)) { Path = pageUrl };
            return uriBuilder.Uri.AbsoluteUri;
        }

Apparently, something goes wrong with the RequestContext in the first version.

markeverard commented 10 years ago

Ok, glad you solved/worked around it.

The robots.txt controller is mapped as a standard ASP.NET MVC route. I'm guessing that creating a RequestContext with a blank RouteData is causing the MVC routing system to match the /robots.txt route, hence the URL being rewritten as robots.txt.
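
To make that guess concrete, here is a minimal sketch of how first-match outbound routing could produce the URL reported above. It is a toy model, not the real System.Web.Routing code and not the plugin's actual route registration: the types ToyRoute and ToyRouter, the route templates, and the defaults controller=RobotsTxt / action=Index are all assumptions made for illustration. The idea is that a route with a purely literal template has no parameters to fill, so it accepts any set of values that does not contradict its defaults, and leftover values end up in the query string.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy model of outbound URL generation. Hypothetical types, NOT System.Web.Routing.
public sealed class ToyRoute
{
    public string Template { get; }
    public IReadOnlyDictionary<string, string> Defaults { get; }

    public ToyRoute(string template, IReadOnlyDictionary<string, string> defaults)
    {
        Template = template;
        Defaults = defaults;
    }

    // Returns null when this route cannot produce a URL for the given values.
    public string TryGenerate(IReadOnlyDictionary<string, string> values)
    {
        var used = new HashSet<string>();
        var path = new List<string>();

        foreach (var segment in Template.Split('/'))
        {
            bool isParameter = segment.StartsWith("{", StringComparison.Ordinal)
                            && segment.EndsWith("}", StringComparison.Ordinal);
            if (!isParameter)
            {
                path.Add(segment); // literal segment, always emitted as-is
                continue;
            }

            var name = segment.Substring(1, segment.Length - 2);
            if (!values.TryGetValue(name, out var value)
                && !Defaults.TryGetValue(name, out value))
            {
                return null; // required parameter has no value and no default
            }
            path.Add(value);
            used.Add(name);
        }

        // A default that is not a URL parameter must not conflict with a supplied value.
        foreach (var d in Defaults)
        {
            if (used.Contains(d.Key) || !values.TryGetValue(d.Key, out var supplied))
                continue;
            if (!string.Equals(supplied, d.Value, StringComparison.OrdinalIgnoreCase))
                return null;
            used.Add(d.Key);
        }

        // Values the route did not consume are appended as a query string.
        var extras = values
            .Where(kv => !used.Contains(kv.Key))
            .OrderBy(kv => kv.Key, StringComparer.Ordinal)
            .Select(kv => kv.Key + "=" + Uri.EscapeDataString(kv.Value));
        var query = string.Join("&", extras);
        return "/" + string.Join("/", path) + (query.Length > 0 ? "?" + query : "");
    }
}

public static class ToyRouter
{
    // Hypothetical route table: the literal robots.txt route is registered first.
    public static List<ToyRoute> DemoRoutes() => new List<ToyRoute>
    {
        new ToyRoute("robots.txt", new Dictionary<string, string>
        {
            ["controller"] = "RobotsTxt",
            ["action"] = "Index"
        }),
        new ToyRoute("{controller}/{action}", new Dictionary<string, string>())
    };

    // The first route that can generate a URL wins.
    public static string Generate(IEnumerable<ToyRoute> routes,
                                  IReadOnlyDictionary<string, string> values)
        => routes.Select(r => r.TryGenerate(values)).FirstOrDefault(url => url != null);
}

public static class Program
{
    public static void Main()
    {
        var routes = ToyRouter.DemoRoutes();

        // No controller/action available, as with an empty RouteData.
        var bare = new Dictionary<string, string> { ["node"] = "297", ["language"] = "no" };
        Console.WriteLine(ToyRouter.Generate(routes, bare));
        // prints "/robots.txt?language=no&node=297"

        // With a controller supplied, the literal route no longer matches.
        var full = new Dictionary<string, string>(bare) { ["controller"] = "Page", ["action"] = "Index" };
        Console.WriteLine(ToyRouter.Generate(routes, full));
        // prints "/Page/Index?language=no&node=297"
    }
}
```

In this model the first call has nothing that contradicts the robots.txt route, so it wins and node and language fall through to the query string, which mirrors the robots.txt?language=no&node=297 link in the report. Building the URL without going through MVC routing, as the AbsoluteExternalUrl workaround does, sidesteps the route table entirely.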