helenahedstrom closed 5 years ago
Hi,
I'd speculate that it's not able to make outbound requests due to some kind of security issue. However, without knowing your environment in more detail, I can't really guess why exactly. Under the hood it uses the .NET HttpClient to initiate connections, and if that can't make a request then it will report an error.
The "format of the URI could not be determined" message might occur if you had some HREFs that were not using a standard protocol. For instance, it can follow http:// links, but if there were some links using an unknown protocol like, say, telnet:// or something obscure, it might fail like this.
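To illustrate the point about non-standard protocols, here is a minimal sketch of how HREFs could be filtered before they are requested. `LinkFilter` and `IsCheckable` are hypothetical names for illustration only; this is not code from Diplo.LinkChecker, and the package may handle this differently.

```csharp
using System;

public static class LinkFilter
{
    // Hypothetical helper (not part of Diplo.LinkChecker).
    // Returns true only for absolute http:// or https:// URIs;
    // other schemes (telnet://, mailto:) and unparseable strings are skipped.
    public static bool IsCheckable(string href)
    {
        return Uri.TryCreate(href, UriKind.Absolute, out Uri uri)
            && (uri.Scheme == Uri.UriSchemeHttp || uri.Scheme == Uri.UriSchemeHttps);
    }
}
```

Using `Uri.TryCreate` rather than `new Uri(...)` means a badly formed HREF yields `false` instead of throwing an exception.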
Are you able to check the Trace logs in the /App_Data/Logs/ folder for any more information?
Hi!
I looked in the logs and was able to find some helpful information. I got this:
System.Net.Http.HttpRequestException: Response status code does not indicate success: 401 (Unauthorized).
at System.Net.Http.HttpResponseMessage.EnsureSuccessStatusCode()
at Diplo.LinkChecker.Services.HttpCheckerService.
Hi Helen,
That would indicate that the links aren't able to be checked as they are behind some kind of login that requires authentication. I'm guessing that you need to be authenticated to access the intranet, right? But when you run the link checker it doesn't run with your permissions, it runs in a separate context, and in that context it isn't able to access the pages on the intranet to extract the links from them as it isn't authorised.
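As a rough illustration of what the log entry means: `EnsureSuccessStatusCode()` throws an `HttpRequestException` for any non-success status, including 401. The sketch below inspects the status code instead of throwing. `ResponseInspector` and `Describe` are hypothetical names, not part of the package.

```csharp
using System.Net;
using System.Net.Http;

public static class ResponseInspector
{
    // Hypothetical helper (not part of Diplo.LinkChecker).
    // Describes a response without throwing, so a 401 can be told
    // apart from other failures.
    public static string Describe(HttpResponseMessage response)
    {
        if (response.StatusCode == HttpStatusCode.Unauthorized)
        {
            return "401: the request was not authenticated";
        }

        return response.IsSuccessStatusCode
            ? "OK"
            : "Failed with status " + (int)response.StatusCode;
    }
}
```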
Yep, that's right, you need authentication to access the intranet, although when you're logged in on the remote desktop (where all the files are located) no further authentication is needed, I think. I will discuss this with my colleague, who is more familiar with this than I am. But that's probably it then. I suspect we have to fiddle with the access rights in order for it to work, or do you have another tip?
Thanks so much for your help, it's priceless! :)
/Helena
Hi. Yeah, pretty sure the authentication is the issue. You might be logged in, but the "code" that runs isn't.
It may be possible to get the code to impersonate another user by editing the web.config - see https://support.microsoft.com/en-us/help/306158/how-to-implement-impersonation-in-an-asp-net-application However, it's not something I know much about or can support.
But let me know if you get anywhere!
Thanks a lot, we will look into that!
Just one more question, the path in the logs "D:\Websites\Umbraco\Packages\Diplo.LinkChecker\Diplo.LinkChecker\Services\HttpCheckerService.cs:line 87", it does not exist on the server. Do you have any idea why it says the cs file is there? Maybe it's nothing odd about it, just curious.
Hi. That's the location where the C# file originally existed when the code was written, so that path is where the code lived on my computer. This information is stored in metadata that is then added to the compiled assembly (DLL) that ends up in the /bin/ folder. It's there primarily to help the original author (i.e. me!) to debug the code.
Hi!
Thought you might be interested in hearing how it's been working out. Impersonation was not appropriate for us so we ended up editing your project. We added network credentials to the HttpClientHandler.
Credentials = new NetworkCredential("username", "password")
That worked partially, it would have probably worked perfectly under different circumstances. Maybe you could consider adding something like this to the project? :)
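For anyone following along, a minimal sketch of the general idea of attaching credentials to an `HttpClientHandler`. The class and method names are hypothetical, and this is not the actual change made to the project, just one way it could look.

```csharp
using System.Net;
using System.Net.Http;

public static class AuthenticatedClientFactory
{
    // Hypothetical helper (not part of Diplo.LinkChecker).
    // Builds a handler that answers authentication challenges
    // with the supplied username and password.
    public static HttpClientHandler CreateHandler(string userName, string password)
    {
        return new HttpClientHandler
        {
            Credentials = new NetworkCredential(userName, password)
        };
    }

    // Wraps the handler in an HttpClient ready to make requests.
    public static HttpClient CreateClient(string userName, string password)
    {
        return new HttpClient(CreateHandler(userName, password));
    }
}
```

In practice the username and password would come from configuration rather than being hard-coded. If the site should instead run requests as the application pool identity, setting `UseDefaultCredentials = true` on the handler may be worth trying.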
Have a nice day! :)
Hi. Thanks for the update. Yeah, I think it might be a nice addition to add the option to add a username / password for basic authentication. You can even make a pull request to add the feature if you like? ;-)
Hi!
I've been trying out the link checker for a customer. Locally on my computer it works like a charm, but when I test it live it searches through all the pages and can't find any links. It gives a success message, "0 links searched, 0 errors found". Also, on some pages the "500 uri could not be determined" error shows up. The site is an intranet and is located on a remote desktop. When I log in to Umbraco on the remote desktop I do that via localhost, so I figured it should work the same as when I try locally.
I then started to wonder whether this has to do with some authentication problem or something else?
Thankful for answers!