janreges / siteone-crawler

SiteOne Crawler is a cross-platform website crawler and analyzer for SEO, security, accessibility, and performance optimization—ideal for developers, DevOps, QA engineers, and consultants. Supports Windows, macOS, and Linux (x64 and arm64).
https://crawler.siteone.io/
MIT License

-1:CON error on internal sites #26

Open BuscheIT opened 3 weeks ago

BuscheIT commented 3 weeks ago

https://github.com/janreges/siteone-crawler/issues/10 describes the same error we are currently trying to resolve.

Our internal network is using a Windows DNS controller - all domain names resolve nicely on both of the test PCs we are using (Win11 and Linux Mint).

On both machines we are getting the -1:CON error and see no requests in the server logs.

Using Win11 (with 1.0.8 portable), the report has as its first line: "Problem with DNS analysis: Crawler\Analysis\DnsAnalyzer::getDnsInfo: nslookup command failed."

Under Linux (Snap) we get the same empty report, just without the nslookup command error.

All EXTERNAL sites can be scanned without problems. There is no internal proxy, all browsers work internally, and we are using plain HTTP to avoid potential certificate trouble.

Any ideas?

janreges commented 3 weeks ago

Hi @BuscheIT,

The nslookup failure on Windows is my fault. nslookup is not available on Windows/Cygwin, so this analysis should not even be attempted on Windows. It's on the roadmap for future fixes.
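
For illustration only (the crawler itself is PHP, and this is not its actual code): the same information can be obtained through the OS resolver API instead of shelling out to nslookup, which would also work on Windows. A minimal Python sketch of the idea:

```python
import socket

def get_dns_info(hostname: str) -> list[str]:
    """Resolve via the OS resolver (getaddrinfo) instead of an external nslookup call."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror as exc:
        return [f"resolution failed: {exc}"]
    # Collect the unique IPv4/IPv6 addresses returned by the resolver
    return sorted({info[4][0] for info in infos})

print(get_dns_info("example.com"))
```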

BuscheIT commented 3 weeks ago

Hello, we are using "http://se.test" without a custom port. Putting the "IP hostname" entry into the hosts file under Win11 does not change the crawler's or any browser's behaviour. The report with the edited hosts file is at https://crawler.siteone.io/html/2024-10-24/30d/w-8m53kf170vh9-208j5.html

Of course we tried HTTPS as well - our .test environment supports it, and we only went with HTTP to avoid self-signed certificate trouble, as stated. We also temporarily disabled Windows Defender, since no request seems to even reach the .test server. The fact that it works on any external site leaves us banging our heads over where the problem could be. We even changed netmasks and made sure all .test traffic goes to the same IP.

The crawler is the first program giving us such trouble - we have never had problems with other network-dependent tools, so we would really love to figure this out.

Linux screenshot coming soon - any ideas for debugging with Wireshark, gdb, or command-line options?

PS: We are using the standard options after startup but disable "allow images" to reduce requests a little - it shouldn't matter.
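
If it helps to reproduce our situation: the checks below (a rough Python sketch; se.test and port 80 are from our setup) are essentially what we have already confirmed by hand with browsers and the OS resolver, i.e. resolution and plain HTTP to the internal host work outside the crawler:

```python
import socket
import urllib.request

HOST = "se.test"  # our internal hostname, plain HTTP on port 80

# 1) Name resolution through the OS resolver (the same path a browser uses)
addrs = sorted({info[4][0] for info in socket.getaddrinfo(HOST, 80)})
print("resolves to:", addrs)

# 2) Raw TCP connection to port 80
with socket.create_connection((HOST, 80), timeout=5):
    print("TCP connect: ok")

# 3) Simple HTTP GET
with urllib.request.urlopen(f"http://{HOST}/", timeout=5) as resp:
    print("HTTP status:", resp.status)
```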

ovenum commented 2 weeks ago

Just ran into the same error message when trying to crawl a development site on my local machine. This happens on macOS, but the error description looks similar to what I am experiencing. The site uses a local domain like website-name.customTLD, where every domain under customTLD is directed to my machine via dnsmasq and responds to ping requests.
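
In case it helps narrow things down, a quick check along these lines (a sketch; website-name.customTLD stands in for my actual dev domain) would confirm whether the custom TLD resolves through the system resolver to this machine, rather than just answering ping:

```python
import socket

HOST = "website-name.customTLD"  # placeholder for my local dev domain

# Addresses the system resolver returns for the custom TLD
addrs = sorted({info[4][0] for info in socket.getaddrinfo(HOST, None)})
print("resolved addresses:", addrs)

# dnsmasq should point the custom TLD at this machine, so a loopback
# (or local LAN) address is expected here.
print("loopback:", any(a in ("127.0.0.1", "::1") for a in addrs))
```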

In the generated HTML report, under DNS and SSL, the following message is shown:

[screenshot: DNS and SSL error message from the HTML report]

And in the visited URLs section, I get the -1:CON error status:

[screenshot: visited URLs listed with the -1:CON status]